What are the responsibilities and job description for the Big Data Engineer position at ADPMN Inc?
Job Title :: Big Data Engineer
Location :: Parsippany NJ
Duration :: Long Term
Responsibilities
Help the product owner and development team achieve project outcomes; build and prioritize stories per requirements.
Analyze data platform requirements and design and document solutions.
Build key infrastructure, frameworks, and applications to support the needs of Data Engineers, Data Scientists, and the business.
Improve SDLC processes together with the engineering and DevOps teams.
Build effective/efficient and reusable data pipeline frameworks for various data source types, refresh patterns and transformations.
Build data pipelines and jobs using Spark and Databricks to ingest data into a Data Lake/Delta Lake on AWS (see the sketch after this list).
Support teams using the Data and AI/ML platform: design and validate their use-case architectures, provide best practices for solutions, and troubleshoot development issues with platform features and frameworks.
Partner with team members, the product owner, and other stakeholders to ideate, review, and align on data validation approaches.
Maintain end-to-end data security (at rest and in transit) and data sharing mechanisms.
Identify and build automation processes to support various data platform scenarios.
Design and build Dev/Data/MLOps processes using cloud services.
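To give a flavor of the Spark/Databricks ingestion responsibility above, here is a minimal PySpark sketch of the general pattern of landing raw data into a Delta Lake table on AWS; the bucket paths, column names, and partition scheme are hypothetical placeholders, not pipelines specific to this role.

```python
# Minimal sketch: ingest raw JSON events from S3 into a Delta Lake table on AWS.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("raw-to-delta-ingest").getOrCreate()

# Read a raw landing-zone dataset (placeholder bucket/prefix).
raw_df = spark.read.json("s3://example-landing-bucket/events/")

# Light transformation: stamp ingestion time and derive a date partition column.
curated_df = (
    raw_df
    .withColumn("ingested_at", F.current_timestamp())
    .withColumn("event_date", F.to_date("event_time"))
)

# Append into a Delta table partitioned by event date (assumes Delta Lake is
# available on the cluster, as it is on Databricks).
(
    curated_df.write
    .format("delta")
    .mode("append")
    .partitionBy("event_date")
    .save("s3://example-curated-bucket/delta/events/")
)
```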
Qualifications
Strong technical skills and experience working with and supporting multiple engineering teams.
Experience in building Big Data/ML/AI applications and optimizing data pipelines, architectures and data sets.
12 years of technical experience with big data technologies, including experience with the following software/tools:
Experience with big data tools: Apache Spark, Databricks, Parquet/Delta, PySpark, SparkSQL, Spark Streaming, Kafka/Kinesis, S3, Glue
Experience with Databricks using Unity Catalog and MLflow is a big plus.
Experience with relational/NoSQL databases or MPP databases such as Snowflake and Redshift.
Experience with data pipeline and workflow management tools: Step Functions, Airflow
Experience with AWS cloud services: EC2, EMR, RDS, Redshift, IAM, Security Groups, VPC, etc.
Experience developing in Python using notebooks and IDEs.
Experience with automation: Jenkins CI/CD, Terraform, CDK, Boto3 (see the sketch below).
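As an illustration of the Boto3 and Step Functions items above, the following is a small sketch of triggering a pipeline orchestration run; the state machine ARN, region, and input payload are hypothetical examples rather than anything specific to this position.

```python
# Minimal sketch: trigger a Step Functions state machine that orchestrates a
# data pipeline run, using boto3. ARN, region, and payload are hypothetical.
import json
import boto3

sfn = boto3.client("stepfunctions", region_name="us-east-1")

response = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:example-pipeline",
    name="example-run-20240101",
    input=json.dumps({"source_prefix": "s3://example-landing-bucket/events/2024/01/01/"}),
)

# The returned execution ARN can be used to poll status with describe_execution.
print(response["executionArn"])
```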