What are the responsibilities and job description for the Senior Data Engineer position at Techgene Solutions LLC?
Job Details
Position: Senior Data Engineer
Location: 99 Jefferson Rd, Parsippany, NJ 07054 (Onsite)
Duration: 6-Month Contract
Responsibilities:
- Help the product owner and development team achieve project outcomes; build and prioritize stories per requirements.
- Analyze data platform requirements and design and document solutions.
- Build key infrastructure, frameworks, and applications to support the needs of Data Engineers, Data Scientists, and the business.
- Improve SDLC processes with the team of engineers and DevOps.
- Build effective/efficient and reusable data pipeline frameworks for various data source types, refresh patterns and transformations.
- Build data pipelines and jobs using Spark and Databricks to ingest data into a Data Lake/Delta Lake on AWS.
- Support teams using the Data and AI/ML platform: design and validate their use-case architectures, provide best practices for solutions, and troubleshoot development issues with platform features and frameworks.
- Partner with team members, the product owner, and other stakeholders to ideate, review, and align on the data validation approach.
- Maintain end-to-end data security (at rest and in transit) and data sharing mechanisms.
- Identify and build automation processes to support various data platform scenarios.
- Design and build Dev/Data/MLOps processes using cloud services.
Qualifications:
- Strong technical skills and experience working with and supporting multiple engineering teams.
- Experience in building Big Data/ML/AI applications and optimizing data pipelines, architectures and data sets.
- 5 years of technical experience with big data technologies, including experience with the following software/tools:
- Experience with big data tools: Apache Spark, Databricks, Parquet/Delta, PySpark, SparkSQL, Spark Streaming, Kafka/Kinesis, S3, Glue
- Experience with Databricks using Unity Catalog and MLflow is a big plus
- Experience with relational/NoSQL databases or MPP databases such as Snowflake or Redshift.
- Experience with data pipeline and workflow management tools: Step Functions, Airflow
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift, IAM, Security Group, VPC etc.
- Experience developing in Python and Notebooks/IDE
- Experience with Automation: Jenkins CI/CD, Terraform, CDK, Boto3