What are the responsibilities and job description for the Data Engineer position at TechMatrix Inc?
We are seeking a skilled and dynamic Data Engineer to join our innovative data analytics team. This position is W2 only and open to local candidates only. The ideal candidate will have strong experience developing and optimizing data pipelines, working with cloud services, and building on modern web application frameworks. The role will focus on building scalable data solutions using AWS cloud services, Python, PySpark, and other key tools to drive business insights and operational excellence.
Key Responsibilities:
- Design, develop, and maintain robust data pipelines using Python and PySpark for processing and transforming large datasets.
- Build and deploy scalable data workflows using AWS Glue, EMR, Lambda, SNS, and other AWS services.
- Develop serverless applications for data management with Node.js and create user-friendly front-end solutions using Vue.js.
- Manage and optimize data storage solutions with S3 and DynamoDB for efficient access and data processing.
- Collaborate with data scientists and analysts to provide reliable datasets for analysis and model development.
- Ensure security, reliability, and scalability of data infrastructure in production environments.
- Implement best practices for CI/CD pipelines and cloud architecture governance.
- Monitor and troubleshoot data pipeline issues, optimizing performance where necessary.
- Maintain comprehensive documentation for data workflows, architecture, and configurations.
- Use Databricks to build advanced data processing solutions and manage big data analytics.
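To make the pipeline responsibilities above concrete, here is a toy extract-transform-load step in plain Python. In the role itself this work would be done with PySpark on AWS Glue, EMR, or Databricks, and the output would land in S3 rather than a JSON string; the field names and sample data are illustrative only.

```python
import csv
import io
import json

# Sample raw input, standing in for a CSV file read from S3.
RAW = """order_id,region,amount
1001,us-east,250.00
1002,us-west,120.50
1003,us-east,75.25
"""

def extract(raw: str) -> list[dict]:
    """Extract: parse CSV text into a list of records."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(records: list[dict]) -> dict[str, float]:
    """Transform: aggregate order amounts per region."""
    totals: dict[str, float] = {}
    for rec in records:
        totals[rec["region"]] = totals.get(rec["region"], 0.0) + float(rec["amount"])
    return totals

def load(totals: dict[str, float]) -> str:
    """Load: serialize the aggregate as JSON, as if writing it back out."""
    return json.dumps(totals, sort_keys=True)

print(load(transform(extract(RAW))))
```

The same extract/transform/load shape carries over directly to a PySpark job, where `extract` becomes a DataFrame read, `transform` a `groupBy` aggregation, and `load` a write to S3.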
Required Skills:
- Proficiency in Python and PySpark for data engineering tasks.
- Hands-on experience with cloud services, particularly AWS Glue, EMR, Lambda, S3, SNS, and DynamoDB.
- Strong understanding of data architecture, ETL processes, and distributed computing.
- Experience developing APIs and serverless applications with Node.js and building UI components with Vue.js.
- Familiarity with data orchestration and workflow automation.
- Knowledge of relational and non-relational databases.
- Hands-on experience with Databricks for data pipelines and analytics.
- Strong problem-solving skills and the ability to troubleshoot data integration issues.
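As a sketch of the serverless side of the skill set, the snippet below shows the shape of an AWS Lambda handler consuming an SNS event. The `order_id` field and the processing logic are made up for illustration; a production handler would persist results to S3 or DynamoDB via boto3, which is omitted here to keep the example self-contained.

```python
import json

def handler(event: dict, context=None) -> dict:
    """Lambda entry point: unwrap SNS records and process each message."""
    # SNS delivers messages to Lambda wrapped in a Records list,
    # with the original payload as a JSON string under Sns.Message.
    messages = [json.loads(rec["Sns"]["Message"]) for rec in event.get("Records", [])]
    processed = [{"order_id": m["order_id"], "status": "processed"} for m in messages]
    return {"statusCode": 200, "body": json.dumps(processed)}

# Example invocation with a synthetic SNS event:
event = {"Records": [{"Sns": {"Message": json.dumps({"order_id": 1001})}}]}
print(handler(event))
```

This handler style is also what the Node.js serverless work in the role looks like, just with an `exports.handler` function instead of a Python one.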