What are the responsibilities and job description for the Data Engineer position at IT Minds LLC?
Data Engineer
Location: Bentonville, AR (day 1 onsite)
Responsibilities
- Design, develop, and maintain robust and scalable ETL workflows and data pipelines using tools like Hive, Spark, and Airflow.
- Implement and manage data storage and processing solutions using Apache Hudi and BigQuery.
- Develop and optimize data pipelines for structured and unstructured data in GCP environments, leveraging GCS for data storage.
- Write clean, maintainable, and efficient code in Scala and Python to process and transform data.
- Ensure data quality, integrity, and consistency by implementing appropriate data validation and monitoring techniques.
- Work with cross-functional teams to understand business requirements and deliver data solutions that drive insights and decision-making.
- Troubleshoot and resolve performance and scalability issues in data processing and pipelines.
- Stay up to date with developments in big data technologies and tools, and incorporate them into the workflow as appropriate.
Requirements
- Proven experience as a Data Engineer, preferably in a big data environment.
- Expertise in Hive, Spark, and Apache Hudi for big data processing and storage.
- Hands-on experience with BigQuery and Google Cloud Platform (GCP) services such as GCS, Dataflow, and Pub/Sub.
- Strong programming skills in Scala and Python, with experience in building data pipelines and ETL processes.
- Proficiency with workflow orchestration tools like Apache Airflow.
- Solid understanding of data warehousing concepts, data modelling, and schema design.
- Knowledge of distributed systems and parallel processing.
- Strong problem-solving skills and ability to work with large datasets in a fast-paced environment.