What are the responsibilities and job description for the Data Engineer with GCP position at Vital Force Solutions?
12-month contract
Job Summary
We are seeking a Senior Data Engineer to lead and execute the design, development, and maintenance of scalable data pipelines, data workflows, and machine learning feature engineering processes. The ideal candidate will have extensive experience with SQL, NoSQL, Kafka, GCP services, and data pipeline development, as well as a proven track record in optimizing data solutions for performance and scalability. This role requires a passion for driving innovation and continuous improvement while mentoring junior team members in best practices.
Key Responsibilities
Provide Technical Leadership
Offer guidance and leadership to the team, ensuring clarity and alignment across ongoing projects.
Facilitate collaboration across teams to solve complex data engineering challenges.
Promote best practices in data engineering to ensure consistency and quality across all initiatives.
Build And Maintain Data Pipelines
Design, build, and maintain efficient, scalable, and reliable data pipelines to support data ingestion, transformation, and integration across multiple data sources and destinations.
Utilize tools like Kafka, Databricks, and other related technologies to ensure smooth data flow across systems.
Leverage GCP services such as BigQuery, Cloud Storage, Vertex AI, AutoMLOps, and Dataflow for data processing and analytics.
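By way of illustration only (not part of the formal requirements), here is a minimal sketch of the kind of batch pipeline this responsibility covers, using the Apache Beam Python SDK that Dataflow executes. Every project, bucket, dataset, table, and field name below is a hypothetical placeholder.

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Hypothetical names throughout: example-bucket, example-project, analytics.events.
    with beam.Pipeline(options=PipelineOptions()) as pipeline:
        (
            pipeline
            | "ReadRawEvents" >> beam.io.ReadFromText("gs://example-bucket/raw/events-*.json")
            | "ParseJson" >> beam.Map(json.loads)
            | "KeepValidRecords" >> beam.Filter(lambda e: e.get("user_id") is not None)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="user_id:STRING,event_type:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )

The same pipeline can be submitted to Dataflow instead of running locally by passing --runner=DataflowRunner in the pipeline options.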
Drive Digital Innovation
Innovate and modernize data engineering approaches, focusing on extending core data assets (e.g., SQL-based, NoSQL-based, cloud-based, and real-time streaming data platforms).
Promote the use of cutting-edge technologies to improve the efficiency and performance of data workflows.
Implement Feature Engineering
Develop and manage feature engineering pipelines for machine learning workflows, utilizing tools like Vertex AI, BigQuery ML, and custom Python libraries.
Collaborate with data scientists to ensure that the data is transformed and prepared effectively for machine learning models.
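Again purely as an illustration, a minimal sketch of a feature-engineering step of the sort described above, materializing per-user features in BigQuery through its Python client; the project, dataset, table, and column names are assumptions for the example.

    from google.cloud import bigquery

    client = bigquery.Client(project="example-project")  # hypothetical project

    # Derive simple per-user behavioural features for downstream model training.
    feature_sql = """
    CREATE OR REPLACE TABLE analytics.user_features AS
    SELECT
      user_id,
      COUNT(*) AS events_30d,
      COUNTIF(event_type = 'purchase') AS purchases_30d,
      TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), MAX(ts), DAY) AS days_since_last_event
    FROM analytics.events
    WHERE ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
    GROUP BY user_id
    """
    client.query(feature_sql).result()  # blocks until the feature table is written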
Implement Automated Testing
Design and implement automated unit, integration, and performance testing frameworks to ensure high-quality, reliable, and scalable data solutions.
Ensure data workflows are tested for accuracy, reliability, and compliance with organizational standards.
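For illustration, a minimal sketch of the kind of automated unit test implied here, written with pytest against a hypothetical transformation function (both the function and its rules are invented for the example).

    import pytest

    def normalize_event(raw: dict) -> dict:
        """Hypothetical transformation under test: require a user_id and
        lower-case the event type."""
        if not raw.get("user_id"):
            raise ValueError("missing user_id")
        return {"user_id": raw["user_id"], "event_type": raw["event_type"].lower()}

    def test_event_type_is_lowercased():
        assert normalize_event({"user_id": "u1", "event_type": "CLICK"})["event_type"] == "click"

    def test_records_without_user_id_are_rejected():
        with pytest.raises(ValueError):
            normalize_event({"event_type": "CLICK"})

In a CI/CD setup, tests like these would typically run on every commit before a pipeline change is deployed.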
Optimize Data Workflows
Optimize data workflows for performance, cost efficiency, and scalability in large, complex data environments.
Continuously monitor and improve the performance of data pipelines to handle large datasets effectively.
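As one concrete (and hypothetical) example of this kind of optimization, a large BigQuery table can be re-materialized with partitioning and clustering so that queries scan only the slices they need; the names below are placeholders.

    from google.cloud import bigquery

    client = bigquery.Client(project="example-project")  # hypothetical project

    # Partition by event date and cluster by common filter keys to cut scan cost.
    ddl = """
    CREATE OR REPLACE TABLE analytics.events_optimized
    PARTITION BY DATE(ts)
    CLUSTER BY user_id, event_type AS
    SELECT * FROM analytics.events
    """
    client.query(ddl).result()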
Mentor Team Members
Mentor junior team members on best practices in data engineering, guiding them on data principles, patterns, and processes.
Foster a collaborative environment that encourages skill development and knowledge sharing.
Draft And Review Documentation
Draft and review architectural diagrams, interface specifications, and other design documents to ensure clear communication and understanding of technical solutions.
Ensure documentation is thorough, up-to-date, and easily accessible for the team.
Cost/Benefit Analysis
Present opportunities for improvement, providing cost/benefit analysis to leadership to guide informed, scalable, and efficient data architecture decisions.
Experience
Required Qualifications
- 4 years of professional Data Development experience.
- 4 years of hands-on experience with SQL and NoSQL technologies (e.g., Cassandra, MongoDB).
- 3 years of experience building and maintaining data pipelines and workflows.
- 5 years of experience with Java development.
- 2 years of experience developing with Python for data-related tasks.
- 3 years of experience with Kafka and real-time streaming data solutions.
- 2 years of experience in feature engineering for machine learning pipelines.
- Experience with GCP services such as BigQuery, Vertex AI Platform, Cloud Storage, AutoMLOps, and Dataflow.
- Strong understanding of ETL processes, data warehousing, and data integration techniques.
- Familiarity with CI/CD pipelines and automated testing frameworks.
- Expertise in version control tools like Git and experience with GitHub Actions.
- Strong understanding of Agile principles, preferably Scrum, and ability to work in an Agile environment.
Preferred Qualifications
Streaming Technologies
- Knowledge of streaming technologies such as Spark Structured Streaming, Kafka, EventHub, or similar.
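For a flavour of the streaming side, a minimal sketch of a Kafka consumer using the kafka-python library; the topic name, broker address, and consumer group are placeholders, not details of this role's environment.

    import json

    from kafka import KafkaConsumer  # kafka-python package

    consumer = KafkaConsumer(
        "orders",                              # hypothetical topic
        bootstrap_servers=["localhost:9092"],  # hypothetical broker
        group_id="orders-etl",                 # hypothetical consumer group
        auto_offset_reset="earliest",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        # Each message.value is the parsed JSON payload of one event.
        print(message.value)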
Cloud & Data Technologies
- Familiarity with GitHub SaaS, Databricks, and PySpark for data processing and analysis.
- Experience with Spark development and knowledge of distributed computing.
- Experience integrating machine learning models with data pipelines, especially in cloud environments.
Certifications (Optional)
Google Professional Data Engineer (Preferred)
AWS Certified Big Data – Specialty (Preferred)
Databricks Certified Associate Developer (Preferred)
Education: Bachelor's Degree
Skills: automated testing frameworks, Kafka, data engineering, data pipeline development, NoSQL, Python, pipelines, data warehousing, CI/CD pipelines, GCP, ETL processes, GCP services, Java, Git, Databricks, data integration techniques, machine learning, SQL, Agile principles
Salary: $40 - $45