What are the responsibilities and job description for the Senior Data Engineer - ML position at Harnham?
SENIOR DATA ENGINEER
COMPUTER VISION STARTUP
$150,000 - $175,000 + EQUITY
REMOTE (CST or EST)
THE COMPANY
Our client is revolutionizing operational safety in critical industries. They specialize in using computer vision to provide real-time insights for the construction industry. Their mission is to eliminate failures caused by inadequate visual inspection, ensuring safety, reliability, and compliance.
ROLE OVERVIEW - Senior Data Engineer
As a Senior Data Engineer, you'll be at the forefront of developing and optimizing scalable data pipelines that support machine learning and analytics applications. You'll architect robust data infrastructure, collaborate with ML engineers, and ensure the efficient storage and processing of large-scale industrial and vision data. This is a fully remote role with an emphasis on working with an experienced, senior engineering team. Further responsibilities include:
- Design & Build Scalable Data Pipelines: Leverage tools like Airflow to develop efficient data pipelines for large-scale industrial and vision data.
- Optimize Storage Solutions: Improve the storage and management of large datasets, ensuring scalability and performance.
- Develop Real-Time & Batch Data Ingestion Frameworks: Handle ingestion of images, videos, and metadata for both real-time and batch processing.
- Collaborate with ML Engineers: Structure data for optimal model training and experimentation.
- Deploy & Manage Data Pipelines: Use Kubernetes to orchestrate data pipelines efficiently.
- Automate CI/CD Workflows: Utilize GitLab CI/CD for automating deployments of data infrastructure.
- Implement Data Governance Best Practices: Ensure data quality, lineage tracking, and effective metadata management.
- Provide Technical Leadership: Guide future scaling strategies for data infrastructure.
IDEAL CANDIDATE PROFILE
- Expertise in Python (Pandas, PyArrow, Dask, etc.).
- Hands-on experience with data orchestration tools (Dagster, Prefect, Airflow).
- Proficient in Kubernetes for data pipeline orchestration.
- Experience deploying infrastructure using Terraform or similar IaC tools.
- Cloud experience (preferably AWS: S3, EKS, Lambda, Glue, RDS).
- Familiarity with streaming and event-driven architectures (Kafka, Kinesis, Pulsar, etc.).
- Strong skills in databases (SQL, NoSQL, Postgres, BigQuery, ClickHouse).
- CI/CD expertise with GitLab.
- Exposure to ML workflows is a plus, especially working closely with ML teams.
COMPENSATION & BENEFITS
As a Senior Data Engineer, you will earn up to $175,000 plus equity and benefits.
HOW TO APPLY
Please register your interest by sending your resume to Danny Macdonald via the Apply link on this page.
KEYWORDS
Data Engineer, Machine Learning, Data Pipelines, Dagster, AWS, Kubernetes, Terraform, Python, CI/CD, Data Governance, Cloud Infrastructure
Salary: $175,000