What are the responsibilities and job description for the Founding Senior Data / ML Infrastructure Engineer position at Coco Robotics?
At Coco, our mission is to revolutionize urban logistics by empowering cities, boosting local economies, and delivering delightful customer experiences. We connect people with local restaurants through our fleet of on-demand delivery robots, helping merchants reach their customers faster and more efficiently. By building innovative robotic systems that seamlessly navigate city sidewalks, Coco plays a key role in reshaping the future of last-mile delivery and enhancing local businesses.
To deliver on our mission, we are building an autonomy team to develop the AI technology that will enable our robot pilots to scale efficiently, sustainably, and safely. This involves building an autonomy stack from the ground up, based on our millions of miles of last-mile delivery routes, proprietary video streams, and LiDAR data.
What is the scope of this role?
As a Founding Data & ML Infrastructure Engineer, you will stand up Coco's autonomy stack alongside the CTO and fellow members of the autonomy team. You will develop and maintain the infrastructure that supports the collection, processing, and management of large-scale datasets for our autonomous robots, and the training of models on that data. The impact will be a massive improvement to our robot-to-pilot ratio, allowing everyone living in an urban area to benefit from last-mile delivery. In this role, you will:
- Design and implement a high-performance data engine to mine and identify valuable data samples that enhance model training.
- Build tools and pipelines for automatically extracting, cleaning, and curating data from various sources (sensors, logs, real-world interactions).
- Enable seamless interaction with large-scale datasets, ensuring that the team can quickly retrieve and analyze data to drive insights.
- Collaborate with autonomy and AI engineers to develop the query layer and workflows for training and testing models.
- Build and maintain dataset management tooling, including tools for data exploration, versioning, and interaction.
- Architect and manage the infrastructure for model training and experimentation, continuously optimizing data pipelines and infrastructure for cost, scalability, and speed.
- Create and maintain systems for dataset tracking and governance to ensure consistent and reproducible experiments.
Must-have competencies: