What are the responsibilities and job description for the MLOps Technical Specialist position at Acceler8 Talent?
Who we are:
Re-founded in March of 2024, we have assembled a team of kind, innovative, and collaborative professionals dedicated to shaping the future of enterprise AI. We celebrate diverse ideas and approaches as we tackle some of the most challenging problems in the industry.
Our flagship product is an empathetic conversational chatbot built on our advanced 350B frontier model, which has been refined through sophisticated fine-tuning, inference, and orchestration techniques. As we scale our solutions, our infrastructure must evolve to meet the rigorous demands of production environments.
About the Role
As an MLOps Technical Specialist on our ML Infrastructure team, you will play a vital role in designing, building, and operating the systems that power our machine learning workflows, from model training to production deployment. Your efforts will be essential in developing control planes and robust tools around ML services, ensuring our platform is scalable, secure, and resilient. We seek candidates with production operations experience, a strong open-source background, and hands-on expertise in managing distributed clusters.
This role is ideal for you if you:
- Have significant experience operating ML systems in production and building tools to manage them effectively.
- Are highly skilled in managing distributed clusters using technologies such as Kubernetes (K8s), SLURM, and Ray.
- Possess a robust open-source background, ideally from top-tier companies, and are comfortable utilizing both community-driven and proprietary solutions.
- Are knowledgeable about security best practices for safeguarding production systems, even if security is not your exclusive focus.
- Thrive in dynamic, innovative environments where pushing the boundaries of ML infrastructure is a daily challenge.
Responsibilities include: