What are the responsibilities and job description for the Machine Learning Engineer - Inference Systems position at Alldus?
We are on the lookout for a skilled and enthusiastic Machine Learning Engineer to join our innovative ML Inference team! In this exciting role, you will have the chance to contribute to the development of cutting-edge technologies in ML/LLM inference and serving. Work alongside a talented team dedicated to building and enhancing next-generation Large Language Model (LLM) Inference Engines.
Key Responsibilities:
- Develop and Enhance Inference Engine: Design, implement, and optimize a state-of-the-art LLM Inference Engine. Integrate the latest inference techniques from AI research to reduce latency and increase throughput.
- Performance Optimization: Execute deep performance optimizations across the technology stack, including PyTorch, C++, and CUDA. Analyze and enhance system performance to address diverse use cases effectively.
- Customer Collaboration: Engage with clients to understand their specific performance needs and tailor solutions. Provide technical expertise to ensure seamless deployment and operation of inference systems.
- Technical Leadership: Shape the roadmap and vision for our inference technologies. Spearhead initiatives aimed at fostering innovation and ensuring our solutions remain competitive.
- Infrastructure Development: Collaborate with team partners to build and sustain scalable, multi-replica serving infrastructure. Ensure the reliability and scalability of our LLM serving systems to accommodate growing workloads.
Qualifications: