What are the responsibilities and job description for the Machine Learning Engineer - Inference Systems position at Alldus?

We are on the lookout for a skilled and enthusiastic Machine Learning Engineer to join our innovative ML Inference team! In this exciting role, you will have the chance to contribute to the development of cutting-edge technologies in ML / LLM inference and serving. Work alongside a talented team dedicated to building and enhancing next-generation Large Language Model (LLM) Inference Engines.

Key Responsibilities :

Develop and Enhance Inference Engine : Design, implement, and optimize a state-of-the-art LLM Inference Engine. Integrate the latest inference techniques from AI research to reduce latency and increase throughput.
Performance Optimization : Execute deep performance optimizations across the technology stack, including PyTorch, C++, and CUDA. Analyze and enhance system performance to address diverse use cases effectively.
Customer Collaboration : Engage with clients to comprehend their specific performance needs and tailor solutions. Provide technical expertise to ensure seamless deployment and operation of inference systems.
Technical Leadership : Shape the roadmap and vision for our inference technologies. Spearhead initiatives aimed at fostering innovation and ensuring our solutions remain competitive.
Infrastructure Development : Collaborate with team partners to build and sustain scalable, multi-replica serving infrastructure. Ensure the reliability and scalability of our LLM serving systems to accommodate growing workloads.

Qualifications :

Technical Skills : Proficient in systems programming with languages like C++. Strong experience with machine learning frameworks, particularly PyTorch. Expertise in GPU programming and CUDA for optimizing performance. Solid grasp of AI / ML concepts, especially related to large language models.

Experience : Proven track record in developing and optimizing ML / LLM inference systems. Demonstrated ability to translate research advancements into effective production systems. Experience in performance tuning and profiling across various tech stacks. Familiarity with vLLM is a plus.
