Staff Software Engineer - ML Systems & Performance

Alldus
San Francisco, CA Full Time
POSTED ON 3/3/2025
AVAILABLE BEFORE 6/3/2025

My client is looking for an experienced Staff Software Engineer to play a key role in optimizing machine learning infrastructure for generative models. This position involves designing and implementing innovative model-serving solutions on a proprietary inference engine, with a focus on improving efficiency, reducing latency, and maximizing throughput. You’ll also be responsible for developing monitoring and profiling tools to diagnose performance bottlenecks and drive system-level optimizations. This role offers the opportunity to collaborate with applied ML researchers and industry leaders to ensure their workloads are fully optimized for high-performance acceleration.

Responsibilities:

  • Advance the performance of generative media models by optimizing model-serving infrastructure.
  • Design and implement next-generation model-serving architectures that improve efficiency, reduce processing delays, and optimize resource usage.
  • Develop performance analysis tools to identify system inefficiencies and propose enhancements.
  • Work closely with ML researchers and technical teams to ensure optimal acceleration for demanding workloads.

Requirements:

  • Strong background in systems programming and performance tuning, with experience in identifying and resolving bottlenecks.
  • Expertise in modern ML infrastructure tools such as PyTorch, TensorRT, and TransformerEngine, as well as experience with model optimization techniques like quantization and compilation.
  • Deep understanding of hardware acceleration, particularly NVIDIA GPU architectures, and the ability to implement low-level optimizations when needed (e.g., developing custom GEMM kernels using CUTLASS).
  • Familiarity with Triton or similar inference frameworks, along with experience in optimizing model execution on accelerators.
  • Knowledge of advanced model parallelism strategies, including hybrid approaches that integrate multiple parallelism techniques.
  • Understanding of cutting-edge ML performance optimizations, such as Ring Attention, FlashAttention-3 (FA3), and optimized MLP implementations.

This role is ideal for someone passionate about pushing the limits of ML performance and working on next-generation AI infrastructure.
