Inference Performance Engineer

Acceler8 Talent
Santa Rosa, CA Full Time
POSTED ON 3/4/2025
AVAILABLE BEFORE 5/11/2025

Join Our Inference Performance Team: Optimizing Foundation Models On-Device

We’re building the future of on-device AI by making foundation models smarter, faster, and more efficient. As part of the Inference Performance Team, you’ll work on challenging, high-impact projects to push the limits of what’s possible with foundation model inference.

What You’ll Do

  • Pinpoint performance bottlenecks and navigate quality-performance trade-offs in reference implementations (e.g., openai/whisper) and our optimized frameworks.
  • Design, prototype, and test performance improvements tailored to meet enterprise customer needs.
  • Drive innovation in our open-source inference frameworks by pitching and delivering new ideas.
  • Help expand support to new platforms—currently focused on Apple but actively growing into Android, Linux, and soon Windows.
  • Collaborate with ML Research Engineers to turn theoretical advances into practical, real-world optimizations.

Core Qualifications:

  • 3 years of industry experience working on technically challenging problems.
  • Proficiency in Python or C/C++.
  • Experience with CUDA, OpenCL, or Metal.
  • A strong understanding of hardware acceleration (GPUs, NPUs, TPUs, CPUs).
  • Familiarity with modern ML frameworks like TensorFlow, PyTorch, Core ML, or ONNX.
  • Expertise in GPU kernel programming.
  • Contributions to major ML frameworks or open-source projects.
Why This Role?

You’ll play a critical role in advancing the performance of foundation models across platforms like Apple, Android, and Linux, shaping the future of efficient, scalable on-device AI.
