What are the responsibilities and job description for the Inference Performance Engineer position at Acceler8 Talent?
Join Our Inference Performance Team: Optimizing Foundation Models On-Device
We’re building the future of on-device AI by making foundation models smarter, faster, and more efficient. As part of the Inference Performance Team, you’ll work on challenging, high-impact projects to push the limits of what’s possible with foundation model inference.
What You’ll Do
- Pinpoint performance bottlenecks and navigate quality-performance trade-offs in reference implementations (e.g., openai/whisper) and our optimized frameworks.
- Design, prototype, and test performance improvements tailored to meet enterprise customer needs.
- Drive innovation in our open-source inference frameworks by pitching and delivering new ideas.
- Help expand support to new platforms: currently focused on Apple, actively growing into Android and Linux, with Windows coming soon.
- Collaborate with ML Research Engineers to turn theoretical advances into practical, real-world optimizations.
Core Qualifications:
Why This Role?
You’ll play a critical role in advancing the performance of foundation models across platforms like Apple, Android, and Linux—shaping the future of efficient, scalable on-device AI.