What are the responsibilities and job description for the ML Performance Engineer position at Oumi?
About Oumi
Why we exist: Oumi is on a mission to make frontier AI truly open for all. We are founded on the belief that AI will have a transformative impact on humanity, and that developing it collectively, in the open, is the best path forward to ensure that it is done efficiently and safely.
What we do: Oumi provides an all-in-one platform to build state-of-the-art AI models, end to end, from data preparation to production deployment, empowering innovators to build cutting-edge models at any scale. Oumi also develops open foundation models in collaboration with academic partners and the open community.
Our Approach: Oumi is fundamentally an open-source-first company, with open collaboration across the community as a core principle. Our work is:
- Open Source First: Our entire platform and core technology are open source
- Research-driven: We conduct and publish original research in AI, working with our community of academic research labs and other collaborators
- Community-powered: We believe in the power of open collaboration and welcome contributions from researchers and developers worldwide
The ML Performance Engineer will be an integral part of Oumi's research team, focusing on optimizing and accelerating the training and inference of AI models. This role involves developing efficient CUDA/Triton kernels, contributing to open-source projects, and collaborating with researchers and engineers to improve model performance. Engineers at Oumi work on various aspects of model acceleration, including kernel optimization, memory management, and performance profiling.
What you’ll bring:
- ML Performance: Demonstrated experience optimizing models and training and inference pipelines, plus familiarity with profiling tools (Nsight, nvprof)
- Programming Skills: Strong programming skills in one of Python, C, or Rust
- Systems Knowledge: Familiarity with low-level operating system fundamentals, PyTorch internals, and GPU architectures is highly desirable
- ML Expertise: Deep understanding of machine learning and deep learning concepts, with specific knowledge of large language models (LLMs)
- Open Source: Familiarity with open-source projects and a passion for contributing to the open-source community
- Values: You share Oumi's values: Beneficial for all, Customer-obsessed, Radical Ownership, Exceptional Teammates, Science-grounded
Benefits:
- Competitive salary: $120,000 - $220,000
- Equity in a high-growth startup
- Comprehensive health, dental and vision insurance
- 21 days PTO
- Regular team offsites and events