Distributed ML Systems Engineer - Inference

Together AI
San Francisco, CA Full Time
POSTED ON 3/24/2025
AVAILABLE BEFORE 6/21/2025
Role

Together AI is seeking a Distributed ML Systems Engineer to design and build scalable machine learning systems that power our accelerated AI initiatives. This role involves developing large-scale, fault-tolerant distributed systems that handle high-load and high-performance requirements. If you are passionate about designing ML systems that operate at scale and eager to create impactful solutions, we want to hear from you. This position offers the chance to work closely with our AI researchers and infrastructure teams to ensure our systems are robust and efficient. Join us in shaping the future at Together AI!

Responsibilities

  • Design and build large-scale, distributed machine learning systems that are fault-tolerant and high-performance.
  • Develop and optimize distributed processing frameworks and storage systems.
  • Collaborate with researchers, engineers, and product managers to integrate ML systems into our infrastructure.
  • Conduct architecture and design reviews to ensure best practices in system design.
  • Implement robust monitoring and logging systems to ensure the health and performance of our ML systems.

Requirements

  • 3+ years of experience in building large-scale, fault-tolerant, high-performance distributed systems.
  • Strong programming skills in one or more of Python, Go, Rust, or C/C++.
  • Excellent understanding of low-level operating systems concepts, including multi-threading, memory management, networking, storage, performance, and scale.
  • Experience with cloud computing platforms (AWS, GCP, Azure, etc.) and large-scale infrastructure.
  • Strong problem-solving skills and ability to work in a fast-paced environment.
  • Preferred: Experience with Kubernetes
  • Preferred: Experience with PyTorch

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society. Together, we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI. Our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey to build the next-generation AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunities to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
