
Engineering Manager, Inference Scalability and Capability

Anthropic
San Francisco, CA (Full Time)
POSTED ON 3/3/2025
AVAILABLE BEFORE 4/28/2025

About the role:

We are seeking an experienced Engineering Manager to join our Inference Scalability and Capability team. This team is responsible for building and maintaining the critical systems that serve our LLMs to a diverse set of consumers. As the cornerstone of our service delivery, the team focuses on scaling inference systems, ensuring reliability, optimizing compute resource efficiency, and developing new inference capabilities. The team tackles complex distributed systems challenges across our entire inference stack, from optimal request routing to efficient prompt caching.

Responsibilities:

  • Build and lead a high-performing team of engineers through technical mentorship, strategic hiring, and creating an environment that fosters innovation 
  • Drive operational excellence of inference systems (deployments, auto-scaling, request routing, monitoring) across cloud providers
  • Facilitate development of advanced inference features (e.g., prompt caching, constrained sampling, fine-tuning) 
  • Partner deeply with research teams to productionize new models, infrastructure teams to optimize hardware utilization, and product teams to deliver customer-facing features
  • Create clear technical roadmaps and execution strategies in a fast-moving environment while managing competing priorities

You may be a good fit if you:

  • Have 5 years of experience leading large-scale distributed systems teams
  • Excel at building high-trust environments and helping teams navigate technical uncertainty while maintaining velocity
  • Have a demonstrated ability to recruit, scale, and retain engineering talent
  • Possess outstanding communication and leadership skills
  • Show a deep commitment to advancing AI capabilities responsibly
  • Have a strong technical background enabling you to make architectural decisions and guide technical direction

Strong candidates may also have experience with:

  • Implementing and deploying machine learning systems at scale
  • LLM inference optimization, including batching and caching strategies
  • Cloud-native architectures, containerization, and deployment across multiple cloud providers
  • High-performance computing environments and hardware acceleration (GPU, TPU, Trn)

Deadline to apply: None. Applications will be reviewed on a rolling basis. 




