
Senior HPC engineer, Research infrastructure

Luma AI
Palo Alto, CA Full Time
POSTED ON 4/7/2025
AVAILABLE BEFORE 6/6/2025

Help Luma build some of the biggest and fastest AI supercomputing clusters in the world! As a High-Performance Computing engineer, you’ll work at the intersection of hardware and software, designing systems that deliver the maximum possible performance for running large-scale AI models. We work at the very cutting edge of speed and scale, bringing the traditions of High-Performance Computing (HPC) into a modern cloud environment.


For this role, it’s important that you understand how to combine CPUs, GPUs, and network devices into systems that are then deployed at large scale and run at peak efficiency. You understand the lowest levels of the software platforms that sit on top of this hardware, including how best to optimize the Linux kernel and user-space code. You are capable of writing code to automate the monitoring and healing of these systems, so that a few people can command a large number of servers.



Responsibilities
  • In this role, you will work closely with and directly accelerate machine learning researchers, but you don't need to be a machine learning expert yourself.
  • We value people who can quickly obtain a deep technical understanding of new domains and enjoy being self-directed and identifying the most important problems to solve. 
  • You’ll manage Luma’s HPC training clusters, from provisioning to performance tuning.
  • Areas of work will include observability, distributed job tracing, GPU diagnostics, software environment management, and additional tooling, plus work on the actual code to enable necessary features.
  • We believe that increasing compute is a huge lever to AI progress. You will have a direct impact on our ability to grow to an unprecedented scale and likewise produce unprecedented results.


Experience
  • 8 years of experience as an infrastructure engineer or in DevOps, working on large and complex distributed systems.
  • Deep understanding of networking; bonus points for experience in HPC networking.
  • Experience developing high-quality software in a general-purpose programming language, preferably including Python.
  • Excellent problem-solving skills and attention to detail.
  • Experience with GPUs in large-scale clusters is strongly preferred.
  • Strong knowledge of observability and monitoring in distributed systems.
  • Tenacious at troubleshooting hardware and network topology failures in distributed systems.
  • Independently driven and able to own problems and build solutions end-to-end.
  • Experience with large-scale data center operations and proficiency in cloud orchestration and system tools.


Compensation
  • In addition to cash base pay, you'll also receive a sizable grant of Luma's equity.
  • The pay range for this position is $180,000 - $220,000/yr for the Bay Area. Base pay offered will vary depending on job-related knowledge, skills, candidate location, and experience.



Your application is reviewed by real people.

Salary : $180,000 - $220,000


