
Staff Site Reliability Engineer, AI Platform

Tesla Motors
Palo Alto, CA Full Time
POSTED ON 3/3/2025
AVAILABLE BEFORE 5/1/2025

Job Details

As a Site Reliability Engineer (SRE) on the AI Platform team, you will manage bleeding-edge bare-metal servers for Tesla's advanced generative AI platform. You will be responsible for the imaging, configuration management, observability, security, and scalability of these systems, and you will also manage the model benchmarks and their outputs. You should focus on automating whatever the AI Platform team requires and use a range of platforms to make it as easy as possible for the team's software engineers to run their services reliably on the bare-metal platform.

Responsibilities
  • Help image bare-metal servers
  • Build tooling around bare-metal imaging, evaluate its usage, and help ensure its reliability, availability, and security
  • Design software and systems that enable the generative AI platform at Tesla
  • Assist the AI Platform team with onboarding and integrating services into the Tesla stack (Kubernetes/VMware/bare-metal)
  • Ensure best practices and observability of the service, such as metrics, logging, tracing, and alerting
  • Automate configuration and deployment of services
  • Consult on and design infrastructure, systems, and software architecture

Requirements
  • Experience with bare-metal imaging and management
  • Expert skills in Linux and its administration (Ubuntu 22.04/24.04)
  • Experience in a high-level language such as Go, Python and/or Java
  • Observability (OpenTelemetry, Prometheus, AlertManager, Grafana, Jaeger, and Splunk)
  • Infrastructure as Code (Ansible) and CI/CD pipeline experience (GitHub Actions, Jenkins)
  • Artifact management (Artifactory)
  • Strong bias for action over endless planning; willing to get hands dirty and occasionally make mistakes
  • Habitual documenter and spreader of knowledge
  • Willing to mentor other team members and engineers with less SRE experience
  • Comfortable on an on-call rotation and doing live troubleshooting of issues on NOC bridges/outage calls

Compensation and Benefits
Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits at day 1 of hire:
  • Aetna PPO and HSA plans: 2 medical plan options with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans; both have options with a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the high-deductible Aetna medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid Basic Life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and Vacation time (Flex time for salary positions), and Paid Holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program

Expected Compensation

$164,480 - $355,920 annual salary + cash and stock awards + benefits

Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

