
Software Engineer - Cloud Engineering, Kubernetes

Kumo
Mountain View, CA | Full Time
POSTED ON 4/10/2025
AVAILABLE BEFORE 6/10/2025

The Cloud Infrastructure team at Kumo is responsible for managing and scaling our Kubernetes-based, cloud-native AI platform across multiple cloud providers. The team sets service-level objectives, optimizes resource allocation, enforces security compliance, and drives cost efficiency for the Multi-Cloud Platform.


As a key team member, you will architect and operate a highly scalable, resilient Kubernetes infrastructure to support massive Big Data and AI workloads. You’ll design and implement advanced strategies for cluster management and fleet capacity scaling, optimize workload scheduling, and enhance observability at scale. Your expertise in Kubernetes internals, networking, and performance tuning will be critical in ensuring high availability and seamless scaling.


Joining early, you'll play a pivotal role in shaping platform reliability, automating infrastructure, and enabling ML engineers with efficient commit-to-production automation, Continuous Provisioning, CI/CD, ML Ops, and deployment orchestration and workflows. You'll collaborate with ML scientists, product engineers, and leadership to influence scaling strategies, develop self-service tooling, and drive multi-cloud resilience. Engineers at Kumo take ownership of core system design, building infrastructure that powers the next generation of AI applications.



Key Responsibilities
  • Design, build, and scale Kubernetes-based infrastructure to support Kumo’s multi-cloud AI platform, ensuring high availability, resilience, and performance.
  • Architect and optimize large-scale Kubernetes clusters, improving scheduling, networking (CNI), and workload orchestration for production environments.
  • Develop and extend Kubernetes controllers and operators to automate cluster management, lifecycle operations, and scaling strategies.
  • Enhance observability, diagnostics, and monitoring by building tools for real-time cluster health tracking, alerting, and performance tuning.
  • Lead efforts to automate fleet management, optimizing node pools, autoscaling, and multi-cluster deployments across AWS, GCP, and Azure.
  • Define and implement Kubernetes security policies, RBAC models, and best practices to ensure compliance and platform integrity.
  • Collaborate with ML engineers and platform teams to optimize Kubernetes for machine learning workloads, ensuring seamless resource allocation for AI/ML models.
  • Drive commit-to-production automation, cloud connectivity, and deployment orchestration, ensuring seamless application rollouts, zero-downtime upgrades, and global infrastructure reliability.


Required Skills and Experience
  • Kubernetes Mastery: 5-7 years of experience managing large-scale Kubernetes clusters (EKS, GKE, AKS, or open source) in production. Deep expertise in Kubernetes internals, including controllers, operators, scheduling, networking (CNI), and security policies.
  • Cloud-Native Infrastructure: 5-7 years of experience building cloud-native Kubernetes-based infrastructure across AWS, Azure, and GCP.
  • Platform Engineering: 5-7 years of experience building Kubernetes service meshes (Istio/Envoy, Traefik), networking policies (Calico/Tigera), and distributed ingress/egress control.
  • Fleet Management & Scaling: Proven experience in optimizing, scaling, and maintaining Kubernetes clusters across multi-cloud environments, ensuring high availability and performance.
  • Software Development: 5-7 years of experience writing production-grade controllers and operators in Python, Go, or Rust to extend Kubernetes functionality.
  • Infrastructure-as-Code & Automation: Hands-on experience with Terraform, CloudFormation, Ansible, and Bash and Make scripting to automate Kubernetes cluster provisioning and management.
  • Distributed Systems & SaaS: Expertise in building and operating large-scale distributed systems for cloud-native B2B SaaS applications running on Kubernetes.
  • Cloud Application Deployment: Deep expertise in building container orchestration, workload scheduling, and runtime optimizations using Kubernetes, Argo, or Flux.
  • Education: BS/MS in Computer Science or a related field (PhD preferred)


Nice to Have
  • Proficiency with cloud platforms such as AWS, GCP, or Azure.
  • Familiarity with chaos engineering tools and practices for testing system resilience.
  • Strong understanding of security best practices and compliance standards (GDPR, SOC2, ISO27001, vulnerability assessments, GRC, risk management).
  • Contributions to open-source projects, particularly in the Kubernetes or cloud-native ecosystem.
  • Expertise in Docker, Kubernetes, Jenkins, Flux, Argo, and Terraform in a Linux environment.
  • Hands-on experience with monitoring and observability tools such as Prometheus and Grafana.
  • Ability to develop customer-facing web frontends or public APIs/SDKs for platform services.


Benefits
  • Competitive salary and equity options.
  • Comprehensive medical and dental insurance.
  • An inclusive, diverse work environment where all employees are valued and supported.



We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.


