Site Reliability Engineer

hireVouch
Remote, OR Remote Full Time
POSTED ON 4/20/2025
AVAILABLE BEFORE 5/16/2025

Senior Site Reliability Engineer

Position Overview

We are a mid-size entertainment company delivering captivating digital experiences to millions of customers worldwide. Our IT organization powers the infrastructure and systems behind our cutting-edge payroll and accounting applications. We are seeking a Senior Site Reliability Engineer (SRE) to enhance the performance, scalability, and reliability of our infrastructure and help bring our next-generation solutions to life.

As a Senior Site Reliability Engineer, you will ensure the reliability and scalability of our infrastructure. You will leverage your skills in cloud technologies, infrastructure operations, Kubernetes orchestration, application development, database administration, and Oracle E-Business Suite (EBS) to maintain robust infrastructure that supports business-critical platforms. This role also involves collaborating with cross-functional teams to implement engineering best practices, monitoring, and automation while exploring opportunities to enhance operations with emerging AI technologies.

Key Responsibilities

  • Infrastructure as Code: Develop and maintain automated infrastructure provisioning with Terraform for hybrid cloud environments.
  • Cloud Expertise: Design and manage robust multi-cloud environments using AWS and Azure, with a focus on optimizing Kubernetes clusters (EKS and AKS).
  • Oracle E-Business Suite (EBS): Support, optimize, and ensure the reliability of Oracle EBS deployments, integrating them with other IT systems to maintain smooth business operations.
  • Operating Systems Management: Administer and optimize Linux (RHEL) and Windows Server environments to ensure high availability and security.
  • Application Performance: Collaborate with development teams to enhance applications built on React, Node.js, .NET, C#, and Java for reliability and performance.
  • Networking & Security: Leverage advanced AWS networking skills to implement secure and scalable architectures, including VPC design, load balancing, and advanced routing.
  • Database Optimization: Monitor and tune database performance and manage relational and NoSQL databases to support high-traffic entertainment services.
  • Monitoring & Troubleshooting: Implement observability tools and proactively address performance issues using platforms like Prometheus, Grafana, Splunk, or CloudWatch.
  • Incident Response & Automation: Lead incident management, postmortem reviews, and automation efforts to prevent recurrence and improve overall resilience.
  • Cross-Team Collaboration: Work closely with developers, system administrators, and security teams to align infrastructure needs with business and technical goals.

Qualifications

Required Technical Skills

  • Expert-level knowledge of Terraform for infrastructure automation.
  • Hands-on experience managing Azure Kubernetes Service (AKS) and Amazon Elastic Kubernetes Service (EKS) clusters.
  • Advanced knowledge of AWS and Azure cloud ecosystems, including networking, security, and cost optimization.
  • Proficiency in Linux (RHEL) and Windows Server environments.
  • Proven experience supporting and optimizing Oracle E-Business Suite (EBS) in a complex IT environment.
  • Proven application development experience with React, Node.js, .NET, C#, and Java.
  • Strong database administration and performance-tuning skills for both relational (e.g., MySQL, PostgreSQL, MSSQL) and NoSQL (e.g., DynamoDB, MongoDB) databases.
  • Advanced networking skills, including VPC design, transit gateways, and hybrid cloud connectivity.
  • Expertise in monitoring, logging, and troubleshooting tools such as New Relic, Prometheus, Grafana, Splunk, and CloudWatch.

Desired Soft Skills

  • Strategic thinking to design scalable and reliable systems for high-demand entertainment platforms.
  • Strong collaboration and mentorship abilities to guide teams in adopting SRE best practices.
  • Excellent communication skills to work with technical and non-technical stakeholders.
  • Adaptability to a fast-paced, dynamic environment.

Nice-to-Have Skills

  • Experience with  AI-powered Operations (AIOps)  to automate troubleshooting and predictive maintenance.
  • Experience in high-traffic or live-streaming applications.
  • Certifications such as AWS Certified Solutions Architect or Azure Solutions Architect Expert.
  • Familiarity with industry-specific compliance standards, e.g., SOC 2, GDPR.