What are the responsibilities and job description for the Site Reliability Engineer position at Wheeler Staffing Partners?

Site Reliability Engineer

Location: Fully Remote (Priority to Candidates in TX, Open to VA, NV, FL, PA, NJ, MO, NC)

Employment Type: Direct Hire

Salary: Up to $150,000 (Flexible Based on Experience)

Sponsorship: Client cannot sponsor or work with C2C

About the Role

Wheeler Staffing Partners is seeking a Site Reliability Engineer (SRE) for our client. This role is critical to maintaining and enhancing the reliability, scalability, and performance of mission-critical systems. The ideal candidate has hands-on experience with AWS cloud environments, Infrastructure as Code (IaC), automation tools, containerized workloads, and monitoring systems.

The SRE will work closely with Engineering teams to identify and resolve performance bottlenecks, optimize system reliability, and drive automation for infrastructure operations.

This is a fully remote role, but priority will be given to candidates located in Texas. Candidates in VA, NV, FL, PA, NJ, MO, and NC will also be considered.

Key Responsibilities

Ensure System Reliability & Performance – Maintain and improve the uptime, performance, and scalability of cloud and on-premise infrastructure.

Develop & Automate Processes – Build and implement automation tools for monitoring, deployment, and incident response to reduce manual interventions.

Monitor & Troubleshoot Issues – Use observability tools to proactively detect, diagnose, and resolve infrastructure and application performance issues.

Optimize Cloud & On-Prem Infrastructure – Manage and optimize AWS cloud environments, ensuring cost efficiency and security.

Enhance Disaster Recovery & Resilience – Implement backup strategies, failover systems, and incident response protocols to minimize downtime and data loss.

Collaborate with DevOps & Engineering Teams – Work closely with software engineers, data scientists, and IT teams to enhance system architecture and streamline deployment pipelines.

Security & Compliance – Ensure system security, data integrity, and compliance with regulatory requirements.

Capacity Planning & Scaling – Analyze system performance trends and plan for future scalability needs.

Incident Management & Post-Mortems – Lead incident response efforts, document root causes, and implement preventive measures.

Continuous Improvement – Identify bottlenecks and inefficiencies in infrastructure and implement best practices to enhance reliability.

Required Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field (equivalent experience may be considered).

3 years of experience in a Site Reliability Engineering (SRE), DevOps, or Infrastructure Engineering role.

Strong expertise in AWS cloud services and Infrastructure as Code (IaC) tools, including:

AWS Cloud Development Kit (CDK)

AWS CloudFormation

Experience with CI/CD tools, such as:

Jenkins, GitHub Actions

Proficiency in containerization and orchestration tools like:

Docker

Strong understanding of:

Load balancers, REST APIs, networking (IP management, subnetting), HA architecture

Serverless cloud computing models

Proficiency in cloud monitoring and observability tools, such as:

AWS CloudWatch, EFK Stack, OpenTelemetry, Datadog, Grafana, New Relic

Ability to define and track golden metrics and establish meaningful alerting thresholds.

Strong analytical skills with experience in root cause analysis and incident management.

Excellent communication and collaboration skills to work across teams.

Preferred Qualifications

Cloud-related certifications such as:

AWS Certified DevOps Engineer

Certified Kubernetes Administrator (CKA)

Experience with Agile methodology, or a willingness to learn it.

Why This Role?

Fully remote opportunity with priority given to Texas-based candidates.

High-impact role in maintaining and scaling mission-critical infrastructure.

Competitive salary with flexibility based on experience.

Opportunity to work with cutting-edge cloud technologies, automation tools, and monitoring platforms.

Collaborate with engineering teams to drive innovation in system reliability.
