What are the responsibilities and job description for the Site Reliability Engineer position at Chabez Tech LLC?

Job Details

Job Title: Site Reliability Engineer (No DevOps Profile)

Location: Atlanta, GA (Hybrid, 3 Days Onsite)

Experience: 13 Years

Job Summary:

We are seeking a highly skilled Site Reliability Engineer (SRE) with 6-10 years of experience to manage and enhance the reliability, performance, and scalability of enterprise infrastructure. The ideal candidate will have expertise in Kubernetes (K8s), Envoy, REST/gRPC/HTTP, OpenTelemetry (OTEL), networking, Python, observability, RAG, and LLMs. This role requires a proactive approach to troubleshooting, process improvement, and infrastructure automation.

Key Responsibilities:

Resolve incidents within SLA and escalate critical issues.
Perform alert analysis and enhance monitoring.
Troubleshoot networking, Kubernetes, and application issues.
Strengthen observability (Prometheus, Datadog, Grafana).
Document and improve SOPs.
Mentor team members and assist with migrations.
Ensure ITSM compliance and automate infrastructure using Terraform, Ansible, and CloudFormation.

Key Skills & Experience:

8 years in SRE.
5 years with Kubernetes (K8s), Envoy, OTEL, Networking, Python, Observability, RAG, LLM.
Strong REST/gRPC/HTTP API and container orchestration experience.
Proficiency in Terraform, Ansible, CloudFormation, and CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, ArgoCD).
Expertise in troubleshooting distributed systems and ITIL-based incident management.
Cloud experience (AWS, Azure, Google Cloud Platform).
Strong communication and collaboration skills.

Client Needs:

Experienced SRE with expertise in Kubernetes, networking, observability, and automation (Terraform, Ansible). Strong troubleshooting and process improvement skills.



