What are the responsibilities and job description for the Site Reliability Engineer (SRE) – Kubernetes and Automation Specialist position at Sunrise Group Inc.?
We are seeking a Site Reliability Engineer (SRE) with expertise in Kubernetes and Automation to optimize our client's cloud-based infrastructure. This role focuses on reducing manual interventions, enhancing reliability, and automating operational tasks.
Role : Site Reliability Engineer (SRE) - Kubernetes and Systems Automation Specialist
Experience : 6-9 Years
Location : Las Vegas, NV
Duration : 6-Month Contract
Key Responsibilities :
Kubernetes Management : Deploy and manage Kubernetes clusters, optimize configurations, and implement CI/CD pipelines.
Automation : Develop Infrastructure as Code solutions (Terraform, Ansible) and automate tasks like monitoring and scaling.
Performance Optimization : Set up monitoring systems (Prometheus, Grafana) and conduct root cause analyses to improve reliability.
Collaboration : Work with development teams to design scalable, fault-tolerant systems.
Training : Guide teams on Kubernetes and automation best practices.
Qualifications :
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
Proven experience managing Kubernetes in production environments.
Proficiency in Infrastructure as Code tools and scripting languages (Python, Bash, Go).
Strong understanding of Linux/Unix systems, networking fundamentals, and cloud platforms (AWS, Azure, GCP).
Preferred : Certified Kubernetes Administrator (CKA) and familiarity with service mesh technologies (e.g., Istio).
Additional Requirements : Must obtain and maintain a valid Nevada Gaming License.