What are the responsibilities and job description for the Senior Site Reliability Engineer position at Culver Careers (CulverCareers.com)?
Join Our Global Team and Shape the Future of Technology!
Who We Are:
We are a global leader in consumer networking, electronics, enterprise solutions, and software services, with dual headquarters driving our worldwide impact. Our mission is to connect people globally through technology by creating reliable, high-performance products. Our commitment to professionalism, innovation, excellence, and simplicity empowers clients to achieve success and consumers to enjoy a seamless lifestyle. Our new global R&D Center in California fuels our innovations in next-gen networking, IoT smart home products, and transformative software services.
We're looking for a passionate and experienced Sr. Site Reliability Engineer to join our team and play a crucial role in ensuring our cloud platform's security, reliability, scalability, and operational excellence.
Position: Sr. Site Reliability Engineer
Responsibilities:
- Technical SME: Implement and operate microservices on Kubernetes-based cloud platforms.
- Collaborate: Work with Cloud Technical Development and DevOps teams to deploy services to the Multi-Cloud Platform.
- Testing: Perform Load Tests and Chaos Tests for scalability and reliability.
- Observability: Build observability for microservices and cloud platforms such as AWS, OCI, Azure, and GCP.
- Disaster Recovery: Write and execute plans in collaboration with Development and DevOps teams.
- Risk Analysis: Resolve production risks related to resources such as node groups, CPU, memory, HPA scheduling, JVM pre-warming, etc.
- Automation: Write and maintain scripts using Python, Go, or Bash.
- KPIs: Define and maintain SLAs, SLOs, and SLIs for cloud microservices together with development teams.
- Technical Documentation: Create and maintain architecture diagrams, design documents, and SOPs.
- Security & Compliance: Ensure adherence to standards such as ISO 27001, SOC 2, and GDPR.
- Incident Response: Lead efforts to troubleshoot and resolve production issues.
- Post-Incident Analysis: Identify root causes and potential workarounds/solutions.
- POCs: Assist with product/technology selection and implementation.
- Mentorship: Mentor and train junior team members.
- On-Call Support: Provide support after hours and on weekends as part of the rotation.
- Other Duties: As assigned.
Requirements:
- Education: Bachelor’s degree in Computer Science, Information Technology, or related field.
- Experience: 5 years as a Site Reliability Engineer.
- Skills:
  - Proficiency in Java, Python, Bash, or PowerShell.
  - Hands-on experience in SRE, DevOps, cloud operations, and security best practices.
  - Strong knowledge of security technologies (Identity and Access Management, Network Security, Application Security, Data Protection).
  - Strong problem-solving and analytical skills.
  - Experience in technical documentation and compliance implementation.
- Work Authorization: Candidates requiring a visa are not being accepted at this time.
Additional Skills (Preferred):
- Cloud Certifications: AWS Solutions Architect Professional, Azure Solutions Architect Expert, GCP Professional Cloud Architect.
- Container Orchestration: Experience with technologies like Kubernetes.
Compensation & Benefits:
- Salary: $130,000-$180,000 base plus bonus.
- Benefits: Comprehensive package including 100% employer-paid health coverage, generous PTO, and a 401(k) with company match.
Ready to Transform the Tech Landscape?
Apply now and be part of a leading company with endless opportunities for growth and innovation.