What are the responsibilities and job description for the Site Reliability Engineer (SRE) position at Karwell Technologies Inc?
Job Details
#W2 Requirement
Job Title: Site Reliability Engineer (SRE)
Location: Atlanta, GA (local candidates only)
Job Responsibilities:
- Design, implement, and maintain scalable and reliable systems to ensure high availability and performance.
- Monitor system performance and troubleshoot issues to ensure optimal operation.
- Collaborate with development teams to integrate reliability into the software development lifecycle.
- Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.
- Develop and maintain documentation for system architecture, processes, and procedures.
- Participate in on-call rotations to provide support for production systems.
- Conduct post-mortem analyses of incidents to identify root causes and implement preventive measures.
- Stay current with industry trends and best practices in site reliability engineering.
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Site Reliability Engineer or in a similar role.
- Strong knowledge of cloud computing platforms (e.g., AWS, Azure, Google Cloud).
- Proficiency in scripting and programming languages (e.g., Python, Go, Bash).
- Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Excellent problem-solving skills and the ability to work under pressure.
- Strong communication and collaboration skills to work effectively with cross-functional teams.