What are the responsibilities and job description for the DevOps Engineer position at RIT Solutions, Inc.?
DevOps Engineer
Englewood, Colorado - Remote
Job Description
We are seeking a Mid-Level DevOps Engineer with Site Reliability Engineering (SRE) experience to contribute to the transition of Crew Management Applications to a web-based SaaS model hosted on AWS. The successful candidate will work under the guidance of a Senior DevOps Engineer, supporting critical system reliability, automation, and monitoring tasks while actively contributing to the successful implementation of key deliverables.
Required Skills: DevOps, Site Reliability Engineering (SRE), Kubernetes, AWS EKS
Job Duties
- Support Key Deliverables: Assist in implementing metrics collection, developing dashboards, conducting reliability audits, and creating runbooks as outlined in the project goals.
- Collaboration: Work closely with the Senior DevOps Engineer, development teams, and support teams to ensure seamless operations and effective communication between stakeholders.
- CI/CD and Automation: Contribute to the development and optimization of CI/CD pipelines and automation scripts to support efficient and consistent deployments.
- Observability Implementation: Assist in configuring and maintaining monitoring solutions using OpenTelemetry and Grafana to enhance system visibility.
- Production Support: Participate in 24/7 Tier II production support on a rotational basis, addressing technical escalations and contributing to system stability.
- Documentation: Collaborate in the preparation of technical documentation, including runbooks, playbooks, and training materials for Tier I and II support teams.
- Dashboards and Metrics: Support the development of Grafana dashboards for monitoring services, including Kubernetes platform components and internally developed services.
- Issue Investigation: Assist in identifying and resolving issues reported from lower-tier support teams, ensuring timely resolution and minimizing customer impact.
- Game Day Scenarios: Participate in the execution of Game Day scenarios to prepare for potential system failures and improve operational readiness.
- Reliability Contributions: Work on tasks related to reliability audits, including submitting merge requests for simpler issues and escalating more complex problems to senior team members.
Job Requirements
- Experience: 3-5 years in DevOps, SRE, or related roles with a focus on cloud-hosted, microservices-based environments.
Desired Skills & Experience
- Exposure to GitOps practices and tools like GitLab.