What are the responsibilities and job description for the Digital Site Reliability Engineer position at United Software Group, Inc.?
Job Description :
We are seeking a highly skilled and experienced Site Reliability Engineer to join our team. The ideal candidate must have a strong background in technology, with specific expertise in Kubernetes, GitLab, Dynatrace, GraphQL, Node, and React, along with a good understanding of CI / CD pipelines. The candidate must be comfortable with ambiguity, eager to learn new things, and persistent, in the spirit of "if at first I don't succeed, try and try again."
Responsibilities :
Collaborate with cross-functional teams to develop and maintain release architectures and monitoring frameworks.
Provide system design consulting and critical support to the development team prior to program launch.
Identify and solve sophisticated performance and scaling issues, working with engineers to avoid bottlenecks and meet traffic demands.
Mentor and guide team members, helping them grow in their roles.
Identify and implement automation and monitoring tools to improve the efficiency and effectiveness of SRE processes.
Take ownership of any critical incidents and work towards timely resolution and prevention of future occurrences.
Mandatory Requirements :
Five (5) to seven (7) years of professional experience in technology or a related field.
Two (2) years of experience with Kubernetes / EKS.
Two (2) years of experience with CI / CD pipelines.
Two (2) years of experience with a sophisticated observability platform including RUM and APM.
Good To Have Requirements :
Familiarity with reading and understanding JavaScript (Node.js).
Proficiency with Dynatrace APM and RUM (experience with other APM or RUM tools may be applicable). Dynatrace Associate Certification is a plus.
Intermediate to advanced skills in Bash shell scripting, Python, and Docker.
Intermediate skills in creating, configuring, and troubleshooting on-prem GitLab CI pipelines.
Preferred Qualifications :
Ability to solve sophisticated performance and scaling issues, working with engineers to avoid bottlenecks and meet traffic demands driven by organic growth and marketing events.
Strong problem-solving skills and the ability to work in a fast-paced environment.
Ability to communicate effectively with stakeholders, including management, to provide updates, recommendations, and solutions for any SRE-related issues.
Excellent communication and collaboration skills.
Experience with Kubernetes / EKS and pod life cycle management including readiness and liveness checks.
Experience with building and supporting CI / CD pipelines and production releases.
Working knowledge of complex CDN cached website architecture.