What are the responsibilities and job description for the Site Reliability Engineer position at Vuesol Technologies Inc?
Site Reliability Engineer (SRE)
Required Skills:
· 12 years of overall experience
· Hands-On proficiency (3-4 yrs) in at least one high-level language: Java (required), Node.js, Kotlin, Python, Go
· Hands-On experience with automated testing tools (JMeter, JUnit, Mockito, Postman)
· Hands-On experience with a source code management system like Git or SVN, including pull, push, branch, commit and merge functions
· Hands-On experience creating, configuring and maintaining cloud-based applications and infrastructure for the rapid development and monitoring of applications and services (AWS: EC2, Fargate, CloudFormation, RDS, ElastiCache, S3)
· Experience with cloud migrations, with reliability and availability as the core focus
· Experience implementing SRE at the team or enterprise level, including hands-on implementation of SRE practices and improvement of the associated metrics
· Hands-On experience with monitoring tools (Splunk, Dynatrace, NOI), including development and customization of dashboards
· Hands-On experience with the build, deploy, and packaging process and best practices. Familiarity with DevOps automation tools (UCD, Jenkins, Maven, SonarQube, Chef, Ansible, Puppet)
· Scripting skills for automation (Bash on Linux, and Windows scripting)
· Experience with network implementations
· Hands-On experience developing and implementing SRE practices as part of microservices delivery to the cloud
General Required Skills:
· Ability to diagnose and optimize software code for reliability and resiliency
· Knowledge of the incident management process and reporting tools (ServiceNow, Jira Service Desk)
· Good communication and documentation skills. An SRE must document their work, collect and document "tribal knowledge" (the good stuff in people's heads), and make it accessible to others.
· Good knowledge of building frameworks and guiding teams to increase adoption of SRE practices
· Experience triaging incidents and conducting root cause analyses (RCAs)