What are the responsibilities and job description for the Senior Site Reliability Engineer position at The Intersect Group?
To create and maintain the next generation of application infrastructure and to be responsible for reliability, automation and scalability using the latest best practices.
Essential Functions
- Implement software and tools to improve performance (availability, scalability, and latency) while delivering end products to customers with the highest efficiency and meeting all security standards.
- Build automation and tooling around application management, such as deployments, configuration changes and disaster recovery scenarios.
- Implement and evangelize observability and monitoring systems to proactively detect problems and identify their causes.
- Evaluate capacity of the application on a continuous basis to provide stats to the Product/Business teams and recommend an efficient path to scale for future needs.
- Identify performance bottlenecks and work with cross-functional teams to troubleshoot and resolve issues.
- Implement standards across multiple disciplines, systems and practices to improve the overall application delivery.
- Work directly with application development teams to provide feedback and technical requirements to the software development lifecycle, implementing best-practice microservice design patterns and other modern software development approaches.
- Serve as a technical liaison for the application and provide documents and runbooks to Level 1 and Level 2 teams.
- Participate in a 24x7 on-call rotation.
- Be a champion of excellent processes; take the initiative in developing repeatable patterns and standard, re-usable work across teams.
- Support the company's commitment to protect the integrity and confidentiality of systems and data.
Minimum Qualifications
- Education and experience typically obtained through completion of a Bachelor’s Degree in Business and/or Computer Science or a related field.
- 3 years of related experience managing large, complex projects in a technical or software development environment, inclusive of a post-graduate degree.
- Demonstrated experience in effective Incident and Problem Management
- Proven related work experience in a medium to large scale enterprise.
- Strong understanding of scripting languages
- Hands-on experience implementing and using modern observability solutions.
- Linux systems administration
- Good knowledge of Git
- Experienced with security and encryption protocols.
- Comfortable with facilitating collaboration, open communication and reaching across functional borders.
- Excellent oral and written communication and people skills.
- High level of customer responsiveness, excellent documentation and communication skills and attention to detail.
- Ability to pass a background check and drug screen.
Preferred Qualifications
- Good programming skills in one or more of the following languages: Java, Ruby, Python, JavaScript and Go.
- Hands-on experience supporting applications in a 24x7 customer-facing production environment.
- Strong knowledge of CI/CD workflows
- Strong understanding of and hands-on experience with the TCP, UDP and IP protocols.
- Working knowledge of AWS, Docker, Kubernetes, Swarm
Salary: $170,000 - $180,000