What are the responsibilities and job description for the Site Reliability Engineering Manager position at Medical Mutual?
As a Site Reliability Engineering (SRE) Manager, you will lead a team of talented engineers responsible for ensuring the reliability, scalability, and performance of our systems and services. You will collaborate closely with software development, operations, and product teams to design and implement robust solutions that enhance system reliability and efficiency. Your role will involve managing incident response, driving continuous improvement initiatives, and fostering a culture of reliability and automation within the organization.
- Lead and mentor a team of SREs, providing guidance and support for their professional development.
- Oversee the design, implementation, and maintenance of systems to ensure high availability and performance.
- Collaborate with cross-functional teams to identify and resolve reliability issues.
- Develop and implement strategies for incident management and disaster recovery.
- Drive automation efforts to reduce manual intervention and improve system reliability.
- Monitor system performance and implement improvements based on data-driven insights.
- Foster a culture of continuous improvement and proactive problem-solving.
Responsibilities
- Manages employees who provide technical support for the named areas, along with the associated budgets.
- Supervises hardware forecasting and capacity projection so that technology is acquired in time to maintain proper availability and performance.
- Interfaces with user and applications areas and provides technical guidance when required.
- Provides input into the development of short- and long-range strategic plans.
- Maintains a high degree of technical expertise to ensure that best practices are utilized in rendering technical support services.
- Acts as technical consultant regarding IT processing activities and assists in the resolution of all technical systems, hardware, and software problems to enhance operations.
Qualifications
Education and Experience
- Bachelor's degree in Computer Science or a related field, or equivalent pertinent work experience.
- Eight (8) or more years of experience in a technical support area.
- Three (3) or more years of supervisory experience with proven administrative responsibility.
Professional Certification(s)
- Microsoft Certified Systems Engineer
- VMware Certified Professional
Technical Skills and Knowledge
- Experience with the following: Dell Converged Infrastructure, VMware, EMC, Citrix VDI, Cisco UCS, Dell vBlock, Isilon, Commvault, Disaster Recovery
- Demonstrated ability to manage highly technical people and experience with the listed architectures.
- Functional understanding of project management principles and their application to infrastructure projects and teams.