What are the responsibilities and job description for the Senior Director of Site Reliability Engineering position at Oscar Technology?
Job Details
I'm currently partnering with a client that is a leader in the e-commerce sector and focuses on helping businesses run effectively and cost-efficiently. They are seeking a Senior Director of Site Reliability Engineering. In this role, you will be responsible for shaping the platform reliability vision and driving continuous improvement through development and operations initiatives. You'll collaborate with teams across the company to ensure the platform remains reliable, scalable, and high-performing.
U.S. Citizen/ Holder ONLY. Must not require sponsorship or have a C2C employer.
Details:
- $230,000 Bonus
- Permanent Role
- 100% Remote
Key Responsibilities:
- Lead DevOps Transformation: Architect and drive the evolution of a cutting-edge DevOps and site reliability practice to support growth and profitability.
- Risk Management & Security: Implement robust security protocols, automate compliance monitoring, and establish DevSecOps practices across the organization.
- Process Optimization & Documentation: Define and enhance DevOps processes, CI/CD pipelines, and infrastructure-as-code practices while creating comprehensive documentation and centers of excellence.
- Strategic Planning & Execution: Develop and execute a DevOps roadmap aligned with business goals, managing project scopes, deliverables, and success metrics.
- Infrastructure Architecture & Platform Engineering: Design scalable, resilient infrastructure architectures and empower development teams with platform engineering capabilities.
- Production Environment Management: Oversee operations across production environments, ensuring reliability, performance, and cost efficiency.
Mandatory Requirements:
- Leadership Experience: Proven ability to build and lead high-performing DevOps and platform engineering teams, fostering a culture of innovation and collaboration.
- Technical Mastery: Expertise in DevOps practices, CI/CD pipelines, Kubernetes, infrastructure-as-code, observability tools, and cloud/datacenter technologies.
- Communication & Stakeholder Management: Skilled at translating complex technical concepts to diverse audiences and managing cross-functional dependencies.
- Hybrid Infrastructure Experience: Experience managing both datacenter and AWS cloud environments, with a focus on consistent operational practices.
- Operational Excellence: Strong background in 24x7 operational models, on-call rotations, incident management, and SRE practices with a focus on automation.
- Release Engineering Expertise: Experience building automated, reliable deployment pipelines and implementing GitOps and progressive delivery techniques.
- Strategic Vision: Ability to align DevOps strategy with business goals, translating vision into actionable roadmaps.
- Monitoring & Observability Leadership: Expertise in modern monitoring, distributed tracing, and implementing observability as a core practice.
- Team Development: Proven success in recruiting, mentoring, and developing DevOps and SRE teams with a focus on continuous learning.
- Customer-Centric Approach: Experience establishing SLAs/SLOs and building dashboards to ensure system health and customer satisfaction.
- Project Management: Strong skills in planning, prioritizing, and executing complex technical projects on time and within budget.
Oscar Associates Limited (US) is acting as an Employment Agency in relation to this vacancy.