What are the responsibilities and job description for the Director of Site Reliability - Remote position at Saige Partners LLC?
Senior Director of Site Reliability Engineering
Who We Are Looking For
The Senior Director of Site Reliability Engineering will be responsible for the reliability and stability of our platform. This newly created role will define and implement a platform reliability vision, driving continuous improvement through development, operations, and process excellence.
This position has full solid-line responsibility for operations of platform offerings, including deployment, management, monitoring, reporting, troubleshooting, and repair of production systems. It also has direct oversight of site reliability engineers and influence over software engineers in product development. Collaboration with product management and customer success teams will be key to ensuring all platform offerings meet or exceed business objectives and customer expectations.
Key Responsibilities
- Define and execute a state-of-the-art site reliability vision, improving data center and disaster recovery capabilities.
- Proactively assess and manage risks, identifying and mitigating potential issues in operations and releases.
- Develop and maintain process documentation for all applications, continuously improving tools and methodologies.
- Plan and execute high-quality, on-time, and on-budget projects aligned with SRE objectives.
- Lead system architecture improvements for enhanced performance and scalability.
- Oversee all components of production data centers and public cloud regions, managing customer-facing applications and websites.
Who You Are
You have…
- Proven Leadership Skills – Ability to manage teams and drive cross-functional initiatives.
- Technical Expertise – In-depth knowledge of servers, databases, operating systems, networks, monitoring tools, and cloud services.
- Strong Communication – Ability to convey technical concepts effectively to both technical and non-technical stakeholders.
- Conflict Resolution Skills – Experience handling operational challenges in fast-paced environments.
- E-Commerce & Cloud Experience – Hands-on experience in self-hosted or co-located data center environments, as well as cloud-based platforms.
- Mature Release Engineering Knowledge – Expertise in automated CI/CD pipelines for both on-premises and cloud environments.
- Monitoring & Management Frameworks – Deep understanding of observability best practices in cloud infrastructures.
- Team & Change Management – Experience leading SRE and development teams while executing structured change management in high-availability environments.
- Customer-Centric Mindset – Commitment to delivering outstanding customer experiences and satisfaction.
What We Offer
If you are looking for a high-growth, high-impact role where you can drive innovation in a marketplace business, we encourage you to apply.