What are the responsibilities and job description for the Principal, Site Reliability Engineer position at Sysco LABS?

Job Summary

Drives impactful changes across the platform and holds sustained leadership roles. Responsible for the design and future direction of highly available, performant web and mobile applications, resilient and scalable systems, and metrics and monitoring. Defines best practices and collaborates with development, product, architecture, and leadership to mentor teams on reliability across the platform. Thinks ahead and acts to prevent issues before they occur through automation and careful analysis. Brings critical thinking and debugging skills for highly complex environments, including network packet analysis, Kubernetes, NGINX, streaming (Kafka), edge networks, caching, and general application-layer troubleshooting. Fully accountable for overall system reliability and performance.

Duties And Responsibilities

Develop and refine strategy and process for all reliability tracking across the platform in conjunction with senior members of the team.
Lead strategic discussions to continue the evolution of flexibility and sustainability of the entire product suite.
Partner with support teams, DevOps, Engineering, and customers to inform decisions and implement improvements.
Ensure that RCA findings related to reliability are addressed at the point of initial injection to prevent regression.
Look broadly across the platform for latent reliability issues and address them before they surface.
Provide the orchestration for the production environment by monitoring availability and taking a holistic view of system health.
Architect the software and systems to manage platform infrastructure and applications.
Document tribal knowledge and best practices, and review them annually.
Define the objectives for system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
Gather and analyze metrics for trending, performance tuning, and fault finding.
Partner with development teams to improve services through rigorous testing and release procedures.
Provide leadership for system design, platform management, and capacity planning.
Balance feature development speed and reliability with well-defined service level objectives.
Actively maintain a thorough understanding of system architecture, applications, and related integrations. Partner with the Platform team to understand and improve system monitoring and alerting.
Drive active-active multisite reliability targets.
Drive performance and reliability in a multi-cloud environment.
Implement enterprise-level procedures and processes.
Hands-on experience with the top cloud providers.

Education Required

Bachelor’s degree in computer science, computer engineering or related field, or relevant training.

Education Preferred

Or equivalent combination of experience and education.

Experience Required

8 years’ experience in a Site Reliability role.

8 years’ experience with enterprise cloud platforms.

Availability to work extended or off-cycle hours and participate in a 24/7 Site Reliability on-call rotation.

Experience Preferred

8 years’ experience in a cloud operations / DevOps role.

Experience with AWS.

Experience with APM tools such as Datadog, New Relic, Nagios, or Splunk.

Experience in an agile development environment.

Physical Demands

Reasonable accommodations will be made to enable individuals with disabilities to perform the essential functions of this job.
