What are the responsibilities and job description for the Principal Site Reliability Engineer position at Fidelity Investments?

Job Description :

Position Description :

Delivers services at high scale and high availability, with resilience, by using automation and Infrastructure as Code. Builds reliability into the ecosystem by applying best practices in Resiliency Engineering, Automation, Observability, and Chaos Testing. Manages systems using infrastructure as code tools (IAM, ARM, Terraform, and Chef). Utilizes modern monitoring tools (Datadog, Prometheus, and Splunk). Automates with scripting languages (Python and Shell). Helps teams scale through production insights, operational automation, developer guidance, and real-time metrics.

Primary Responsibilities :

Performs instrumentation, applying systems skills to build and operate monitoring, logging, and alerting services for distributed systems at scale.
Maintains scalability and resiliency in complex environments.
Implements advanced observability practices and techniques at scale.
Triages and executes root cause analysis.
Manages and interprets large datasets using query languages and visualization tools.
Communicates with both technical and non-technical audiences.
Presents new software, methods, and practices to developers.
Works with a variety of individuals and groups in a constructive and collaborative manner, and builds and maintains effective relationships.
Applies Cloud Computing and DevOps concepts, including continuous integration and continuous delivery (CI/CD) pipelines, in system and infrastructure maintenance.

Education and Experience :

Bachelor's degree (or foreign education equivalent) in Computer Science, Engineering, Information Technology, Information Systems, Mathematics, Physics, or a closely related field and five (5) years of experience as a Principal Site Reliability Engineer (or closely related occupation) designing, building, deploying, and maintaining infrastructure and applications in Cloud providers - Amazon Web Services (AWS) and Azure.

Or, alternatively, Master's degree (or foreign education equivalent) in Computer Science, Engineering, Information Technology, Information Systems, Mathematics, Physics, or a closely related field and three (3) years of experience as a Principal Site Reliability Engineer (or closely related occupation) designing, building, deploying, and maintaining infrastructure and applications in Cloud providers - Amazon Web Services (AWS) and Azure.

Skills and Knowledge :

Candidate must also possess :

Demonstrated Expertise ("DE") designing, building, and deploying the open-source Envoy Gateway Infrastructure as an Edge and Internal Gateway to a Kubernetes platform running on-premises and in public Cloud, ensuring robust and scalable solutions for diverse deployment scenarios.

DE implementing end-to-end CI/CD pipelines in automated testing and deployment processes, including building an efficient and reliable software delivery pipeline that incorporates industry-leading security best practices throughout the development and deployment lifecycle.

DE automating infrastructure provisioning and configuration through Terraform for diverse Cloud platforms, including AWS, AWS GovCloud, and OCI; and applying Infrastructure as Code (IaC) principles using tools (Ansible, Chef, and Puppet) to enhance efficiency and maintainability.

DE setting up comprehensive monitoring and logging solutions using observability tools (Datadog, Splunk, and ELK) to track system performance and proactively identify issues through log analysis and troubleshooting; implementing tracing mechanisms and providing a detailed view of API request flows for effective debugging, performance optimization, and API communication.

Category : Information Technology

Fidelity's hybrid working model blends the best of both onsite and offsite work experiences. Working onsite is important for our business strategy and our culture. We also value the benefits that working offsite offers associates. Most hybrid roles require associates to work onsite every other week (all business days, M-F) in a Fidelity office.
