What are the responsibilities and job description for the Staff DevOps Engineer - Core Services position at Jewelers Mutual Group?
Why Jewelers Mutual
Jewelers Mutual is the leading specialty insurance company dedicated to serving the jewelry industry. We've been protecting jewelry and jewelry businesses for over 100 years, and today, we're trusted by thousands of jewelers, manufacturers, and wholesalers, as well as millions of customers, to protect their most valuable jewelry assets.
Summary
As the Senior DevOps Engineer, you will take ownership of optimizing and scaling our serverless platform's infrastructure, CI/CD pipelines, and monitoring capabilities. This is a technical leadership role where you will drive operational excellence, enhance developer productivity, and ensure the resilience, scalability, and security of our microservices platform in a cloud-native environment, particularly AWS. You will collaborate with engineering teams to design, implement, and maintain solutions that support multiple business units, fostering a DevOps-first culture while ensuring seamless integration across our products.
What You'll Do
- Design and Scale CI/CD Pipelines
- Design, implement, and refine CI/CD pipelines to enable automated testing and seamless continuous delivery across multiple environments, leveraging tools such as GitHub Actions, GitHub Advanced Security, or equivalent solutions that align with a "Dev, Test, Prod" workflow.
- Automate and enhance CI/CD pipelines to support microservices deployment with minimal downtime, ensuring seamless integration, rollback capabilities, and rapid iteration cycles.
- Infrastructure as Code and Cloud Management
- Architect and maintain AWS-based infrastructure using Terraform to ensure security, scalability, and reliability across compute, networking, and data services. Manage core AWS services such as Lambda, API Gateway, Step Functions, VPC, S3, and Aurora Serverless (Postgres).
- Design and implement a robust event-driven communication architecture that enables seamless, scalable, and decoupled interactions across services. Leverage AWS services such as EventBridge, CloudWatch, SNS, and SQS to orchestrate real-time event processing, ensuring high availability, fault tolerance, and responsiveness across the platform.
- Observability and Monitoring
- Design and implement a comprehensive observability strategy to ensure real-time visibility into system performance, reliability, and security. Leverage tools like Datadog, AWS CloudTrail, and AWS X-Ray to monitor latency, trace requests across microservices, and detect anomalies before they impact customers.
- Implement proactive monitoring and alerting to track key performance metrics, error rates, and system health, enabling rapid incident detection and response.
- Set up real-time dashboards that provide actionable insights into infrastructure, application performance, and event-driven workflows.
- Establish telemetry and distributed tracing to improve debugging, root cause analysis, and overall system resilience in a highly dynamic, serverless environment.
- Resiliency and Security
- Design and implement resiliency testing strategies, including chaos engineering practices, to validate fault tolerance and high availability. Lead game days using tools like Gremlin, Chaos Monkey, or AWS Fault Injection Simulator to simulate failures and improve incident response preparedness.
- Ensure robust security practices across AWS resources, including IAM role-based access control, least privilege enforcement, secrets management, and automated compliance checks. Implement best practices for API security, encryption, and AWS-native security services.
- Developer Experience and Automation
- Collaborate with development teams to optimize workflows, automate repetitive tasks, and implement best practices for cloud management, automation, and continuous integration.
- Promote a DevOps-first culture across teams, mentoring others on cloud management, observability, and automation practices.
What We're Looking For
What We Offer
Great Place to Work® Certified: Join a team recognized for an environment of innovation and growth.