What are the responsibilities and job description for the Lead Site Reliability Engineer position at Motion Recruitment?
Our client, a leading Marketing Technology company, is seeking a Lead Site Reliability Engineer (SRE) to join their team. This role is a hybrid position based in Boston with a requirement of two days onsite per week. The ideal candidate will have extensive experience in designing, deploying, and managing scalable cloud infrastructure while improving system reliability, performance, and efficiency.
Posted By: Matthew Durkin
Key Responsibilities
- Lead and mentor a team of Site Reliability Engineers to ensure high availability and performance.
- Design, implement, and maintain cloud infrastructure on AWS.
- Develop and manage Infrastructure as Code (IaC) using Terraform, Helm, and Kustomize.
- Deploy, monitor, and optimize containerized applications using Kubernetes, KOps, and Istio.
- Automate and streamline operational processes with scripting and programming in Go and Python.
- Improve system observability and incident response using monitoring, logging, and alerting tools.
- Collaborate with cross-functional teams to drive best practices for reliability, security, and scalability.
- Troubleshoot and resolve complex infrastructure and application issues.
Qualifications
- 8 years of experience in a Site Reliability Engineering or DevOps role.
- Strong expertise in Kubernetes and associated tools (KOps, Istio, Helm, Kustomize).
- Proficiency in cloud platforms, particularly AWS.
- Experience with Go and Python for automation and infrastructure management.
- Hands-on experience with Terraform for infrastructure provisioning.
- Deep understanding of monitoring, logging, and alerting best practices.
- Strong problem-solving and troubleshooting skills in distributed systems.
- Ability to thrive in a hybrid work environment with 2 days onsite in Boston.
Preferred Qualifications
- Experience in the Marketing Technology industry.
- Familiarity with service mesh technologies like Istio.
- Expertise in CI/CD pipelines and deployment automation.
- Strong understanding of networking, security, and cloud cost optimization.
Benefits
- Competitive compensation and benefits package.
- Hybrid Work Model – collaborate in person and remotely.
- Exciting Challenges – work on cutting-edge cloud and container technologies.
- Career Growth – opportunity to lead and shape the SRE function.