What are the responsibilities and job description for the Site Reliability Engineer Lead position at Optomi?
Site Reliability Engineer Lead (AWS)
Optomi, in partnership with a leading consulting firm, is seeking an experienced Site Reliability Engineer Lead to join their Bethesda, Maryland office! This role is ideal for professionals with a strong background in AWS infrastructure, automation, and DevOps practices. The SRE Lead will play a crucial role in designing, building, and configuring AWS infrastructure components such as EC2 instances, VPCs, load balancers, and databases to meet business and application requirements. Additionally, the ideal candidate will ensure the infrastructure is secure, scalable, and highly available, while also implementing and maintaining robust monitoring and alerting systems. The SRE Lead will collaborate closely with engineering, development, and operations teams to ensure smooth deployments and ongoing operations.
What the right candidate will enjoy!
- The opportunity to work on highly visible projects within a multimillion-dollar company!
- Highly competitive pay and growth opportunities!
- Company culture based on integrity, respect, accountability and excellence!
Experience of the right candidate:
- AWS Expertise: Strong understanding of AWS services (e.g., EC2, VPC, S3, RDS, Lambda).
- Experience with AWS infrastructure automation tools (Terraform).
- Strong analytical and problem-solving skills.
- Ability to identify root causes of issues and implement effective solutions.
- Excellent communication and collaboration skills.
- Ability to work effectively in a team environment.
- Experience with Linux/Unix systems administration.
- Proficiency in scripting languages (e.g., Python, Bash).
- Experience with monitoring and alerting tools (Dynatrace and CloudWatch).
- Experience with CI/CD tools (Jenkins and Harness).
Responsibilities of the right candidate:
- Design, build, and configure AWS infrastructure components (e.g., EC2 instances, VPCs, load balancers, databases) to meet business and application requirements.
- Develop and implement automation solutions using tools like Terraform or Ansible for infrastructure provisioning and management.
- Act as the liaison between the offshore teams and the client.
- Ensure infrastructure is secure, scalable, and highly available on AWS.
- Implement and maintain robust monitoring and alerting systems for AWS infrastructure and applications.
- Analyze performance metrics and identify potential issues proactively.
- Troubleshoot and resolve infrastructure issues, including application performance problems, CI/CD failures, and outages.
- Participate in incident response and recovery efforts.
- Identify and implement solutions to improve infrastructure efficiency and reduce cost.
- Ensure the security of AWS infrastructure and applications by implementing security best practices and tools.
- Collaborate with other engineering teams, developers, and operations teams to ensure smooth deployments and operations.
- Participate in code reviews and knowledge sharing.
- Perform infrastructure cost analysis and optimization.
- Develop CI/CD solutions for build and deploy management.
- Maintain clear and up-to-date documentation of infrastructure/DevOps configurations and procedures.
Salary: $55 - $65