What are the responsibilities and job description for the DevOps Engineer position at ChabezTech LLC?
Job Title: Site Reliability Engineer (SRE) / DevOps Lead
Location: Remote (Client Location: New Jersey, USA)
Employment Type: Contract (Long-term)
About the Role:
We are seeking a highly skilled and experienced SRE/DevOps Lead for a long-term contract position to support our client based in New Jersey. In this role, you will design, implement, and maintain scalable, secure, and high-performing infrastructure and deployment pipelines. You will collaborate with cross-functional teams to ensure system reliability and operational excellence.
This is a remote role, but candidates must be able to work hours aligned with EST.
Key Responsibilities:
- Design, implement, and manage highly available and scalable cloud-based infrastructure (e.g., AWS, Azure, or GCP).
- Develop, maintain, and enhance CI/CD pipelines to support automated deployments and seamless software delivery.
- Monitor and improve system performance, reliability, and security metrics while ensuring proactive incident management.
- Troubleshoot and resolve complex infrastructure and application issues in a timely manner.
- Implement best practices for Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or Ansible.
- Lead efforts to adopt DevOps practices across teams, fostering a culture of collaboration and automation.
- Ensure compliance with industry standards and security policies.
- Document systems, processes, and troubleshooting guidelines for internal use.
Required Skills and Qualifications:
- Experience:
  - 7 years of experience in DevOps/SRE roles, with at least 2 years in a leadership or senior capacity.
  - Proven track record of designing and managing infrastructure for large-scale, mission-critical systems.
- Technical Expertise:
  - Strong hands-on experience with cloud platforms like AWS, Azure, or GCP.
  - Expertise in CI/CD tools such as Jenkins, GitLab CI/CD, CircleCI, or equivalent.
  - Proficiency in scripting and automation using Python, Bash, or similar languages.
  - Deep understanding of containerization and orchestration tools like Docker and Kubernetes.
  - Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or Datadog.
- Leadership:
  - Proven ability to lead technical teams and mentor junior engineers.
  - Strong problem-solving skills with a focus on continuous improvement.
- Other Requirements:
  - Excellent communication and collaboration skills.
  - Flexibility to work hours aligned with EST.
Preferred Skills:
- Certification in AWS, Azure, or Google Cloud.
- Experience with serverless architectures and microservices.
- Familiarity with compliance frameworks such as SOC 2, ISO 27001, or HIPAA.