Site Reliability Engineer (SRE) - Secret Clearance Required
Job Locations
US-VA-Tysons
Job ID
2025-2420
Type
Full-Time
Overview
Paradyme Management is a rapidly growing government technology leader that puts service first, for its customers, its team and the communities it supports. Paradyme harnesses DevSecOps and Agile development processes to deliver exceptional results for digital transformations. Headquartered in Tysons Corner, VA, Paradyme sets itself apart with an award-winning culture built on its team's deep commitment to service and collaboration with its customers, each other and the community. Learn more at www.paradymemanagement.com.
Responsibilities
Paradyme has partnered with an industry leader in enterprise Artificial Intelligence software and is seeking a highly skilled Site Reliability Engineer (SRE) to join our team to manage, monitor, and optimize our C3 clusters on Kubernetes. Together we're accelerating our client's digital transformation by building and deploying data-driven, scalable AI solutions. The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure.
- Monitor and Manage Kubernetes Clusters : Ensure the stability, health, and scalability of Kubernetes clusters, and deploy applications and services on Kubernetes.
- Kubernetes Management : Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance.
- Cloud Infrastructure Management : Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.).
- Monitoring & Incident Response : Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins, C3, or Kubernetes clusters.
- Automate Infrastructure Processes : Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent.
- Collaborate Across Teams : Work closely with development, services, and operations teams to ensure seamless integration between application development and infrastructure.
- Security & Compliance : Ensure all systems follow security best practices and comply with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning.
Requirements
Physical Requirements : These are the essential physical requirements needed to successfully perform the job.
Requires sitting up to 8 hours per day.
Paradyme Management, Inc. is committed to the full inclusion of all qualified individuals. In keeping with our commitment, Paradyme will take steps to ensure that people with disabilities are provided reasonable accommodations. Accordingly, if a reasonable accommodation is required to fully participate in the job application or interview process, to perform the essential functions of the position, and/or to receive all other benefits and privileges of employment, please contact Rose Luczak, Director of People Operations, at rose.luczak@paradyme.us or at (571) 289-0548.
EEO Statement
Paradyme is a federal contractor and an EEO and an Affirmative Action Employer. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, pregnancy-related disability, physical or mental disability, genetic information, sexual orientation, marital status, familial status, personal appearance, occupation, citizenship, veteran or military status, gender identity or expression, or any other characteristic protected by federal, state or local law.