Senior Systems Reliability Engineer

CHESS Solutions, LLC
Herndon, VA Full Time
POSTED ON 3/3/2025
AVAILABLE BEFORE 6/3/2025

WHO WE ARE

Chess Solutions, LLC is a Virginia-based government contracting company that builds digital vetting tools. We work collaboratively with a number of clients to develop tools to facilitate client goals and mission objectives. In conjunction with our parent company, Presage Technologies, we integrate state-of-the-art vision-based physiological analysis tools and digital media forensic tools into robust software platforms. Our goal is to provide our clients with the most accurate digital vetting and analysis tools possible in an easy-to-use, modern software experience.

WHAT YOU’LL DO

As a Senior Reliability Engineer specializing in AWS Cloud technologies, you will work closely with our cross-functional engineering team to design, build, and maintain the cloud infrastructure, automation, and backend services that power our products.

This role requires technical versatility: you will work across a range of architectures, including serverless, containerized, and VM-based environments. You’ll also play a key role in ensuring the scalability, security, and performance of our systems while improving developer workflows and automation. You will be responsible for building DevOps and MLOps infrastructure to support machine learning and software development, and for providing general network and systems administration support to users.

You are a strong collaborator and communicator. You are able to plan and estimate your time, are self-directed in development, and communicate dependencies well in advance. You are able to build rapport and trust with customers, translate customer requirements into roadmap items, and develop consensus on prioritization across a wide set of customer constituencies. You are a bug hunter who assumes by default that system issues lie in your part of the stack, and you expect others to operate similarly. You understand the value of unit tests, CI/CD pipelines, and establishing quality assurance metrics and processes.

You write excellent documentation at all phases of a project. You plan well to communicate intent and design, welcoming others to provide feedback and input into your project planning. You understand there are multiple levels of documentation to produce, including for internal development, external integrators, system security plans and compliance, and end-users.

This is a hybrid role, with an expectation of working in-office or collaborating with customers and teammates in Northern Virginia (Herndon/Leesburg) or St. Paul, Minnesota.

KEY RESPONSIBILITIES

DevOps & Infrastructure

  • CI/CD Automation: Design and maintain pipelines for mobile applications, backend services, and machine learning workflows to ensure fast, reliable deployments.
  • Infrastructure as Code (IaC): Implement and manage AWS infrastructure using CloudFormation, Terraform, Helm, and Docker Compose.
  • Optimization: Continuously monitor and optimize performance, cost, and reliability of applications.
  • Observability: Deploy monitoring, logging, and alerting solutions to track system performance and detect anomalies.
  • Developer Experience: Enhance development workflows by improving CI/CD and infrastructure automation.
  • Troubleshooting: Investigate and resolve infrastructure and backend issues to ensure smooth deployments.

Cloud Architecture & Security

  • AWS System Design: Architect, develop, and manage applications using AWS ECS, AWS EKS, Lambda, and related services.
  • Security & Compliance: Implement AWS security best practices, IAM policies, and compliance controls.
  • Networking: Manage networking and VPN solutions such as Tailscale for secure access.
  • Documentation: Create clear and comprehensive technical documentation to support system maintenance and knowledge sharing.

Machine Learning Operations (MLOps)

  • Model Deployment: Create our workflow for the deployment, monitoring, and maintenance of ML models in production.
  • Scalability: Ensure seamless model integration, efficient resource utilization, and high availability.

Backend Development

  • API Development: Design and build backend services and APIs using Python, Go, or Node.js.
  • Database Management: Work with NoSQL databases such as Amazon DynamoDB and MongoDB.
  • Code Quality: Write clean, maintainable, and efficient code, conducting code reviews to uphold quality standards.

WHO YOU ARE

You are a self-driven, adaptable engineer who thrives in a fast-paced startup environment. You enjoy solving complex problems, collaborating with a team, and continuously learning new technologies.

To succeed in this role, you should have:

  • 7 years of professional experience working with AWS cloud technologies.
  • Strong experience with Infrastructure as Code (IaC), particularly AWS CloudFormation.
  • Proficiency in Python (or an equivalent programming language).
  • Experience delivering and maintaining highly available distributed systems.
  • Proficiency in GitLab system administration and CI/CD pipeline management.
  • Experience developing serverless applications using AWS SAM.
  • Familiarity with NoSQL databases such as Amazon DynamoDB and MongoDB.
  • Strong understanding of AWS best practices, security, cost optimization, and performance tuning.
  • Excellent problem-solving, debugging, and troubleshooting skills.
  • Strong teamwork and communication skills, with the ability to work in a collaborative, remote-first environment.
  • Eligibility for a U.S. Top Secret Clearance is required. Candidates with an existing, current clearance will be given preference. The U.S. Government prohibits non-U.S. citizens from obtaining Top Secret Clearances.

If you're driven by the challenge of tackling deepfake threats and passionate about pushing the boundaries of technology, we encourage you to apply for the Senior Systems Reliability Engineer role at Chess Solutions, LLC. Join our innovative and collaborative team, where you'll work on real-world problems, grow alongside experts in media forensics, and contribute to a mission with global impact.

INTERVIEW PROCESS

  • Phase 1: Submit a Resume
  • Phase 2: Downselection for Introductory Interview with Operations Leadership with Q&A
  • Phase 3: Downselection for Senior Systems Reliability Engineer Background Interview
  • Phase 4: Downselection for Senior Systems Reliability Engineer Performance Task
  • Phase 5: Negotiations and Offer
  • Timeline: The entire process can occur in less than two weeks for the right candidate.
