What are the responsibilities and job description for the Reliability Engineer position at Ardent Services?
Why do you need to choose between doing important work and having a fulfilling life? At Ardent, we have both. Ardent employees are committed to solving our customers' most difficult problems, and we are committed to the well-being, personal goals, and professional development of our employees. We are "All In." We put forth our strongest possible effort to accomplish the mission, and we do it together. We respect the skills and experience you bring to the Ardent team. And we provide a rewarding environment to help you succeed.
We offer highly competitive benefits, professional development opportunities, and an exceptional culture that embraces flexibility, innovation, collaboration, and career growth. A collective service mindset underpins our work, and a shared camaraderie to serve clients, colleagues, and our communities sets us apart. Our full commitment to being "All In" for our employees and our clients is not just our approach; it is our standard. If this sounds like the perfect fit for you, choose Ardent and make a difference with us.
Ardent is seeking a Reliability Engineer to join our team.
This is an onsite role in Ashburn, VA.
Position Description:
We are seeking a skilled Reliability Engineer to support our client's mission by enhancing Production Monitoring and ensuring optimal service delivery for their applications. This role involves proactive issue identification, incident resolution, and system health optimization within a 24x7x365 operational environment. The ideal candidate will lead monitoring solutions, manage ITIL engineers, automate processes, and collaborate across IT and business teams to improve service reliability. Expertise in AWS environments, root cause analysis, and technical troubleshooting is essential, along with strong communication and leadership skills to drive continuous improvement.
Requirements:
- Experience in Production Monitoring & Support within a 24x7x365 operational environment.
- Strong expertise in incident management, root cause analysis, and problem resolution for cloud-based applications.
- Hands-on experience with Amazon Web Services (AWS) and cloud-based monitoring tools.
- Proficiency in ITIL processes and managing ITIL engineers for efficient service delivery.
- Ability to build and implement monitoring solutions, automate manual processes, and create alerts to ensure system stability.
- Experience with system health monitoring, performance optimization, and troubleshooting production issues.
- Strong leadership skills to collaborate with IT, business, and infrastructure teams to improve production support processes.
- Effective communication skills to provide incident reports and status updates to leadership and stakeholders.
- Ability to develop and maintain technical documentation and knowledge base resources for production support.
- Experience in triaging and resolving production incidents, assessing severity, and properly escalating issues.
Responsibilities and Duties:
Customer Facing:
Optimizes Work Processes:
Collaborates:
Communicates Effectively:
Active CBP / BI or Top Secret clearance is highly desired. Must be open to working 2nd or 3rd shift in a 24x7x365 environment.
Due to the nature of the work we support, all candidates in consideration for this role must be U.S. citizens willing to undergo the government-issued background investigation process.
Ardent is an equal opportunity employer. We will not discriminate and will take affirmative action measures to ensure against discrimination in employment, recruitment, advertisements for employment, compensation, termination, upgrading, promotions, and other conditions of employment against any employee or job applicant on the basis of race, color, gender, national origin, age, religion, creed, disability, veteran's status, sexual orientation, gender identity, or gender expression.