What are the responsibilities and job description for the E01 Site Reliability Engineer II position at EXPANSIA?
Start Date: Immediate
EXPANSIA is a service-disabled, veteran-owned small business (SDVOSB) delivering exceptional strategy and technology integration services to the U.S. Federal Government. We support several Department of Defense (DoD) and Federal Agencies across the CONUS.
OVERVIEW
Full-time/Permanent Employee
Location: Dayton, OH
As a Site Reliability Engineer II, you will work under general supervision to ensure the reliability and performance of systems and applications by addressing technical challenges of moderate scope and complexity. You will leverage your expertise in software development, database management, and container orchestration to evaluate system components, predict failures, and recommend enhancements. Collaborating with cross-functional teams, you will play a key role in improving the reliability of products and processes while supporting the organization's technological infrastructure.
The proposed salary range for this position is $79,000-$146,000. Many factors can influence final salary, including, but not limited to, Federal Government contract labor categories and contract wage rates, relevant prior work experience, specific skills and competencies, geographic location, education, and certifications. Our employees value the flexibility EXPANSIA allows them to balance quality work and their personal lives. We offer competitive compensation, benefits, and learning and development opportunities. Our unique mix of benefits options is designed to support and protect employees and their families. Employment benefits include health and wellness programs, income protection, paid leave, and retirement and savings.
- Evaluate and analyze products, components, materials, and equipment to predict and address potential failures
- Review product designs, material specifications, and manufacturing capabilities to ensure reliability and dependability
- Create prototypes and conduct product tests to gather and interpret reliability data
- Recommend product design changes or alterations in manufacturing processes to achieve required reliability levels
- Monitor production system diagnostics and maintenance records to predict and prevent downtime
- Document findings, including results of root cause analysis, and implement necessary changes to maintain product and equipment reliability
- Work with engineering and development teams to design and implement reliability improvements
- Determine maintenance requirements and schedules for products and equipment
- Monitor failure data generated by customers and propose product improvements
- Review subcontractor reliability programs and provide evaluations for decision-making
- Conduct Continuous Integration/Continuous Delivery (CI/CD) pipeline testing and optimization to ensure robust software delivery
- Develop and manage container orchestration platforms such as Kubernetes (K8s) to enhance system scalability and reliability
- Ensure 100% of planned hours are worked and recorded
- Identify and forward to your leadership any opportunities that could lead to growth within your work area
- Ensure all contractual deliverables are met/exceeded to the customer's satisfaction
- Complete personal PDP and attend Staff Meeting and Storytime (with camera on)
- Within your program, build productive and positive professional relationships with clients
- Perform other related duties as assigned
- Clearance: Active TS/SCI
- Education and Years of Experience: Bachelor’s degree (or equivalent experience) with 2-4 years of experience, a Master’s degree with 2 years of experience, or 6 years of experience without a degree. A minimum of 2 years of specialized experience is required.
- Proficiency in one or more programming languages such as JavaScript or Python
- Has one of the following certifications: CompTIA Cloud+, CompTIA Security+, GICSP, SSCP, or GSEC
- Experience in database management or system administration
- Familiarity with CI/CD pipeline technologies and DevOps practices
- Hands-on experience with containerization technologies and orchestration platforms like Kubernetes (K8s)
- Strong analytical and problem-solving skills
- Excellent communication skills to collaborate effectively with cross-functional teams
- Strong experience using quantitative measures (e.g., metrics collection) and visualization (e.g., charts and graphs) to inform system readiness
- DevOps Institute (or similar) Certified SRE Practitioner certification
- Certified Kubernetes Administrator (CKA) certification
- Experience with reliability engineering methodologies, including statistical distributions and reliability models
- Familiarity with root cause analysis techniques and reliability testing frameworks
- Knowledge of predictive maintenance strategies and tools
- Ability to assess and refine system architecture for improved performance and reliability
EXPANSIA is an Equal Opportunity Employer – Females/Minorities/Protected Veterans/Individuals with Disabilities
Salary: $79,000-$146,000