What are the responsibilities and job description for the Operations Engineer position at IRIS Consulting Corporation?
IRIS Consulting Company is a trusted leader in meeting our clients' IT staffing needs. With offices in the Minneapolis/St. Paul and Atlanta metro areas, we have built solid business relationships with clients in the airline, manufacturing, insurance, healthcare, and tech industries. Full SDLC support means we get to know our clients and our candidates to find not just a match, but a true fit on both sides. With over 25 years of experience, we can truly deliver.
Operations Support Engineer
Location: Minneapolis, MN (Hybrid, 2-3 days per week in-office)
Travel: 0% client travel required
On-Call Rotation: One week per month (after-hours support can be provided remotely from home)
Position Summary
The Operations Support Engineer ensures the stability, reliability, and continuous improvement of our platform through effective incident management, triage, and change management processes. This role provides 24x7 on-call support, conducts Root Cause Analyses (RCAs), and collaborates with cross-functional teams to implement necessary changes and enhancements.
The engineer will generate comprehensive reports to provide insights into platform operations and work closely with client operations teams, product architecture, and platform solutions for disaster recovery planning and execution. The ideal candidate is proactive, has excellent communication skills, and is passionate about ensuring seamless platform operations.
Key Responsibilities
- Incident Management & Support
  - Provide 24x7 on-call support (one week per month).
  - Manage tier 1 operations, escalating critical issues as needed.
  - Lead Root Cause Analyses (RCAs) and conduct retrospectives for Sev 1 & Sev 2 incidents.
  - Resolve escalated tickets from client operations teams and escalate complex issues appropriately.
- Monitoring & Reporting
  - Monitor platform health and proactively enhance stability.
  - Develop dashboards, monitors, and alerts using Grafana, AppDynamics, Splunk, Elastic, CloudWatch.
  - Generate monthly operational reports and SSL certificate expiration reports (bi-monthly).
  - Provide insights to optimize platform performance and reliability.
- Collaboration & Process Improvement
  - Work with DevOps, client operations teams, and product architecture for documentation and process enhancements.
  - Assist in change management to ensure smooth deployments and minimal disruptions.
  - Manage OpsGenie configuration and escalation trees.
  - Support SOC audit compliance and security reviews.
Required Skills & Qualifications
- Incident Management & Triage Experience (24x7 support, escalations)
- Root Cause Analysis (RCA) & troubleshooting experience
- Monitoring & Alerting Tools (Grafana, AppDynamics, Splunk, CloudWatch)
- Basic Maintenance Coding skills
- Strong problem-solving & collaboration abilities
- Flexible & adaptable to new technologies
Preferred Skills
- Windows & Linux server infrastructure experience
- AWS Operations experience
- Corporate License Management knowledge
- Agile Kanban methodology experience
- Familiarity with Git, Jira, and Confluence
Education & Minimum Qualifications
- Bachelor's degree in a relevant field or equivalent experience
- Strong communication, problem-solving, and organizational skills
Equal Opportunity Employer: We encourage applications from all backgrounds, including individuals with disabilities and veterans.