
Site Reliability Engineer

DMSi Software
Omaha, NE Full Time
POSTED ON 1/26/2025
AVAILABLE BEFORE 3/26/2025

*** Applicants must currently reside in the Omaha metro area ***

As a Site Reliability Engineer, your primary responsibility will be to review, optimize, and complete the monitoring and alerting systems for our applications. You will work closely with development, operations, and product teams to ensure that our monitoring systems provide clear, actionable data and that our alerting mechanisms are finely tuned to detect issues before they impact our customers. Your work will be pivotal in transforming raw data into actionable intelligence, improving system observability, and enhancing the overall user experience.

RESPONSIBILITIES AND DUTIES:

  1. Monitoring and Observability: Evaluate existing monitoring systems and implement improvements to ensure comprehensive observability across all systems and environments. Develop and maintain dashboards and reports that provide real-time visibility into system health, capacity/utilization trends, and performance.
  2. User Experience: Ensure that the overall system environment operates nominally by monitoring critical performance indicators. Provide insights into system status that help maintain a smooth and uninterrupted user experience.
  3. Alerting Optimization: Review and refine alerting mechanisms to minimize false positives and ensure timely and accurate notifications for critical issues. Develop escalation processes and response playbooks to streamline incident management.
  4. Data Analysis and Insights: Analyze monitoring data to identify trends, anomalies, and potential areas of improvement. Provide actionable insights to relevant teams and support data-driven decision-making, using machine learning to distinguish normal from abnormal system behavior.
  5. Collaboration: Work closely with software engineers, DevOps teams, and other stakeholders to ensure monitoring and alerting systems are aligned with business goals and technical requirements.
  6. Automation and Tooling: Develop and maintain automation scripts and tools to streamline monitoring and alerting processes, reducing manual effort and improving efficiency.
  7. Documentation and Training: Document monitoring and alerting systems, processes, and best practices. Provide training and guidance to teams on how to use monitoring tools and interpret data.
  8. Continuous Improvement: Continuously assess and improve monitoring and alerting strategies to adapt to changing technologies and business needs. Stay updated with industry trends and emerging tools in the observability space.

KNOWLEDGE, SKILLS, AND ABILITIES:
Strong experience with monitoring and observability tools (e.g., Nagios, Prometheus, Grafana, ELK Stack, Datadog, New Relic).
Proficiency in scripting languages (e.g., Python, Bash, PowerShell) for automation.
Familiarity with cloud platforms (AWS, Azure, GCP) and hybrid cloud environments.
Understanding of infrastructure-as-code tools (e.g., Terraform, Ansible).
Knowledge of CI/CD pipelines and version control systems (e.g., Git, Jenkins).
Basic understanding of networking, security, and system administration.

EDUCATION AND EXPERIENCE:
Bachelor's degree in Computer Science, Engineering, a related field, or equivalent experience.
Minimum of 3 years of experience in a Site Reliability Engineering or similar role, with a focus on monitoring and alerting in a SaaS environment.

WORK ENVIRONMENT AND PHYSICAL DEMANDS:
Normal office environment with use of computers and telephone systems; no unusual physical demands.
Travel as needed, including business air travel and car rental.

 

 



