What are the responsibilities and job description for the Application Monitoring Engineer position at RIT Solutions, Inc.?
I need someone with STRONG AWS, Python, Datadog.
Remote, 12-month contract
Job Description:
The Application Monitoring Engineer is responsible for delivering and supporting enterprise-level, full-stack monitoring focused on the user experience. This includes monitoring components such as integrations, external services, infrastructure components, transactions, and business activities for applications that may run on-prem or in the cloud (AWS, GCP, Azure). As part of the Enterprise Monitoring & Alerting Team, this position will be involved in crafting and implementing the monitoring strategy and tool usage documentation for application monitoring using Datadog, Dynatrace, and other enterprise monitoring toolsets (Splunk, ELK, CloudWatch, and others). This individual will work primarily with Datadog. They will also collaborate with application teams and other stakeholders in the monitoring space to gather monitoring requirements and to deliver and maintain solutions.
Job Responsibilities:
- Support application monitoring for .NET, Java, SaaS, AWS, and other application architectures based in the cloud (AWS, GCP, Azure) using Datadog
- Understand the differences between PaaS, IaaS, and SaaS and best practices for the monitoring of each in the cloud
- Understand log analytics monitoring within Datadog and ELK
- Monitor the responsiveness and availability of critical websites and web applications from the end-user perspective
- Monitor infrastructure components for CPU, memory, disk space, and I/O
- Provide deep visibility into on-prem, cloud, and hybrid application performance
- Implement application monitoring, synthetic monitoring, and real user monitoring solutions following best practices across the enterprise
- Collaborate with business and technology to design and implement performance benchmarks for each application, and report results periodically
- Develop processes to monitor and alert on critical business transactions and applications so that issues are caught before they cause impact
- Participate in discussions of application performance and infrastructure outage incidents to provide solutions
- Collaborate with all relevant IT resources to develop preventive measures and automated remediation of production issues
- Participate in continuous improvement initiatives to enhance client service and efficiency
Top Skills Details
1. 3 years of application performance monitoring experience (ideally some Datadog and ELK)
2. 3 years of enterprise application support experience: Java, .NET, SaaS, AWS, GCP, Azure
3. 3 years of software, web, or distributed application architecture experience, with a strong understanding of logs, log analytics, and log monitoring