What are the responsibilities and job description for the Dynatrace Engineer position at Tech Mahindra (Americas) Inc.?
Job Details
Role: Dynatrace Engineer
Location: Fort Worth, TX 76131 (Onsite)
A face-to-face interview is required after the first round.
An Operational Awareness or Monitoring and Alerting Engineer is a specialized IT professional responsible for the design, implementation, and management of monitoring and alerting systems for an organization's IT infrastructure. Their primary goal is to ensure the continuous availability, reliability, and performance of critical systems and applications. By leveraging various monitoring tools and technologies, they proactively identify and address potential issues before they impact business operations.
Key Responsibilities:
- System Monitoring: Implement and maintain monitoring solutions to track the performance, health, and availability of IT systems, applications, and networks.
- Alert Management: Configure and manage alerting mechanisms to ensure timely notifications of any anomalies, failures, or performance degradations.
- Incident Response: Collaborate with support and operations teams to analyze and resolve events, and lead the resolution process during incidents and outages.
- Root Cause Analysis: Conduct thorough investigations to determine the root cause of incidents and implement corrective actions to prevent recurrence.
- Optimization: Identify opportunities for system optimization and performance improvements through data analysis and trend identification.
- Tool Evaluation and Integration: Evaluate, recommend, and integrate new monitoring and alerting tools and technologies to enhance the organization's monitoring capabilities.
- Documentation and Reporting: Develop and maintain comprehensive documentation, including monitoring configurations, incident reports, and performance metrics.
- Collaboration and Communication: Work closely with various IT teams, including application, infrastructure, and DevOps teams, to ensure seamless operations and effective communication during incidents.
Skills and Qualifications:
- Proficiency in monitoring and alerting tools (e.g., Dynatrace, Datadog, CloudWatch, Splunk).
- Strong understanding of IT infrastructure, including servers, networks, databases, and cloud environments.
- Some experience with incident, problem, and change management processes is a plus.
- Ability to analyze complex systems and identify performance bottlenecks.
- Excellent troubleshooting and problem-solving skills.
- Effective communication and collaboration skills.
- Familiarity with ITIL best practices and service management frameworks.
Performance of Duties:
- Operate in a 7-day/24-hour environment with after-hours support flexibility.
- Collaborate with internal teams and suppliers to resolve events and lead their resolution across all mission-critical IT and Telecom service levels.
- Protect business system availability through integrated incident, problem, and change management.
- Monitor systems for faults and optimization opportunities.
- Assist the major incident response team and escalate critical events.
- Evaluate and improve monitoring/alerting tools and processes.
- Conduct technical root cause analysis and engage with management teams for internal issues.
- Identify potential business-impacting events and manage incident processes.
- Provide expert guidance during reviews and debriefs.
- Analyze problem trends and monitor tools to identify chronic activity.
- Communicate effectively with senior management.
Qualifications:
- Experience with Dynatrace, AppMon, Zabbix, SCOM, Datadog, CloudWatch, X-Ray, and Splunk.
- Self-motivated and able to work in a 7x24 environment.
- Experience managing critical system outages and interacting at all organizational levels.
- On-call support availability.
Preferred Qualifications:
- B.S. degree in Computer Science, Information Systems, or Engineering.
- Technical expertise in distributed systems/administration and general scripting/programming (Python, Node.js, Ruby, Perl, Bash/sh).
- Excellent writing and communication skills.
- ServiceNow experience.
The pay range for this role is $90,000 - $110,000 per annum including any bonuses or variable pay. Tech Mahindra also offers benefits like medical, vision, dental, life, disability insurance and paid time off (including holidays, parental leave, and sick leave, as required by law). Ask our recruiters for more details on our Benefits package. The exact offer terms will depend on the skill level, educational qualifications, experience and location of the candidate.
Tech Mahindra is an Equal Employment Opportunity employer. We promote and support a diverse workforce at all levels of the company. All qualified applicants will receive consideration for employment without regard to race, religion, color, sex, age, national origin or disability. All applicants will be evaluated solely on the basis of their ability, competence, and performance of the essential functions of their positions with or without reasonable accommodations. Reasonable accommodations are also available in the hiring process for applicants with disabilities. Candidates can request a reasonable accommodation by contacting the company ADA Coordinator.
Salary: $90,000 - $110,000