What are the responsibilities and job description for the Site Reliability Engineer (SRE) and AppDynamics Monitoring Specialist position at Accord Technologies Inc?
Location: Piscataway, NJ
Mandatory skills: Site Reliability Engineering (SRE) with expertise in AppDynamics and monitoring solutions, plus CI/CD.
Candidates should have experience in the banking or insurance domain.
Job Summary: As an SRE and AppDynamics Monitoring Specialist, you will be responsible for ensuring the reliability, availability, and performance of our applications and infrastructure. You will leverage AppDynamics to monitor applications, troubleshoot issues, and optimize performance. You will collaborate with development and operations teams to implement best practices in monitoring, incident management, and performance tuning.
Key Responsibilities
- Monitoring and Performance Management:
  - Implement and manage AppDynamics monitoring solutions for applications and infrastructure.
  - Analyze application performance metrics and identify areas for improvement.
  - Create dashboards and reports to visualize application performance and health.
- Incident Management:
  - Respond to incidents in a timely manner, coordinating with development and operations teams.
  - Conduct root cause analysis for incidents and implement corrective actions.
  - Document incidents and resolutions for future reference.
- Automation and Tooling:
  - Develop and maintain automation scripts to streamline monitoring and incident response processes.
  - Collaborate with DevOps teams to integrate monitoring solutions into CI/CD pipelines.
- Collaboration and Support:
  - Work closely with development teams to ensure applications are designed for reliability and performance.
  - Provide support and training to team members on monitoring tools and practices.
- Capacity Planning and Optimization:
  - Analyze system capacity and performance trends to forecast future needs.
  - Optimize applications and infrastructure for cost-effectiveness and performance.
- Documentation and Best Practices:
  - Create and maintain documentation for monitoring processes, incident response procedures, and system architecture.
  - Promote best practices in application monitoring and reliability engineering.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or related field.
- 3 years of experience in Site Reliability Engineering, Application Monitoring, or related fields.
- Proficiency in AppDynamics or similar monitoring tools (e.g., New Relic, Dynatrace).
- Strong understanding of application architecture, cloud environments, and microservices.
- Experience with automation tools and scripting languages (e.g., Python, Bash).
- Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes) is a plus.
- Excellent problem-solving skills and ability to work under pressure.
- Strong communication and collaboration skills.
- Experience with incident management tools (e.g., PagerDuty, ServiceNow).
- Knowledge of CI/CD practices and tools.
- Understanding of networking concepts and protocols.