What are the responsibilities and job description for the Major Incident and Event Manager position at cloudingest inc?
Job Details
Title: Major Incident and Event Manager
Job Location: California, United States.
Work Style: Hybrid; 3 days/week in the office is required.
Position Overview and Responsibilities:
The Major Incident and Event Manager will report to the Manager, ITSM, and manage all of the client's IT-related Critical Incident and Event Management activities.
The Major Incident and Event Manager applies understanding and knowledge of information systems products and services to assist in the management of Major Incidents and Event Management.
This role will ensure that the integration, correlation, and consolidation of events across domains are standardized and centralized in the global event management platform with respect to the published architectural and process standards.
Assists users and colleagues in resolving all outage-related problems and questions.
The Major Incident and Event Manager must have superlative written and oral communication skills and a proven record of high-quality work.
Must have a valid California driver's license.
Responsibilities and Key Deliverables:
Undertakes immediate efforts to ensure effective and rapid response and restoration (Crisis / P1 / P2)
Advocates for Tier 2 and Tier 3 technical teams, and business units
Researches, identifies, and proposes viable solutions for major incident process
Performs incident management functions per the Information Technology Infrastructure Library (ITIL) and serves as the incident owner throughout the incident lifecycle
Researches issues and escalations, convening escalation bridges with appropriate Tier 2 and Tier 3 groups as necessary
Develops, tracks, and presents key Incident Management metrics
Deconstructs major incidents to identify issue lifecycle versus root cause
Coordinates identification and resolution of major incidents with resolvers
Obtains and documents accurate updates on the work being done to resolve the outage
Documents and updates appropriate communications, phone portals, and service portals wherever applicable during an outage
Coordinates the logistics around and conducts related audits of major incidents, including sample selection, documentation, and communication of results.
Ensures compliance with requirements, processes, and procedures. Ensures timely completion, management, and control of deliverables.
Ensures conformance to, and provides a high level of expertise on, incident tool(s), knowledge management tool(s), and quality management tool(s), processes, and procedures
Serves as technical evaluator for support plans and Knowledge Articles for known issues. Reviews knowledge management documentation and recommends improvements.
Contributes analysis and documentation to Known Error Database
Interprets and implements incident standards and requirements
Adheres to and maintains high levels of expertise in all incident management support processes, procedures, and expectations established by management.
Assists with the updating of SOPs, work instructions, checklists, and various other documents.
Accountable for supporting the strategic planning and design of the Monitoring & Event Management framework.
Ensures that the integration, correlation, and consolidation of events across domains are standardized and centralized in the global event management platform (AIOps) with respect to the published architectural and process standards.
Identifies opportunities for standardization and process improvement, with the goal of enhancing the customer experience.
Proactively collaborates with all service owners (esp. CX, Domains and Managed Service Providers) to ensure that the event management framework meets the expectations of all key stakeholders, creates value, and drives effective decision-making and continuous improvement of services and service components.
Proactively identifies training opportunities to execute on the organization's overall goals
Meets or exceeds all Goals and Objectives and Service Level Targets
Provides input to senior team members regarding outage-related actions/activities
Works on-call hours, including 24/7 coverage, per the SOPs
Qualifications and Key Skillset for this Role
Eight (8) or more years of Critical Incident Management experience, with at least five (5) years in an ITIL Event Management role with a focus on protect, detect, and respond, in addition to the following:
8 years of experience in Critical Incident Management
5 years of experience in ITIL Event Management
Demonstrated experience using ServiceNow ITSM (Incident, Major incident and Event Management) products
A solid understanding of ITSM with practical experience designing, implementing, and supporting ITIL improvements
Bachelor's degree in Computer Science, Information Management, or a similar technical field from an accredited institution required.
Significant experience may be considered in lieu of a degree: in lieu of a Bachelor's degree, a minimum of twelve (12) years of relevant work experience is required.
Certifications:
ITIL v3 or higher
ITSM Certifications
Ideal Resource for this Role
Strong Knowledge of the following:
Major ITSM processes including Critical Incident management, Problem management, Event Management and Request Management
Current business practices and computing systems, IT development methodologies and operations.
Program and project management and planning, process mapping.
Healthcare issues, information systems, management issues, and current trends.
Conceptualizing business strategies while implementing information systems and technology strategic direction.
Highly tenacious, with high resistance to stress
Uses logic, methods, and tools to solve problems with effective solutions
Ability to coordinate and drive conference calls
Excellent organizational and time management skills
Displays basic Project and Problem Management skills and abilities
Ability to recognize errors and correct them to meet organizational standards
Ability to troubleshoot problems and work with other groups to find solutions
Extremely detail oriented
Capable of multitasking and managing multiple events simultaneously
Proven ability to analyze and report on various levels of data and metrics
Ability to follow outlined processes and procedures
Ability to communicate effectively, articulately, and diplomatically across all levels of the organization
Ability to follow verbal and written instructions
Ability to work independently with little supervision
Be a subject matter expert with a hands-on approach in a complex fast-paced business environment.
Present issues and challenges in senior management forums.
Work with a team of professionals from various disciplines.
Lead through times of change, disruption, and growth.
Strong planning, organization, critical thinking, decision-making and communication (verbal and written) skills.
The ability to work as a member of a team, willing to be flexible and adaptable to change in a dynamic work environment, and the ability to learn and apply new concepts.