What are the responsibilities and job description for the Monitoring Consultant (Solar Winds/Ansible) position at S R INTERNATIONAL INC?
State of PA - Monitoring Consultant (Solar Winds / Ansible) 756032 (Local Only / Hybrid)
DESCRIPTION OF DUTIES :
- Responsible for functioning as the Technical SME on monitoring tools and processes
- Responsible for collaborating with technical specialists, agency teams, and vendors to implement actionable monitoring and reporting.
- Technical SME for management and continuous improvement of key enterprise-wide systems and processes, including changes, incident reporting, and problem resolution.
- Responsible for implementations of products / services that involve significant Commonwealth oversight.
- Interpret, process, and report data to create meaningful business and operational dashboards.
- Maintain (patch, troubleshoot) existing and future monitoring tools including System Center Operations Manager, SolarWinds, SightLine, and SquaredUp.
- Identifies improvements to existing processes and tools to achieve high quality services / products.
- Create Azure Monitor resources and Log Analytics queries.
- Create, document, and maintain on-prem and cloud automations.
- Create, document, and maintain SOAP / REST / JSON / API calls using PowerShell or other compatible languages.
- Maintain and troubleshoot monitoring tool connectivity to endpoints.
- Creates documentation for new processes
- Updates documentation for existing processes
- Documents incidents and problems impacting monitoring services.
- Collaborate with the enterprise change manager to ensure processes are standardized and documented workflows are followed.
- Develop and maintain standard operating procedures (SOPs)
- Collaborate with the Enterprise Incident Manager to ensure that standardized SOPs and processes are consistently applied across incident and problem management.
- Monitor incident and problem resolution processes to ensure timely and effective service restoration and root cause analysis.
- Manage and document the operational procedures and responses of NOC teams to service delivery and incident management.
- Ensure all processes and workflows are documented in an accessible, organized, and secure manner for future reference.
- Establish and maintain Standard Operating Procedures (SOPs) for all relevant operational processes.
- Emphasize the transition from informal, person-dependent workflows to formal, role-driven processes.
- Develop and document a process documentation workflow that ensures all operational procedures are captured and updated regularly.
- Ensure that consistent and clear communication processes are in place for changes, incidents, and problem management across the NOC.
- Create and manage distribution lists for technical and non-technical stakeholders (ETSO) to ensure relevant parties are informed of NOC updates.
- Enable self-management of distribution lists via subscription options to streamline communication across the organization.
- Work closely with NOC staff to ensure effective communication regarding change, incident, and problem management on behalf of NUTSO.
- Ensure collaboration between different departments to harmonize efforts in incident, problem, and change communication.
- Complies with and develops recommendations for executive public and enterprise policy objectives as they relate to the delivery of Commonwealth IT services.
- Utilizes the ServiceNow change management tool to input requests for change.
- Directs the development of policies and procedures consistent with Commonwealth standards and direction.
- Participates in enterprise change management meetings to ensure that enterprise-level service configuration and access for all supported locations are not impacted by changes.
- Provides on-going data submissions regarding network availability, problem resolution and infrastructure enhancements for use in compilation of the monthly / quarterly customer Service Level Agreement (SLA) reports.
- Designs agency disaster recovery plans for the network infrastructure and participates in periodic plan updates and testing exercises.
- Reviews technical manuals and other literature, attends seminars, conferences, and training classes to maintain currency with new information services, products, and information technology developments in network technology.
- Performs other related duties as assigned, to include those outlined in the CoG Plan when the Plan is activated.
- Responds to the designated alternate or secondary location when directed in response to a catastrophic incident.
- This position is expected to adhere to established organizational service management processes and procedures.
Skills Required :
- SolarWinds admin / deployment experience
- Ansible admin / deployment experience
- Azure Log Analytics experience
- MS Windows Server admin / deployment experience
- Linux Server admin / deployment experience
- PowerShell scripting experience
- Incident Management Experience
Salary : $50 - $60