What are the responsibilities and job description for the Site Reliability Engineering Manager position at Litmus7?
Location : San Ramon, CA - ONSITE - NO REMOTE
A Site Reliability Engineer is a professional who acts as a warrior, monitoring and protecting customer applications and taking charge of
operational tasks to ensure the efficient functioning of a system. They are responsible for monitoring, automating, and
improving the reliability, performance, and availability of applications.
Mandatory Requirements :
We require a profile with 12 years of experience on the e-commerce application side (application reliability). Infrastructure-side profiles (system reliability) will not be considered.
Must have knowledge of Production Application Support as an SRE L2 lead.
Solid Level 2 support experience with eCommerce platforms (Shopify, Blue Yonder).
Hands-on experience in monitoring, logging, alerting, building dashboards, and generating reports in any monitoring tool (such as AppDynamics / Splunk / Dynatrace / Datadog / CloudWatch / ELK / Prometheus / New Relic). The customer for this engagement uses New Relic and PagerDuty, so expertise in these tools is good to have.
Should know how to write SQL queries to fetch details from New Relic, databases, etc.
Should know how to investigate logs and leverage basic Java skills to write scripts.
Must have knowledge of the ITIL framework, specifically alerts, incident management, change management, CAB, production deployments, and risk and mitigation plans.
Should be able to lead P1 calls, brief the customer on the P1, and proactively bring leads and customers into P1 calls through to RCA.
Experience working with Postman.
Should have knowledge of building and executing SOPs and runbooks, and of handling any ITSM platform (JIRA / ServiceNow / BMC Remedy).
Should know how to work with the Dev team and cross-functional teams across time zones.
Should be able to generate WSR / MSR by extracting tickets from ITSM platforms.
Non-Technical Requirements :
Ability to clearly communicate and understand a technical idea / concept.
Ability to work in a professional environment while interacting with peers and stakeholders, and to collaborate with offshore teams.
Excellent written and verbal communication skills.
Motivated, goal-driven, influential, innovative, curious, open-minded, fun to work with, and collaborative.
Capability to work with people in different time zones.
Ability to operate in a fast-paced, evolving environment, prioritize tasks appropriately, and keep abreast of the latest technology.
Collaborate with the cloud architecture, infrastructure, project management, technology services, and management teams.
Create and maintain detailed documentation.