What are the responsibilities and job description for the Site Reliability Engineer position at Mumba Technologies, Inc.?
Site Reliability Engineer
Full Time Permanent
Onsite in Colorado Springs, CO
Responsibilities
- Participate in designing software solutions that achieve the best customer value
- Translate customer and business requirements into functional and technical solutions, and provide input on the supportability of functional requirements and designs
- Lead the ongoing operational support initiative to ensure day-to-day web operations run smoothly, and manage issue escalations.
- Schedule and conduct operational reviews to ensure SLAs are aligned with business goals and objectives.
- Strong ability to troubleshoot and drive issues to closure. Previous experience in managing a full ecosystem would be an added bonus (across all layers, such as web, app, DB, third-party integrations, etc.)
- Exceptional ability and strong desire to lead and mentor junior resources to reach full potential – Required.
- Be a strong enabler and collaborator – willingness to share ideas, documentation and best practices.
- Ability to work across global teams and different time zones
- Excellent interpersonal, oral/presentation and written communication skills in both technical and non-technical language.
- Pursue ongoing learning opportunities to strengthen skill-sets
Understanding of Web technologies
Expert Apache web server hands-on experience including all of the following:
• Rewrite rules, redirects, regular expressions
• Proxy: use of reverse proxies, restrictive forward proxies, Apache mod_proxy configuration and tuning/tweaking
• Certificates, SSL configs, TLS, cipher configuration and hardening
• Apache debugging; significant experience analyzing logs and troubleshooting issues
• Ability to compile Apache from source distribution.
Expert understanding of Java-based application servers (AEM, JBoss, Tomcat):
Advanced JVM configuration
• Run-time parameters
• Garbage collection
• Heap configurations
10 years of demonstrated experience debugging heap dumps, thread dumps, and memory leaks
10 years of demonstrated experience analyzing logs and troubleshooting issues
Expert Linux skills
Scripting languages: Shell, Perl, Python
10 years of experience with system admin skills on Linux
• Patching and system monitoring
• Experience in compiling packages using gcc etc.
• Experience with security hardening of Linux systems and services
Hands-on AWS skills
• 5 years of experience in solutioning and configuration/deployment of AWS services, including Elastic Beanstalk, Kubernetes, API Gateway, load balancers, target groups, EC2 instances, S3, RDS, etc.
• Understand network routes and subnets (basic understanding) and VPCs
• System updates/patching
• Using CloudWatch and other alert mechanisms. Experience with network configuration, DNS routes and subnets, zone configuration and troubleshooting.
Experience in vulnerability management and mitigation. Ability to interpret penetration test and vulnerability reports and determine mitigations.
Experience in implementing monitoring/alerting frameworks for tracking outages and capturing analytics, using tools like New Relic.
GTM experience (Akamai, Cloudflare) desirable.