Title: SRE / Site Reliability Engineer
Location: Dallas, TX (Hybrid/Onsite)
Duration: 1 Year
Skills
- Help build a Site Reliability Engineering culture by sharing your best practices, approaches, documentation, and code with other engineering teams.
- The Site Reliability Engineer is a core member of the Site Reliability Engineering team, which is accountable for the availability, reliability, and performance of services and platforms in a highly transactional 24x7 environment.
What you will do:
- Monitor application performance, take steps to improve overall application performance and stability, and follow through with implementation.
- Automate manual tasks and any other parts of the system that would benefit from automation or software.
- Troubleshoot complex issues involving the OS, networking, and databases in cloud-based SaaS and on-premises environments; handle live production incidents; debug application and infrastructure issues; and follow and implement SRE best practices.
- Coordinate with product owners and business representatives to define Service Level Objectives and error budgets for key functionalities of the projects.
- Participate in design reviews of software/components with build teams to ensure that they are built right.
- Review products prior to production deployments to validate compliance with Service Level Objectives.
- Conduct system analysis and configuration management, and develop improvements for system software performance, availability, and reliability.
- Work closely with software engineers and QA to ensure the system is responding properly to non-functional requirements such as performance, security, and availability.
- Document system knowledge as acquired over time, create runbooks and ensure critical system information is readily available to those who need it.
- Maintain and monitor deployments of servers, Docker containers, databases, and general backend infrastructure.
- Participate in production feedback sessions and problem management calls to identify opportunities for product improvement.
What you'll bring:
- Bachelor's degree in Computer Science or a related field, or an equivalent combination of education and experience
- 5 years of experience in a full-stack application support/SRE role
- Experience in JavaScript, TypeScript, and web development technologies
- Proficient in scripting languages such as PowerShell and/or Python
- Experience troubleshooting complex incidents in applications built on the AWS stack
- Experience in conducting design reviews of software components and leading performance, capacity and chaos experiments.
- Extensive experience with observability platforms (Datadog) is required. Experience with built-in browser-side diagnostic tools is expected.
- Knowledge of DevOps methodologies and the tools involved, such as CI/CD concepts, CI/CD tools (Jenkins, CodePipeline, etc.), and automation and configuration tools (Puppet, Ansible, etc.), is a plus.
- Hands-on experience with the AWS public cloud is a must; project implementation experience on a public cloud is a plus.
- Ability and willingness to adapt to new application stacks and new technology concepts as the business evolves over time
- Excellent communication skills, both verbal and written
- Ability to collaborate with local and remote teams in different time zones
- Ability to present/lead technical discussions with product, cloud COE, security and other support teams.
Key Skills: CI/CD, AWS, Datadog, Grafana, Jenkins, CloudWatch
VDart Group, a global leader in technology, product, and talent management, empowers businesses with comprehensive solutions through our four distinct, industry-leading business units. With a diverse team of over 4,000 professionals across 13 countries, we deliver strong results across various industries, including Fortune 500 companies.
Leveraging our deep expertise as a global provider of resources and solutions, we serve a wide range of industry verticals, including BFSI, Automotive, Healthcare, Mobility, Energy, Life Sciences, Manufacturing, Consumer Industries, and Technology.
With over 16 years of experience, VDart has evolved to meet the needs of leading technology brands, placing and training more than 20,000 professionals and shaping the industry's future.
Our continuous reinvention, providing resources for IT solutions and unique digital solutions, has positioned us as a top growth leader in digital talent management and technology consulting.
Committed to "People, Purpose, Planet," we prioritize social responsibility and sustainability, as evidenced by our EcoVadis Bronze Medal Certification and participation in the UN Global Compact.
Our dedication to delivering strong results has earned us recognition as a trusted advisor for businesses seeking to drive innovation and growth, including many Fortune 500 companies.
Join our network! Partner with VDart Group to leverage our global network, industry expertise, and proven track record with a diverse clientele.