
Lead Site Reliability Engineer

VeriiPro
Irving, TX Contractor
POSTED ON 2/6/2025
AVAILABLE BEFORE 4/3/2025

In this role, you will:


·Lead complex technology initiatives including those that are companywide with broad impact

·Act as a key participant in developing standards and companywide best practices for engineering complex and large-scale technology solutions for technology engineering disciplines

·Design, code, test, debug, and document for projects and programs

·Review and analyze complex, large-scale technology solutions for tactical and strategic business objectives, enterprise technological environment, and technical challenges that require in-depth evaluation of multiple factors, including intangibles or unprecedented technical factors

·Make decisions in developing standards and companywide best practices for engineering and technology solutions, which requires an understanding of industry best practices and new technologies, and influence and lead the technology team to meet deliverables and drive new initiatives

·Collaborate and consult with key technical experts, senior technology team, and external industry groups to resolve complex technical issues and achieve goals

·Independently troubleshoot and analyze production job failures related to data, network file delivery, and server and application issues, and provide solutions for recovery. Participate in root cause analysis and preventive actions to avoid recurring incidents.

·Participate in the buildout of automation to prevent problem recurrence, with the goal of automating response to all non-exceptional service conditions.

·Apply a technology background in software engineering and systems engineering to ensure that applications onboarded to SRE are available, have full-stack observability, are integrated with CI/CD, and are always on, by introducing continuous improvement through code and automation and continuous testing (performance, functional), and by providing operational insight through analytics.

·Assess the availability of critical business flows, identify service level objectives and indicators, and conduct destructive and resiliency testing to reach 99.995% availability for the firm's critical products and services leading to improved customer experience and customer satisfaction.

·Develop original and/or complex code, provide coding guidance/review, and create documentation

·Introduce enterprise capabilities, tools, and innovation to improve availability in a multi-cloud ecosystem by evolving observability, monitoring, logging, CI/CD integration, continuous testing (performance, functional), continuous improvement, and standardization/automation of key SRE metrics and IT Service Operations processes.

·Evolve continuous inspection capabilities for code quality to identify problems before they manifest in production.

·Introduce and expand AIOps and robotic process automation (RPA) to solve complex operational and systemic issues and to improve availability of products to customers.

·Share support responsibilities for critical applications: identify systemic issues, conduct blameless postmortems and root cause analysis, and introduce strategic solutions in code that solve the problem and eliminate repeat issues.

·Be willing to work non-standard business hours on an on-call basis in a 24x7x365 environment.

·Lead projects or teams, or serve as a peer mentor

 

Required Qualifications:

 

·5 years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

·5 years of troubleshooting and systems administration experience across multiple OS platforms: Solaris, AIX, PKS, Kubernetes, OpenShift, Linux, Windows, VMware

·3 years of experience with web platforms: Java, Apache, Tomcat, WebLogic, Oracle

·2 years of experience with database technologies: Basic SQL, Cassandra, Oracle, PostgreSQL

·2 years of experience with observability tools: Traffic Manager, Message Processor, AppDynamics, Filebeat, Basemon, etc.

·2 years of experience using logging/monitoring tools: ELK, Filebeat, Splunk, Netcool, SiteScope, Kafka

 

Desired Qualifications:

 

·5 years of software development experience with languages and technologies such as Perl, Python, Java, JavaScript, Ruby, JSON, Angular, Node.js

·2 years of experience with automation scripting: Bash, Shell, Ansible, Terraform, Azure DevOps

·1 year of experience with cloud technologies: PCF, Azure, AWS, GCP, etc.

·2 years of Incident Management System experience

·2 years of experience with Agile Scrum (Daily Standup, Sprint Planning, and Sprint Retrospective meetings)

·2 years of experience using JIRA

·2 years of experience with data services platforms: Big Data, Data Lake, Hadoop, Spark

·1 year of experience with AIOps tools: BigPanda, Moogsoft

·Experience with one or more CI/CD pipelines (GitHub, Jenkins) and automation tools: Gradle, Maven, Git, Ansible, Puppet

·Experience with one or more observability/monitoring tools: Elastic, Kibana, Grafana, AppDynamics, Kafka, BigPanda, Splunk

·Experience with one or more data tools and platforms: Kafka, Apache Airflow, Logstash, Spark, Oracle, SQL, Mongo, Hadoop, Cloudera, AWS EMR, S3

·Knowledge of one or more additional capabilities: UiPath, Robotic Processing, and Capacity Management

·An industry-standard certification



