Site Reliability Engineering (SRE)

Omni Inclusive
San Leandro, CA Full Time
POSTED ON 2/8/2025
AVAILABLE BEFORE 5/7/2025

We are seeking an SRE candidate with a strong Java development background and strong hands-on experience building dashboards and setting up alerts using Splunk, Grafana, and GCL.

Required Qualifications:

  • 10 years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
  • 10 years of experience in Production support / Site Reliability Engineering teams with continued focus on improving Platform health
  • Familiar with Agile or other rapid application development practices
  • Hands-on expertise with Automated testing, Process Automation & building dashboards using APM tools.
  • Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases.
  • Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka.
  • Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, Prometheus, etc.
  • Able to create dashboards using GCL / Splunk / ELK and set up alerts.
  • Working knowledge of CI/CD is a plus: source control such as Git, and continuous integration tools such as Jenkins / UCD Release.
  • Ability to work with engineering teams across the ecosystem, such as Security, Networking, and Infrastructure, on challenges that can impact platform health and resiliency.
  • Shell scripting and DevOps tools such as Ansible, with good knowledge of YAML for writing playbooks.
  • Experience with distributed storage technologies such as NFS, as well as dynamic resource management frameworks such as PCF, Kubernetes / OpenShift, AWS, or Azure.
  • Tech Stack: Java / J2EE (Spring, Spring Boot), Python, Shell Scripting, Kafka, Oracle, MongoDB, etc.
  • Able to work on shift duty in a 12 / 7 support organization.

Job Expectations:

  • You will be a core member of an SRE support team, using the latest tools to write code and test cases, work with API specs, and automate tasks to maintain the resiliency, performance, and availability of Digital Sales & Marketing platforms.
  • Strong and relevant experience supporting Web / API platforms built on a Java / JavaScript stack (Spring / Spring Boot, JavaScript with Angular / React).
  • Proficiency in dealing with legacy infrastructure along with cloud infrastructure (on-prem and 3rd party) such as PCF or Azure.
  • Identify opportunities to adopt new technologies, remove toil, and continue to drive efficiency and optimization.
  • Proactive monitoring of app performance through Splunk, app dashboards, AppDynamics, Dynatrace, etc.
  • Represent Platform engineering teams during production outages and collaborate with engineering teams to resolve them. Collaborate with stakeholders across engineering functions to own and derive the root cause analysis (RCA) and work towards permanent resolution.
  • Plan, support, execute and comply with governance programs / processes in support of a strong control environment in your functional area. Leverage process documentation to improve operational controls and identify and remediate process deficiencies.
  • Proactively identify, communicate, mitigate and escalate risk originating from non-compliance of processes, operational errors, and data integrity issues in all applicable processes.
  • Ability to influence SRE practices within and outside teams to enable a strong DevOps culture within the organization.
  • Responsible for working with Engineering teams to maintain the SLAs & SLOs. Constantly looking out for opportunities to improve platform metrics & communicate the same to stakeholders.
  • Exposure to and proficiency in different API styles such as SOAP, REST, microservices, etc.
  • Working knowledge of Unix, Linux, and Postman.