What are the responsibilities and job description for the Site Reliability Engineer position at Tentek, Inc.?

Hybrid: some sites require 2 days onsite, others 3 days, and others 4 days, depending on location. Candidates may sit in Anaheim, CA; Glendale/Burbank, CA; Orlando, FL; or Seattle, WA.

W2 candidates only - No 1099 or C2C candidates please!

The role will support/cover chaos testing for R project to increase coverage of the DX portfolio and roll out resiliency mitigations.

Top skill sets needed: Build/Release; Unix System Administration; Infrastructure as Code (Terraform, Helm, or Chef); experience launching products in a variety of hosting solutions, including public (Google, AWS, Azure, SalesForce) and private cloud systems; experience with chaos testing and relevant software (Gremlin, FIS); Golang or Python; Node.js; Java; CI/CD (Jenkins or GitLab); networking basics; OAuth2; etc.

Manager Call Notes:

The following is additional info on this position from a qualifying call with the manager:

The work locations candidates may reside in are Orlando (FL), Seattle (WA), Glendale (CA) and Anaheim (CA).

• This is a hybrid work schedule ranging from 2 to 4 days onsite per week, depending on the policy of the work location.

Some sites require 2 days, some 3 days and Amy believed that Glendale may require 4 days/week onsite.

• Candidate should have an operational background and good communication skills.

• The candidate will be working with both product and application teams.

• Chaos testing experience is highly preferred but not mandatory.

Experience with Gremlin or FIS is a plus, and experience with other chaos testing software is acceptable, but what matters most is that the candidate understands the concept of chaos testing.

• Preferred programming languages are Golang or Python, and the manager would like someone close to intermediate level.

• All other key requirements are typical SRE skills: CI/CD, Kubernetes, Docker, Terraform, Build & Release, cloud proficiency with AWS, Azure, or GCP (AWS preferred), ECS, monitoring tools (Splunk, AppDynamics, Grafana, Prometheus, etc.), Helm, Chef.

• Candidate may be required to be on call on a rotating basis.

EXTERNAL JOB DESCRIPTION:

Our Mission statement

● Reduce/Eliminate Guest Impacting Incidents/Outages across the Guest Experience portfolio

● Allow the product teams to focus on development and enhancement of our Products

Qualities we are looking for:

● You like working with clients - you will work with customers/product engineering to gather requirements. You like hearing stories.

● You have a passion for improvement - you have passion for improving processes (e.g. through less code, fewer manual steps, fewer systems, improving velocity).

● You are law-abiding but an agent of change - you will advocate compliance with known standards and engage engineers to improve upon processes

● You are a team player - you mentor others and contribute support documentation; here, heroes work at enriching the team

● You can multitask - you are action oriented and capable of working on concurrent projects

● You have a developer mindset and are comfortable writing code

● With an operations mindset you have some experience in maintaining production systems

Expectations

In this job you are:

● Responsible for creating breakdown of tasks to meet project objectives

● Responsible for on-time ticket and task completion

● Responsible for turning strategy into multiple project objectives

● Responsible for sharing your work/experiences with the greater org

You will:

● Create/maintain/improve/troubleshoot SDLC pipelines

● Create/maintain/improve/troubleshoot monitoring technologies

● Create/maintain/improve/troubleshoot infrastructure technologies (cloud and on prem)

● Create/maintain/improve documentation on the technologies that the team builds

● Shadow operation and engineering team members in their areas of subject matter expertise

Basic Qualifications

● Have expert Build/Release skills - you will work with product development teams across the enterprise to test in code delivery SDLC pipelines

● Have expert monitoring skills - you will work on ensuring that the monitoring tools stay up and are effective at notifying on guest-facing issues.

● Have expert team communication skills - you will work to ensure that the larger team understands and approves of your solutions.

● Have expert technical fundamentals - you must have expert level command of Unix System Administration duties

● Have experience in the public cloud - you are proficient with launching products in a variety of hosting solutions, including public (Google, AWS, Azure, SalesForce) and private cloud systems.

● Have experience in Infrastructure as Code (IaC) - you subscribe to an Infrastructure as Code mindset (Terraform, Helm, Chef)

● Experience with chaos testing and relevant software (Gremlin, FIS)

● Pursuing a degree in Computer Science or related technical experience and authorized to work in the U.S. without requiring sponsorship now or in the future.

Preferred Qualifications

● Previous internship or large-scale project experience

● Experienced with at least one of the following languages: Golang or Python

● Familiarity with Node.js, Java

● Have worked with CI/CD tooling such as Jenkins or GitLab

● Preferred experience with alerting and monitoring tools such as AppDynamics and Splunk

● Familiarity with:

○ SDLC Build and Release processes

○ Building Docker images

○ Container orchestration: Kubernetes and ECS

● Proficiency with one of the following cloud providers: AWS, Google, or Microsoft

● Proficiency with:

○ Terraform, Helm, or Chef

○ Networking basics (routing, firewalls, AWS security groups)

○ Troubleshooting / analysis of applications: Splunk, AppDynamics, Grafana, etc.

○ OS performance troubleshooting and ability to install and configure operating system packages

● Familiarity with:

○ OAuth2

○ Security principles on patching, compliance, change control process

Required Education

● Pursuing a degree in Computer Science or related tech
