What are the responsibilities and job description for the Site Reliability Engineer (Not DevOps) position at TrueSkilla?

Role: Site Reliability and Operations Engineer (SRE) (Not DevOps)

Work Location: Irving, TX (Hybrid, 3 days)

Type: W2 Only

Duration: 12 Months

We are looking for a highly skilled Site Reliability and Operations Engineer (SRE) with extensive experience in Kubernetes-based distributed caching and compute grid solutions. This role requires a strong foundation in software development, infrastructure automation, and reliability engineering. You will design, implement, and maintain high-performance distributed systems and keep them reliable, scalable, and efficient.

Development & Implementation:

• Design, develop, and optimize distributed caching and compute grid solutions on Kubernetes/OpenShift.

• Understanding of microservices and containerized workloads using Kubernetes, Docker, and Helm.

• Implement high-throughput compute grid solutions using IBM Spectrum Symphony, TIBCO GridServer, or similar technologies.

• Optimize application performance by leveraging parallel compute strategies, load balancing, and efficient data distribution.

Site Reliability Engineering (SRE):

• Ensure high availability, scalability, and reliability of distributed systems.

• Implement observability, logging, and monitoring using tools like Prometheus, Grafana, ELK, or OpenTelemetry.

• Automate infrastructure provisioning and deployments using Ansible and Helm charts.

• Understanding of CI/CD pipelines for seamless software deployment.

• Troubleshoot and resolve incidents affecting the platform, infrastructure, and distributed compute systems, ensuring minimal downtime.

Required Skills & Qualifications:

• Strong experience in Kubernetes (OpenShift and on-prem/cloud clusters).

• Understanding of programming languages like Java, Go, or Python.

• Experience with containerization technologies (Docker, Helm, etc.).

• Strong knowledge of CI/CD pipelines (Jenkins, ArgoCD, GitHub Actions).

• Hands-on experience with observability tools (Prometheus, Grafana, Loki, Jaeger).

• Understanding of networking, service meshes (Istio/Linkerd), and security best practices in Kubernetes.

• Experience with multi-cluster and hybrid cloud Kubernetes deployments.

Salary: $80 - $90
