Job Description
Site Reliability Engineer (SRE) – Networking & Core Systems Specialist
Location: Las Vegas, NV (Work from Office)
Full-time
Position Overview
We are seeking a highly skilled Site Reliability Engineer (SRE) with expertise in networking and core infrastructure systems to join our team. This role focuses on building and maintaining reliable, scalable, and secure infrastructure to support our growing operations. The ideal candidate will have a strong background in networking, systems design, and automation, coupled with a passion for improving system reliability and performance.
Key Responsibilities
System Reliability & Performance:
- Ensure the availability, performance, and scalability of critical networking and core systems.
Network Design & Optimization:
- Design, implement, and maintain highly available and secure network architectures, including LAN, WAN, VPN, and SDN solutions.
Incident Response:
- Monitor system health, respond to incidents, and perform root cause analysis to prevent recurrence.
Automation & Infrastructure as Code (IaC):
- Develop and maintain scripts and automation tools to manage infrastructure, deployments, and configurations.
Capacity Planning:
- Analyse system performance trends and plan for future capacity needs.
Collaboration:
- Work closely with development, security, and operations teams to design systems that meet operational and business requirements.
Documentation & Training:
- Maintain clear documentation of systems and processes, and mentor team members on best practices.
Security & Compliance:
- Implement and enforce security protocols and ensure compliance with organizational and industry standards.
Required Skills & Experience
Networking Expertise:
- Strong understanding of networking protocols (TCP/IP, DNS, HTTP, BGP, etc.).
- Experience with network monitoring and troubleshooting tools (e.g., Wireshark, NetFlow, or similar).
- Familiarity with load balancing, CDN, and DDoS mitigation strategies.
Systems Engineering:
- In-depth knowledge of Linux/Unix systems.
- Experience managing large-scale distributed systems and services.
- Proficiency in storage and virtualization technologies (e.g., VMware, Kubernetes, Docker).
Programming & Automation:
- Proficiency in one or more scripting/programming languages (Python, Go, Bash, etc.).
- Experience with configuration management tools (e.g., Ansible, Terraform, Chef, or Puppet).
Monitoring & Observability:
- Familiarity with monitoring and alerting tools (Prometheus, Grafana, ELK stack, or equivalent).
Cloud & Hybrid Environments:
- Experience with cloud providers (AWS, Azure, GCP) and hybrid cloud setups.
Problem-Solving:
- Strong analytical and troubleshooting skills in complex environments.
Preferred Qualifications
- Certifications: CCNA/CCNP, RHCE, or equivalent networking and systems certifications.
- Experience with SD-WAN or advanced network virtualization technologies.
- Familiarity with CI/CD pipelines and DevOps practices.
- Exposure to advanced security frameworks and technologies (e.g., Zero Trust Architecture).