What are the responsibilities and job description for the DevOps Engineer position at Beacon Systems, Inc.?
Job Details
DevOps Engineer, Infrastructure Automation
Location: Dallas, TX
Contract Duration: Initially 12 Months
The Role
We are seeking a highly skilled and motivated Senior DevOps Engineer with a strong background in infrastructure, compute, and storage automation to join our Storage and Compute Platform Management team. This is a contractor role focused on building scalable, reliable, and automated infrastructure systems that power our high-performance computing (HPC) and storage environments.
The successful candidate will play a key role in automating the provisioning, configuration, monitoring, and management of our compute and storage infrastructure, which supports multimegawatt CPU and GPU farms used for cutting-edge quantitative research and machine learning workloads. This is an exciting opportunity for someone passionate about infrastructure at scale, automation, and performance, with a forward-thinking mindset and a collaborative attitude.
Key Responsibilities
- Design, develop, and maintain automation frameworks for provisioning and managing HPC and storage infrastructure.
- Implement infrastructure-as-code and configuration management best practices to ensure consistency and repeatability.
- Collaborate with platform teams to improve scalability, reliability, and observability of systems.
- Troubleshoot performance, reliability, and scale issues across a variety of infrastructure components.
- Drive continuous improvement through automation, performance tuning, and capacity planning.
- Support the deployment and operations of distributed systems and services used across the organization.
Who Are We Looking For?
We are looking for someone who thrives in complex environments and enjoys working on critical infrastructure. You are detail-oriented, a strong communicator, and a natural problem solver with a passion for automation.
The ideal candidate will have:
- Extensive experience in infrastructure engineering, with a focus on compute and storage platforms in large-scale or high-performance environments.
- A solid track record of leading and delivering successful technical infrastructure projects.
- Strong experience with Python programming, particularly for automation, scripting, and systems integration.
- Deep familiarity with CI/CD practices, pipelines, and tools (e.g., Jenkins, GitLab CI, ArgoCD).
- Expertise in configuration management and infrastructure-as-code tools such as Ansible, Terraform, and Puppet.
- Proven experience in monitoring and observability using tools such as Prometheus, Grafana, ELK stack, or similar.
- Solid knowledge of Linux system administration and networking fundamentals.
- Hands-on experience with containerization and orchestration platforms (Docker and Kubernetes).
- Familiarity with public cloud services (AWS, Azure, Google Cloud Platform) and hybrid infrastructure models.
- Exposure to HPC (High Performance Computing) environments and/or large-scale storage infrastructure is highly desirable.
- A proactive and collaborative mindset, with a focus on continuous improvement and innovation.