What are the responsibilities and job description for the SRE position at Fierceli Inc, MBE, SBE?
We are seeking a skilled Site Reliability Engineer (SRE)/Data Infrastructure Engineer. The ideal candidate will have expertise in Kubernetes and big data technologies, and proficiency in Python, Go, or Java. This role focuses on ensuring the reliability, scalability, and performance of our distributed systems and big data infrastructure.
Key Responsibilities
- Design, implement, and maintain Kubernetes-based infrastructure for deploying and scaling applications
- Develop and optimize big data pipelines using Apache Spark and other related technologies
- Write efficient, production-grade code in Python, Go, or Java to automate processes and improve system reliability
- Implement observability solutions, including logs, metrics, traces, and profiles
- Collaborate with development teams to troubleshoot complex infrastructure issues and design solutions for them
- Ensure high availability, fault tolerance, and disaster recovery for critical systems
- Optimize resource utilization and performance of big data processing workflows
Qualifications
- Strong experience with Kubernetes, including cluster management and application deployment
- Proficiency in at least one of the following programming languages: Python, Go, or Java
- Hands-on experience with Apache Spark and big data processing frameworks
- Solid understanding of Linux systems and networking
- Experience with cloud platforms such as AWS, GCP, or Azure
- Familiarity with CI/CD pipelines and version control systems (e.g., Git)
- Understanding of distributed systems and microservices architecture