What are the responsibilities and job description for the Staff Site Reliability Engineer Cloud Platform position at Zilliz?
What you will do:
- Work at the intersection of development and site reliability, creating SRE tools and systems and supporting existing infrastructure and platforms.
- Ensure the reliability, availability, and performance of Zilliz’s distributed database systems.
- Develop and implement strategies for monitoring, incident management, and disaster recovery.
- Automate system operations and maintenance tasks to improve efficiency and reduce manual intervention.
- Design and build tools to manage and monitor infrastructure, ensuring scalability and robustness.
- Collaborate with software engineers to enhance system reliability, scalability, and performance.
- Maintain and improve the CI/CD pipeline to ensure smooth and rapid deployment of changes.
- Actively contribute to the Milvus open-source community, focusing on improving reliability and operational efficiency.
What we are looking for: