What are the responsibilities and job description for the Senior Cloud Performance Engineer - Remote position at ClickHouse?
About the Team
The Cloud Performance Engineering team is responsible for building the cloud-native ClickHouse Cloud platform that will transform the OLAP space. Our team is looking for exceptional performance engineers with a proven track record of understanding the performance limits of different distributed databases and creating tools for measuring the performance and scalability of complex systems. The ideal candidate for this position is a distributed systems performance engineer with a strong background in database benchmarking, test automation, system engineering, performance analysis, and capacity management. This role is a unique opportunity to make a significant impact on our elastic, high-performance, serverless ClickHouse Cloud with limitless scale.
What will you do?
- Benchmark system performance and carry out database performance analysis, capacity sizing, and optimization
- Troubleshoot and debug application and server errors and logs, and triage accordingly
- Recommend configuration tuning/optimizations for performance bottlenecks
- Work closely with the ClickHouse core development, cloud, and security teams to improve the performance of ClickHouse Cloud
- Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities
- Develop, deploy and manage tools to systematically run chaos experiments and measure impact
- Enjoy working on, and gaining a deep understanding of, large scale distributed systems
- Study the problems in the software resilience, operational, and delivery spaces
- Extend our entire backend to enable Chaos Engineering techniques in the system
- Observe running systems, and determine/prioritize innovative ways to disrupt them
About you:
- You have 6 years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems.
- Software development experience in Go, C/C++, Java, or similar.
- Experience with concurrency, multithreading, and the deployment of distributed system architectures
- Experience developing cloud infrastructure services, preferably with Kubernetes.
- Experience leading and shipping large-scope technical projects in collaboration with multiple experienced engineers.
- Expertise with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offerings (e.g. EC2).
- You have excellent communication skills and the ability to work well within a team and across engineering teams.
- You are a strong problem solver and have solid production debugging skills.
- You are passionate about efficiency, availability, scalability and data governance.
- You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving it forward.
- You have a high level of responsibility, ownership, and accountability.