What are the responsibilities and job description for the Kafka SRE position at SAGE IT?
Position: Kafka SRE
Location: Austin, TX (Onsite)
Long Term Contract
Job Description:
- Carry out SRE duties for the Kafka streaming platform.
- Have a thorough understanding of Kafka architecture, including the concepts of producers, consumers, topics, and partitions.
- Monitor the platforms and follow runbooks/SOPs to manage platform and application problems.
- Familiarize yourself with the cluster maintenance processes and implement changes as per the documented installation and validation plans.
- Demonstrate strong troubleshooting and debugging skills: pinpoint and fix issues, and advise on how to prevent such problems in the future.
- Conduct thorough root cause analysis of major production incidents, document the findings for future reference, and put in place proactive measures to enhance system reliability.
- Experience with cloud infrastructure in a production environment will be an added advantage for this role.
- Automate routine tasks using scripts or automation tools to lessen manual work, decrease the chance of human errors, and boost system reliability.
- The candidate should work in a hybrid model from the first day of joining.
- The candidate should work 3 days (Monday, Wednesday, Thursday) from the office.
- Candidates need to work as per the roster and might need to work one weekend a month, with a comp-off in the following week.
Technical Skills required:
- At least 2-3 years of experience for a junior-level role, and 5 years for a mid-level/senior-level role, working as a Site Reliability Engineer for the Kafka platform.
- Deep knowledge of core Kafka components such as producers, consumers, topics, and partitions.
- Troubleshooting both Kafka platform service and application problems, and identifying the root cause.
- Hands-on experience with cloud technology will be an added advantage.
- Writing Ansible playbooks and automating manual tasks using Ansible, shell scripting, and Python.
- Should be familiar with Unix/Linux system internals, networking, and distributed systems.