Job Description : Kafka Architect (Platform & Developer Focus)
Location : REMOTE
Position Overview : We are seeking a highly experienced Kafka Architect with 8-10 years of expertise in Kafka administration. This is a strategic role, not for a developer or admin, but for someone with deep architectural knowledge of Kafka and a strong ability to communicate complex concepts effectively. The ideal candidate will have extensive experience in designing, documenting, and optimizing both platform and developer elements of Kafka, with a focus on messaging and streaming capabilities. The candidate should be well-versed in implementing Kafka in diverse environments, with the ability to generalize concepts across different Kafka implementations (including Confluent and others).
Key Responsibilities :
Platform Focus : Infrastructure, Deployment, Operations, and Security
- Cluster Sizing & Capacity Planning :
Define brokers per cluster, partition count, and replication factor based on data volume, retention policies, and throughput needs.
Design scaling configurations (horizontal and vertical) for Kafka clusters.
Establish optimal network configurations to ensure low-latency performance.
Define ZooKeeper configurations (if not using KRaft mode) or alternative metadata management.
- Storage Considerations :
Provide recommendations for optimal disk configurations, with a focus on NVMe SSDs.
- High Availability & Fault Tolerance :
Define optimal replication factor (RF) strategies for both production and non-production environments.
Design broker failover strategies and leader election mechanisms.
Develop multi-region and multi-AZ deployment strategies for Kafka.
- Security, Audit & Compliance :
Implement and recommend Kafka authentication strategies (SASL, Kerberos, OAuth, TLS).
Design authorization mechanisms (ACLs, RBAC) for Kafka.
Advise on key management strategies, key rotation, and secure storage.
Define auditing best practices for tracking access and changes to Kafka resources.
Ensure encryption for data both in transit (TLS) and at rest (disk encryption).
Advise on compliance frameworks (e.g., SOX, CCPA) and ensure Kafka adheres to necessary standards.
- Monitoring & Observability :
Advise on metrics collection using tools like Prometheus, Grafana, and Confluent Control Center.
Implement security monitoring tools to detect and respond to threats in real time.
Provide recommendations for monitoring disk usage and log aggregation (e.g., Elasticsearch, Kibana, Splunk).
Implement lag monitoring strategies using tools like Burrow or Kafka UI.
Developer Focus :
- Data Retention & Cleanup :
Define log segment configurations and cleanup policies (delete vs. compact).
Provide recommendations for Kafka compaction processes and scheduling.
Advise on time-based vs. size-based retention policies to optimize resource usage.
- Disaster Recovery & Backup :
Define strategies for cross-cluster replication and cluster linking.
Set recovery point objectives (RPO) and recovery time objectives (RTO).
Define automated backup verification and recovery procedures.
Develop Kafka backup strategies, including configuration and topic-level backups.
- Cost Optimization :
Recommend strategies for optimizing consumer group performance.
Define dynamic partition rebalancing and scaling strategies.
Recommend optimal data retention policies and efficient data compression formats.
Qualifications :
8-10 years of experience in Kafka architecture and administration (platform-focused).
Strong knowledge of Kafka internals, cluster design, security, and monitoring tools.
Proven ability to document and communicate complex Kafka platform architectures.
Deep understanding of Kafka's role in both messaging and streaming use cases.
Experience with different Kafka implementations (e.g., Confluent, Apache Kafka, others).
Strong knowledge of distributed systems, scaling, and capacity planning.
Familiarity with disaster recovery strategies and backup procedures.
Experience with security best practices, including authentication, encryption, and access control.
Understanding of compliance regulations and how to implement them in Kafka.
Familiarity with metrics collection, monitoring, and observability tools (e.g., Prometheus, Grafana, Burrow).
Excellent communication skills, both written and verbal, with the ability to engage with multiple stakeholders.
Preferred Skills :
Expertise in cloud-native Kafka implementations and multi-cloud architectures.
Experience with KRaft mode or alternative metadata management strategies.
Familiarity with automation and orchestration tools in Kafka environments.