Kafka usage — tips&hints

Today we will talk about Apache Kafka (opens new window). As you probably already know, Kafka is “a distributed streaming platform”, so it could be used as Queue, Message Bus, etc.

But this story isn’t about Kafka it self. It is about issues I faced recently developing one Kafka based service. Small application consuming messages from defined topic and validating them according to the schema. Nothing like rocket science. But.

  1. Upfront check version of your Kafka server — even minor version difference could break API. Like server version isn’t compatible with client Just pay attention.
  2. Check which compression mechanism is used on your Kafka server. GZip (opens new window) and Snappy (opens new window) are supported. In my case it was Snappy, and you know what? It requires glibc (opens new window). And my base docker image was Alpine. And Alpine does not have support of glibc. So, choose the right base image.
  3. If you use Kafka Streams API and want to split one stream into two based on some predicate your could pass this predicate to _branch_ method along with predicate that always returns true — some sort of _everithingElsePredicate(_Predicate { _, _ -> true }).
  4. kafka-console-consumer and kafka-console-producer — your best friends to debug your Kafka app.
  5. If you are going to scale you app to process some topic in parallel way don’t forget to increase number of partitions on that topic. But keep in mind that it is only possible to increase number of partitions but there is no way back.

That’s all I have in mind to mention.