Monday, May 8, 2017

Notes on Apache Kafka


http://davewentzel.com/content/kafka-notes/

Clustering
Based on ZooKeeper
https://www.quora.com/What-is-the-actual-role-of-ZooKeeper-in-Kafka
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper

Log compaction
Consumer receives at least the last message for each key.
http://www.shayne.me/blog/2015/2015-06-25-everything-about-kafka-part-2/
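
A minimal sketch of the idea, in plain Python rather than against a real broker. It models a fully compacted log, where only the latest record per key survives. Real compaction runs in the background and leaves the most recent part of the log untouched, which is why a consumer sees "at least" the last message per key. The `compact` function and the sample keys are hypothetical, for illustration only.

```python
def compact(log):
    """Keep only the latest record for each key, preserving offset order.

    `log` is a list of (key, value) pairs; the list index stands in for
    the offset. This is a simplified model, not Kafka's implementation.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        # A later record with the same key overwrites the earlier one.
        latest[key] = (offset, value)
    # Surviving records keep their original offsets.
    return sorted((offset, key, value) for key, (offset, value) in latest.items())


log = [
    ("user-1", "a"),
    ("user-2", "b"),
    ("user-1", "c"),
    ("user-3", "d"),
    ("user-2", "e"),
]
compacted = compact(log)
for offset, key, value in compacted:
    print(offset, key, value)
```

Offsets are not renumbered: the surviving records keep their original positions, so gaps in the offset sequence are expected after compaction.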

JSON serialization
org.apache.kafka.connect.json.JsonConverter
Easier to set up
https://blog.knoldus.com/2017/01/30/kafka-sending-object-as-a-message/
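
Kafka message values are byte arrays, so JSON serialization comes down to turning an object into UTF-8 bytes and back; no schema registry is involved, which is what makes it easier to set up. The sketch below shows that round trip in plain Python. It is not the JsonConverter itself, and the `order` record and helper names are made up for illustration.

```python
import json


def serialize(record):
    """Encode a dict as UTF-8 JSON bytes, the form a message value would take."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Decode UTF-8 JSON bytes back into a dict."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical record, for illustration only.
order = {"id": 42, "item": "book", "quantity": 2}
payload = serialize(order)
restored = deserialize(payload)
print(payload)
print(restored == order)
```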

Avro serialization
io.confluent.connect.avro.AvroConverter
Recommended: schema based, fast, compact, versioning (evolution)
Schema registry needed (Confluent distro)
Schemas are automatically registered by the producers
http://cloudurable.com/blog/avro/index.html
http://cloudurable.com/blog/kafka-avro-schema-registry/index.html
https://www.slideshare.net/JeanPaulAzar1/kafka-and-avro-with-confluent-schema-registry
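
The "versioning (evolution)" point can be illustrated with one of Avro's schema resolution rules: when the reader's schema has a field that the writer's data lacks, the reader fills in the field's default. The sketch below imitates only that rule in plain Python. It does not use an Avro library or the binary encoding, and the `User` schema and `resolve` helper are hypothetical.

```python
def resolve(record, reader_schema):
    """Project a decoded record onto the reader's schema.

    Fields missing from the record take the reader's default; fields the
    reader does not know about are dropped. Simplified illustration only.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError("no value or default for field %r" % name)
    return resolved


# Version 2 of a hypothetical schema adds "email" with a default value.
reader_schema_v2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "email", "type": "string", "default": "unknown"},
    ],
}

old_record = {"name": "alice"}  # written with version 1, before "email" existed
upgraded = resolve(old_record, reader_schema_v2)
print(upgraded)
```

Giving every new field a default is what lets consumers on the new schema keep reading data produced with the old one.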

Compression
GZIP or Snappy
Less bandwidth and disk space, more CPU resources.
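
The trade-off is easy to see outside Kafka. The sketch below compresses a batch of similar JSON messages with Python's standard `gzip` module and compares sizes. GZIP is used only because it ships with the standard library; the sample sensor messages are invented, and actual ratios depend entirely on the payload.

```python
import gzip
import json

# Hypothetical batch of repetitive messages; batches like this compress well.
messages = [{"sensor": "s-%d" % (i % 5), "reading": 20 + i % 3} for i in range(1000)]
raw = "\n".join(json.dumps(m) for m in messages).encode("utf-8")

compressed = gzip.compress(raw)         # costs CPU on the producing side
restored = gzip.decompress(compressed)  # costs CPU on the consuming side

ratio = len(compressed) / len(raw)
print("raw bytes:       ", len(raw))
print("compressed bytes:", len(compressed))
print("ratio:            %.3f" % ratio)
```

Compression pays off most on batches of similar messages; a single small message has little redundancy to exploit.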

Topics and Partitions
Multiple topics per producer: https://stackoverflow.com/questions/21376715/how-many-producers-to-create-in-kafka

Distributions
Producer
http://cloudurable.com/blog/kafka-tutorial-kafka-producer-advanced-java-examples/index.html
