Category Archives: Scalable IoT

Setting up Apache Kafka for use with an Apache ZooKeeper quorum on Ubuntu

There are lots of guides out there describing how to set up simple Apache Kafka configurations, but they generally stop short of describing how to use Kafka with a three-node Apache ZooKeeper quorum so that ZooKeeper isn’t a single point of failure. The machines that I am working with run these components:

  • Server1 (static 192.168.10.11) – ZooKeeper
  • Server2 (static 192.168.10.12) – ZooKeeper
  • Server3 (static 192.168.10.13) – ZooKeeper, Kafka broker
  • Desktop (static 192.168.10.14) – Kafka producer and Kafka consumer

This setup doesn’t use multiple Kafka brokers but that’s a relatively simple extension.
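As a rough sketch of how the addresses above end up in the configuration files, the Python below builds the `server.N` lines for each node's `zoo.cfg` and the `zookeeper.connect` value for the broker's `server.properties`. The ports (2181 for clients, 2888 and 3888 for quorum traffic) are the usual ZooKeeper defaults and are an assumption here, as are the helper function names; adjust them if your installation differs.

```python
# Hosts come from the machine list above; the ports are the stock
# ZooKeeper defaults and are assumed to be unchanged.
ZK_HOSTS = ["192.168.10.11", "192.168.10.12", "192.168.10.13"]
CLIENT_PORT = 2181    # port that Kafka and other clients connect to
PEER_PORT = 2888      # follower-to-leader traffic
ELECTION_PORT = 3888  # leader election traffic


def zoo_cfg_server_lines(hosts):
    """Return the server.N lines that belong in every node's zoo.cfg."""
    return [
        f"server.{n}={host}:{PEER_PORT}:{ELECTION_PORT}"
        for n, host in enumerate(hosts, start=1)
    ]


def kafka_zookeeper_connect(hosts):
    """Return the value for zookeeper.connect in Kafka's server.properties."""
    return ",".join(f"{host}:{CLIENT_PORT}" for host in hosts)


if __name__ == "__main__":
    print("\n".join(zoo_cfg_server_lines(ZK_HOSTS)))
    print("zookeeper.connect=" + kafka_zookeeper_connect(ZK_HOSTS))
```

Each ZooKeeper node also needs a `myid` file in its data directory containing its own number N, so that it can tell which `server.N` line refers to itself.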


Using Apache NiFi with Apache Kafka

Apache NiFi is a great way of capturing and processing streams, while Apache Kafka is a great way of storing stream data. There’s an excellent description here of how to configure NiFi to pass data to Kafka using MovieLens data as its source. Since I am not running HDFS, I modified the example to put just the movies and tags data into Kafka and save the ratings data to a local file. Trying to stash the ratings data into Kafka doesn’t work: there is just too much of it arriving too fast, and buffers overflow. It’s pretty easy to use the Kafka console consumer to check that the data is being stored for the movies and tags topics, and the local ratings.dat file will be generated as well.
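The split described above can be pictured as a small routing table. This is not the NiFi flow itself, just a Python illustration of which data set goes where; the `ROUTES` table, the `route` helper and the input file names are hypothetical, and the topic names are assumed to match the movies and tags topics mentioned above.

```python
# Hypothetical routing table: input file name -> (sink type, target).
# Input file names and topic names are assumptions for illustration.
ROUTES = {
    "movies.dat": ("kafka", "movies"),      # small enough to publish to Kafka
    "tags.dat": ("kafka", "tags"),          # small enough to publish to Kafka
    "ratings.dat": ("file", "ratings.dat"), # too much, too fast: keep on local disk
}


def route(filename):
    """Return (sink, target) for a data file, or raise if it has no route."""
    try:
        return ROUTES[filename]
    except KeyError:
        raise ValueError(f"no route for {filename!r}") from None


if __name__ == "__main__":
    for name in ROUTES:
        sink, target = route(name)
        print(f"{name} -> {sink}:{target}")
```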