In this blog, we will learn how to solve the producer-consumer problem using Kafka and Docker. It’s going to be interesting, so stay tuned.
Kafka is a high-performance, real-time, publish-subscribe messaging system. It is open source and is developed as an Apache Software Foundation project.
Some characteristics of Kafka:
- It is a distributed, partitioned messaging system.
- It is highly scalable.
- It can process millions of messages per second and deliver them to several receivers.
Standard terms used in Kafka:
- Messages represent data, such as lines in a log file or system error messages.
- A topic is a category of messages. Producers organize messages into topics, and consumers read messages from them.
- A producer creates messages; a consumer reads them.
- Partitions split a topic so that its messages can be distributed across multiple servers.
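To make these terms concrete, here is a minimal in-memory sketch in Python of how keyed messages might be spread across the partitions of a topic. It is only an illustration: `NUM_PARTITIONS`, `choose_partition`, and `produce` are hypothetical names, and CRC32 is used as a stand-in hash (Kafka's own clients use a different hash function).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for the topic


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# topic name -> partition index -> ordered list of messages
topics = defaultdict(lambda: defaultdict(list))


def produce(topic: str, key: str, value: str) -> int:
    """Append a message to the partition chosen for its key."""
    partition = choose_partition(key)
    topics[topic][partition].append(value)
    return partition


# Messages with the same key always land in the same partition,
# so their relative order is preserved.
p1 = produce("test", "user-1", "login")
p2 = produce("test", "user-1", "logout")
produce("test", "user-2", "login")
```

Because the partition is derived only from the key, adding more partitions (and more servers to host them) spreads the load without changing how producers are written.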
Let’s talk about the producer-consumer problem
The producer-consumer problem is a classical synchronization problem in which a producer writes messages to a shared buffer and a consumer reads them from it.
The difficulty is that the producer keeps producing messages without knowing whether the consumer can keep up: if the producer is faster, the buffer fills up, and if the consumer is faster, it finds the buffer empty. To solve this problem we use Kafka.
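For reference, the classical version of the problem can be sketched with Python's standard library alone. This is a minimal illustration, not Kafka code: the buffer size, the message count, and the `SENTINEL` end marker are all arbitrary choices made for the example.

```python
import queue
import threading

buffer = queue.Queue(maxsize=5)  # shared bounded buffer
SENTINEL = None                  # marks the end of the stream
consumed = []


def producer(count: int) -> None:
    for i in range(count):
        buffer.put(f"message-{i}")  # blocks while the buffer is full
    buffer.put(SENTINEL)


def consumer() -> None:
    while True:
        item = buffer.get()         # blocks while the buffer is empty
        if item is SENTINEL:
            break
        consumed.append(item)


threads = [
    threading.Thread(target=producer, args=(20,)),
    threading.Thread(target=consumer),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Here the producer is forced to wait whenever the buffer is full. Kafka takes a different approach: it stores messages on the brokers, so producers and consumers do not have to run at the same pace.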
How is Kafka used to solve the problem?
Explanation of the above diagram
- The Kafka cluster consists of brokers, which take messages from the producer and append them to partitions.
- Brokers serve messages to consumers from the partitions and handle all client requests (from both producers and consumers), while the data stays within the cluster.
- Each partition acts as a message queue in the cluster.
- Additionally, ZooKeeper manages the brokers.
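The points above can be sketched as a tiny in-memory model of one partition. This is a simplification under stated assumptions: `PartitionLog`, `append`, and `poll` are hypothetical names, and a real broker also handles persistence to disk, replication, and retention.

```python
class PartitionLog:
    """Append-only message log; each consumer tracks its own read offset."""

    def __init__(self) -> None:
        self._messages = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, message: str) -> int:
        """Store a message and return its offset. The producer never waits."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer: str, max_messages: int = 10) -> list:
        """Return the next unread messages for this consumer."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


log = PartitionLog()
for text in ["a", "b", "c"]:
    log.append(text)

first = log.poll("consumer-1", max_messages=2)  # reads from offset 0
log.append("d")                                 # producer keeps writing
second = log.poll("consumer-1")                 # resumes after the last read
late = log.poll("consumer-2")                   # a new consumer starts at offset 0
```

Because messages stay in the log after they are read, a slow or late consumer does not block the producer; it simply continues from its own offset.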
Use of Docker
Docker runs applications in containers, so we will create a separate container for each of these:
- Kafka Server
- Zookeeper Server
- Producer Console
- Consumer Console
First, we have to write a Dockerfile for each container:
1. Kafka Server
FROM openjdk:8-jre-slim
WORKDIR /kafka-server
ADD kafka/ /kafka-server/
EXPOSE 9092
CMD [ "./bin/kafka-server-start.sh","config/server.properties" ]
- FROM: sets the base image for the server.
- WORKDIR: sets the working directory for the instructions that follow.
- ADD: copies the kafka/ folder from the build context into the image.
- EXPOSE: documents the port the server listens on.
- CMD: the command that starts the server.
2. Zookeeper Server
FROM openjdk:8-jre-slim
WORKDIR /kafka-server
ADD kafka/ /kafka-server/
EXPOSE 2181
CMD ./bin/zookeeper-server-start.sh config/zookeeper.properties
3. Producer Console
FROM openjdk:8-jre-slim
WORKDIR /kafka-server
ADD kafka/ /kafka-server/
ENTRYPOINT ["./bin/kafka-console-producer.sh","--bootstrap-server","server:9092","--topic"]
CMD [ "test" ]
4. Consumer Console
FROM openjdk:8-jre-slim
WORKDIR /kafka-server
ADD kafka/ /kafka-server/
ENTRYPOINT [ "./bin/kafka-console-consumer.sh","--bootstrap-server","server:9092","--from-beginning","--topic" ]
CMD [ "test" ]
To start the services in these containers, we write a docker-compose file:
version: '3.6'
services:
  zookeeper:
    restart: always
    ports:
      - 2181:2181
    container_name: zookeeper
    build:
      context: .
      dockerfile: zookeeper/Dockerfile
    networks:
      - kafka
    volumes:
      - ~/Desktop/kafka-zoo:/tmp/zookeeper
  server:
    ports:
      - 9092:9092
    container_name: server
    build:
      context: .
      dockerfile: server/Dockerfile
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 0
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_CREATE_TOPICS: 'test'
    networks:
      - kafka
    volumes:
      - ~/Desktop/kafka-data:/tmp/kafka-logs
  producer:
    container_name: producer
    build:
      context: .
      dockerfile: producer/Dockerfile
    depends_on:
      - zookeeper
      - server
    networks:
      - kafka
  consumer:
    container_name: consumer
    build:
      context: .
      dockerfile: consumer/Dockerfile
    depends_on:
      - zookeeper
      - server
    networks:
      - kafka
networks:
  kafka:
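- The above file defines four services: server, zookeeper, producer, and consumer.
- Ports map each service's listening port to the host (2181 for zookeeper, 9092 for the server).
- Each build section gives the location of the service's Dockerfile (for example server/Dockerfile).
- Networks let the containers communicate with each other by service name.
- Volumes store data on the host so that it survives container restarts.
- Environment variables carry broker settings such as the ZooKeeper address. The stock kafka-server-start.sh script does not read these KAFKA_* variables, so the same values (for example zookeeper.connect=zookeeper:2181) also need to be set in config/server.properties.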
A docker-compose file is used to run multiple containers together.
To build the images and start the services:
docker-compose up --build
Open another terminal for the producer:
docker-compose run producer
- Always follow best practices when writing Dockerfiles.
- Moreover, instead of running a single container each time, we can create a docker-compose file to run multiple containers.