Hi readers! In this blog, we will set up a Kafka cluster as a StatefulSet on Kubernetes and also pick up some basic knowledge of StatefulSets.
A StatefulSet is the workload API object used to manage stateful applications. It manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of those Pods.
Apache Kafka is an open-source stream-processing software platform developed by LinkedIn and donated to the Apache Software Foundation, written in Scala and Java.
ZooKeeper keeps track of the status of the Kafka cluster nodes, and it also keeps track of Kafka topics, partitions, etc.
A Kafka producer is an application that can act as a source of data in a Kafka cluster.
The primary role of a Kafka consumer is to use a Kafka connection and consumer properties to read records from the appropriate Kafka broker.
Set up a Kubernetes cluster
We will be setting up our Kubernetes cluster on the Google Cloud Platform. Here are some basic steps to get Kafka running on a Google Kubernetes Engine cluster.
Select the project in which you want to set up the cluster. Hover over Kubernetes Engine, then select the Clusters option.
Click Create Cluster and configure the cluster according to your needs. I have set up a basic cluster for a sample Kafka application.
Give your cluster a name and change the settings if you have other specific needs. I have given 2 CPU cores to each node for proper resource availability.
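The same cluster can also be created from the command line. Here is a sketch of the equivalent gcloud command; the cluster name, zone, and machine type are assumptions chosen to mirror the setup described above, so adjust them for your own project:

```shell
# Create a 3-node GKE cluster named "k8" in us-central1-a.
# n1-standard-2 provides 2 vCPUs per node, matching the setup above.
gcloud container clusters create k8 \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type n1-standard-2
```

This provisions the node pool in one step instead of clicking through the console.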
Install gcloud on your system, then run the following command to connect your system to the cluster.
$ gcloud container clusters get-credentials k8 --zone us-central1-a --project knoldus-264306
Fetching cluster endpoint and auth data.
kubeconfig entry generated for k8.
After running the command, you will see output like the above. Now verify the cluster by listing its nodes.
$ kubectl get nodes
NAME                                STATUS   ROLES    AGE   VERSION
gke-k8-default-pool-de2de537-4n5g   Ready    <none>   27h   v1.13.11-gke.14
gke-k8-default-pool-de2de537-7dj1   Ready    <none>   27h   v1.13.11-gke.14
gke-k8-pool-1-36b3b91a-j56f         Ready    <none>   27h   v1.13.11-gke.14
As you can see, I have set up a cluster of 3 nodes. Next, deploy ZooKeeper, which Kafka depends on.
$ kubectl apply -f zookeeper.yaml
service/zk-cs created
poddisruptionbudget.policy/zk-pdb created
statefulset.apps/zk created
Running zookeeper.yaml creates the ZooKeeper Service, PodDisruptionBudget, and StatefulSet.
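The zookeeper.yaml manifest itself is not shown in this post. A minimal sketch of what it might contain, matching the three resources created above, could look like the following; the image, ports, and replica count are assumptions (loosely based on the well-known Kubernetes ZooKeeper tutorial), so adapt them to your environment:

```yaml
# Sketch of zookeeper.yaml -- illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: zk-cs              # client Service, as in the apply output above
spec:
  selector:
    app: zk
  ports:
  - port: 2181             # standard ZooKeeper client port
    name: client
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1        # keep quorum during voluntary disruptions
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-hs       # headless Service name (an assumption here)
  replicas: 3
  selector:
    matchLabels:
      app: zk
  template:
    metadata:
      labels:
        app: zk
    spec:
      containers:
      - name: zookeeper
        image: k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10
        ports:
        - containerPort: 2181
          name: client
```

The PodDisruptionBudget with `maxUnavailable: 1` ensures that voluntary disruptions (such as node drains) never take down more than one ZooKeeper replica at a time, preserving quorum.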
$ kubectl apply -f kafka.yaml
poddisruptionbudget.policy/kafka-pdb created
statefulset.apps/kafka created
Running kafka.yaml creates the Kafka PodDisruptionBudget and StatefulSet.
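As with zookeeper.yaml, the kafka.yaml manifest is not shown in this post. A minimal sketch of what it might contain is below; the image matches the one used in the producer/consumer commands later, but everything else is an assumption. Note that the producer and consumer address brokers through a headless Service named kafka-hs — it does not appear in the apply output above, so it may live in a separate manifest; it is included here for completeness:

```yaml
# Sketch of kafka.yaml -- illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: kafka-hs
spec:
  clusterIP: None          # headless: gives each broker a stable DNS name,
  selector:                # e.g. kafka-0.kafka-hs.default.svc.cluster.local
    app: kafka
  ports:
  - port: 9093
    name: server
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: kafka-pdb
spec:
  selector:
    matchLabels:
      app: kafka
  maxUnavailable: 1        # at most one broker down during disruptions
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-hs
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: gcr.io/google_containers/kubernetes-kafka:1.0-10.2.1
        ports:
        - containerPort: 9093
          name: server
        # Broker startup command (broker.id, zookeeper.connect, listeners)
        # omitted for brevity; it must point zookeeper.connect at the
        # zk-cs Service created earlier.
```

Because the StatefulSet uses the headless Service, each broker pod gets a stable, predictable DNS name (kafka-0, kafka-1, kafka-2), which is exactly what the broker lists in the next commands rely on.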
$ kubectl run -ti --image=gcr.io/google_containers/kubernetes-kafka:1.0-10.2.1 kafka-produce --restart=Never --rm -- kafka-console-producer.sh --topic test --broker-list kafka-0.kafka-hs.default.svc.cluster.local:9093,kafka-1.kafka-hs.default.svc.cluster.local:9093,kafka-2.kafka-hs.default.svc.cluster.local:9093
If you don't see a command prompt, try pressing enter.
knoldus
welcome
Run the Kafka producer with the command above.
$ kubectl run -ti --image=gcr.io/google_containers/kubernetes-kafka:1.0-10.2.1 kafka-consume --restart=Never --rm -- kafka-console-consumer.sh --topic test --bootstrap-server kafka-0.kafka-hs.default.svc.cluster.local:9093
If you don't see a command prompt, try pressing enter.
knoldus
welcome
Then we create a Kafka consumer and can see that it is ready to consume data. I typed "knoldus welcome" into the producer, and it came through on the consumer.
Kafka runs as a cluster of brokers, and these brokers can be deployed across a Kubernetes system and made to land on different workers across separate fault domains. Kubernetes automatically recovers pods when nodes or containers fail, so it can do this for your brokers too.