How to use external Kafka from Kubernetes?

Reading Time: 3 minutes

Kafka is a very widely used messaging infrastructure. If you are building software that requires using Kafka from your Kubernetes cluster, you can either use the strimzi.io Kafka operator or use your local or AWS-based Kafka cluster. While the Strimzi installation is a breeze and easy to use, one particular challenge is that it is cumbersome to access the Kafka topics from your local machine.

In this short blog, I will walk you through how to access your local Kafka from Kubernetes.

Step 1

Run your local Kafka. The easiest way is to get the latest community edition from Confluent and unzip it to a local folder. Here is the link. Then run Kafka.

confluent-5.3.1 > bin/confluent local start

If you need to change any configuration, go ahead and make changes to etc/kafka/server.properties in the same folder.
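One change worth making here: after the first connection, clients ask the broker for its address, so the broker should advertise an address that pods inside the cluster can reach. Here is a sketch of the relevant etc/kafka/server.properties settings, assuming the 10.0.2.2 host address used later in this post; adjust to your own setup:

```properties
# Accept connections from the minikube VM on all interfaces
listeners=PLAINTEXT://0.0.0.0:9092
# Advertise an address the pods inside the cluster can reach
advertised.listeners=PLAINTEXT://10.0.2.2:9092
```

Restart Kafka after changing these properties.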

Step 2

Run your kubernetes cluster.

~ > minikube start

Step 3

Minikube usually makes the host machine available at the 10.0.2.2 IP address. This works on macOS and Ubuntu; however, please check where your local host is reachable from your cluster. A quick way to verify is to ssh into minikube and check that the host responds. For example, on my Mac, here is what I do.

~ > minikube ssh
$ ping 10.0.2.2
PING 10.0.2.2 (10.0.2.2): 56 data bytes
64 bytes from 10.0.2.2: seq=0 ttl=64 time=0.387 ms
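Ping only proves the host is reachable, not that the Kafka ports are open. Here is a quick TCP check, sketched with Python's standard library; the host address and ports are the ones used in this setup:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Kafka broker, ZooKeeper, and Schema Registry ports from this setup
    for port in (9092, 2181, 8081):
        state = "open" if port_open("10.0.2.2", port) else "closed"
        print(f"10.0.2.2:{port} is {state}")
```

Run it from inside the minikube VM (or any pod with Python) to confirm the ports are reachable before wiring up the services.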

Step 4

Copy the following YAML to a file called kafka-external.yaml and apply it. In this case, I am exposing only the Kafka broker, ZooKeeper, and the Schema Registry.

apiVersion: v1
kind: Service
metadata:
    name: kafkalocal
spec:
    ports:
        - protocol: TCP
          port: 9092
          targetPort: 9092
---
apiVersion: v1
kind: Endpoints
metadata:
    name: kafkalocal
subsets:
    - addresses:
        - ip: 10.0.2.2
      ports:
        - port: 9092
---
apiVersion: v1
kind: Service
metadata:
    name: zoolocal
spec:
    ports:
        - protocol: TCP
          port: 2181
          targetPort: 2181
---
apiVersion: v1
kind: Endpoints
metadata:
    name: zoolocal
subsets:
    - addresses:
        - ip: 10.0.2.2
      ports:
        - port: 2181
---
apiVersion: v1
kind: Service
metadata:
    name: schemalocal
spec:
    ports:
        - protocol: TCP
          port: 8081
          targetPort: 8081
---
apiVersion: v1
kind: Endpoints
metadata:
    name: schemalocal
subsets:
    - addresses:
        - ip: 10.0.2.2
      ports:
        - port: 8081

~ > kubectl apply -f kafka-external.yaml

As you will notice in the YAML, for each Kafka component we create an Endpoints object (instead of a pod selector) and a Service to route the traffic to that endpoint.

In case you have multiple Kafka brokers, you could replace the Kafka service with the YAML below.

apiVersion: v1
kind: Service
metadata:
    name: kafkalocal
    labels:
        event-gateway: kafka
        type: eg
spec:
    ports:
        - protocol: TCP
          name: broker1
          port: 9092
          targetPort: 9092
        - protocol: TCP
          name: broker2
          port: 9093
          targetPort: 9093
        - protocol: TCP
          name: broker3
          port: 9094
          targetPort: 9094
---
apiVersion: v1
kind: Endpoints
metadata:
    name: kafkalocal
    labels:
        event-gateway: kafka
        type: eg
subsets:
    - addresses:
        - ip: 10.0.2.2
      ports:
        - port: 9092
          name: broker1
        - port: 9093
          name: broker2
        - port: 9094
          name: broker3
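For the multi-broker variant, each local broker must advertise the port that the Service forwards for it. Here is a sketch of the second broker's server.properties, assuming broker ids and ports matching the YAML above:

```properties
broker.id=2
listeners=PLAINTEXT://0.0.0.0:9093
advertised.listeners=PLAINTEXT://10.0.2.2:9093
```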

That's it! You can now check that the services are available and accessible from your minikube cluster.
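With the services in place, pods address Kafka by service name instead of by IP. Here is a minimal sketch of the connection settings a client inside the cluster would use (plain Python, no client library; the config key follows confluent-kafka naming and is illustrative):

```python
# Connection settings for a client running inside the cluster.
# The service names come from kafka-external.yaml and resolve via cluster DNS.
kafka_config = {
    "bootstrap.servers": "kafkalocal:9092",
}
schema_registry_url = "http://schemalocal:8081"
zookeeper_connect = "zoolocal:2181"

# With the multi-broker service, list every forwarded port:
multi_broker_bootstrap = "kafkalocal:9092,kafkalocal:9093,kafkalocal:9094"
```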

A few other useful tips

Use kafkatool to play with your Kafka cluster. It's very handy, and I recommend the commercial version if you are a Kafka lover like me: it gives you the ability to read Avro messages, which is not available in the community edition.

My second suggestion is to get the dnsutils tool for various Kubernetes debugging tasks.

~ > kubectl run dnsutils --image gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 -- sleep 1000000
# now you can do various checks, like finding the IP of a service, as follows
~ > kubectl exec -it dnsutils -- nslookup schemalocal

You might as well add an alias to make it easy to check service addresses as follows.

alias lookup="kubectl exec -it dnsutils -- nslookup "

That's all for today. Thanks for reading the blog. Comments and improvements are welcome.

Written by 

As an engineer, I help customers architect platforms using Spark, Mesos, Cassandra, and Kafka (and their commercial versions). As a partner, I guide customers in setting up the organization and processes, and in building top-notch teams that solve complex problems or deliver digital transformation. My interests and expertise are in mathematics, machine learning, microservices, linked data, distributed cloud infrastructure, and real-time enterprise data integration.