In this article, we create a Kafka cluster on Google Cloud Platform (GCP). There are several ways to do this, such as uploading the Kafka directory to GCP, running multiple ZooKeeper instances, or creating multiple copies of the server.properties file. Here we take a simpler route: deploying the Kafka Cluster (with replication) image from the Marketplace. Let's start.
What is GCP?
GCP stands for Google Cloud Platform, a suite of public cloud services offered by Google. It provides IaaS, PaaS, and serverless computing environments.
GCP is commonly chosen for the following reasons:
- Run your apps wherever you need them
- Make smarter decisions with the leading data platform
- Run on the cleanest cloud in the industry
- Operate confidently with advanced security tools
- Transform how you connect and collaborate
- Get customized solutions for your specific industry
- Save money, increase efficiency, and optimize spend
For more detail, see the official GCP documentation.
What is Apache Kafka?
Kafka is a distributed streaming platform used for creating and processing real-time data streams. It follows the publish-subscribe (pub-sub) messaging model.
Apache Kafka is widely used because it is a fast, scalable, and fault-tolerant messaging system. Its key terms are:
- Producer: Application that sends message records (data) to the Kafka server.
- Consumer: Application that receives message records (data) from the Kafka server.
- Broker: Kafka server that acts as an agent/broker to exchange messages.
- Cluster: Group of computers, each running one instance of a Kafka broker.
- Topic: Arbitrary name given to a data stream.
- ZooKeeper: Coordination service that stores shared information about the cluster, such as broker and topic metadata.
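The terms above can be made concrete with a small in-memory model. This is only a sketch for building intuition: the classes ToyBroker, ToyProducer, and ToyConsumer are hypothetical names invented here, not part of any Kafka client library, and the model ignores partitions, replication, and networking.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a Kafka broker (hypothetical, illustration only)."""

    def __init__(self):
        # Each topic is modelled as an append-only list of byte records.
        self.topics = defaultdict(list)

    def append(self, topic, record):
        """Store a record and return its offset within the topic."""
        self.topics[topic].append(record)
        return len(self.topics[topic]) - 1

    def read(self, topic, offset):
        """Return every record at or after the given offset."""
        return self.topics[topic][offset:]


class ToyProducer:
    """Sends records to a topic on the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, record):
        return self.broker.append(topic, record)


class ToyConsumer:
    """Reads records from one topic and remembers how far it has read."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0  # next offset to read

    def poll(self):
        records = self.broker.read(self.topic, self.offset)
        self.offset += len(records)
        return records
```

In this model the producer and consumer never talk to each other directly; both only know the broker and the topic name, which is the essence of the pub-sub arrangement described above.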
For more detail, see the Kafka documentation.
Installing Kafka in GCP:
First, create a GCP account using a Gmail ID.
Go to the Navigation Menu and choose Marketplace
Select Kafka Cluster (with replication) VM Image.
Click the Launch button.
Navigation Menu → Marketplace → Kafka Cluster (with replication) → Launch
Now fill in the labeled fields according to your requirements and budget, such as Deployment name, Zone, Instance Count, and Machine type.
After this, click the DEPLOY button.
In the Cloud Console, go to the VM instances page (or Home → Dashboard → Resources).
Click SSH in the row of the instance that you want to connect to.
We can start and stop the VM as required. Note that this affects billing (after the free trial).
All services are preconfigured in the Kafka VM and are already up and running. For example: Kafka broker port 9092 (default), ZooKeeper port 2181 (default), and Kafka broker address set to the IP of the VM.
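One way to confirm that the preconfigured services are listening is a plain TCP connection check. The sketch below uses only the Python standard library; the helper name is_port_open is made up for this example, and the check only shows that something accepts connections on the port, not that it is actually Kafka or ZooKeeper.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

From an SSH session on the VM, calling is_port_open("localhost", 9092) and is_port_open("localhost", 2181) would check the default broker and ZooKeeper ports mentioned above.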
The Kafka configuration files are stored in the /opt/kafka/config directory.
Starting ZooKeeper and the Kafka Server
Because the Kafka VM comes preconfigured, there is no need to start them manually.
1. Creating a Topic
A Kafka topic can be created in the following ways:
Using default Properties
kafka-topics.sh --create --topic Testing --bootstrap-server localhost:9092
(Topic “Testing”, PartitionCount: 1, ReplicationFactor: 1, Partition: 0, Leader: 1, Replicas: 1, Isr: 1)
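Using custom properties (if the Testing topic already exists, delete it first or choose a different name)
kafka-topics.sh --create --topic Testing --partitions 3 --replication-factor 2 --bootstrap-server localhost:9092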
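To see what 3 partitions with a replication factor of 2 means in practice, the sketch below spreads replicas over brokers in a simple round-robin fashion. This is a simplified illustration, not Kafka's actual placement algorithm; the function name assign_replicas and the broker IDs 1, 2, 3 are assumptions made for the example. It does reflect one real constraint: the replication factor cannot exceed the number of available brokers.

```python
def assign_replicas(broker_ids, partitions, replication_factor):
    """Map each partition to a list of broker IDs, first entry being the leader.

    Simplified round-robin placement for illustration only.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(partitions):
        assignment[partition] = [
            broker_ids[(partition + replica) % len(broker_ids)]
            for replica in range(replication_factor)
        ]
    return assignment


# Three brokers, 3 partitions, replication factor 2:
print(assign_replicas([1, 2, 3], 3, 2))
# → {0: [1, 2], 1: [2, 3], 2: [3, 1]}
```

Every partition ends up on two different brokers, so losing any single broker still leaves one copy of each partition.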
Finding existing Topics [Optional]
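The following command lists the existing topics. (It uses --bootstrap-server for consistency with the other commands; the older --zookeeper localhost:2181 option was removed from kafka-topics.sh in Kafka 3.0.)
kafka-topics.sh --list --bootstrap-server localhost:9092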
2. Producing Messages to a Topic.
A message can be in any format, but Kafka always treats it as an array of bytes.
Producing using the console
kafka-console-producer.sh --topic Testing --bootstrap-server localhost:9092
Producing from a file
kafka-console-producer.sh --topic Testing --bootstrap-server localhost:9092 < "FILE_PATH"
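Since Kafka only sees bytes, it is up to the producer and consumer to agree on how structured data is encoded. The sketch below shows one common choice, JSON encoded as UTF-8; the function names serialize and deserialize and the sample record are assumptions for illustration, not part of Kafka.

```python
import json


def serialize(record):
    """Turn a JSON-compatible Python object into bytes."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Turn bytes produced by serialize() back into a Python object."""
    return json.loads(payload.decode("utf-8"))


payload = serialize({"id": 1, "text": "hello"})
print(payload)               # → b'{"id": 1, "text": "hello"}'
print(deserialize(payload))  # → {'id': 1, 'text': 'hello'}
```

Any other format works equally well as long as both sides use the same encoding.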
3. Consuming Messages from a Topic.
Consume using the console
kafka-console-consumer.sh --topic Testing --from-beginning --bootstrap-server localhost:9092
Consuming to a file
kafka-console-consumer.sh --topic Testing --from-beginning --bootstrap-server localhost:9092 > "FILE_PATH"
Throughout this process, the consumer keeps receiving the messages produced by the producer, even if one of the instances fails. Because the cluster runs multiple instances with replication, if any one of them fails, the remaining instances automatically continue to serve the messages being produced.
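The failover behaviour described above can be sketched as picking a new leader from the replicas that are still alive. This is a deliberately simplified model (Kafka chooses leaders from the in-sync replica set, with more bookkeeping than shown here); the function name elect_leader and the broker IDs are hypothetical.

```python
def elect_leader(replicas, live_brokers):
    """Return the first replica whose broker is still alive, or None if none is."""
    for broker_id in replicas:
        if broker_id in live_brokers:
            return broker_id
    return None


# A partition replicated on brokers 1 and 2 (replication factor 2).
print(elect_leader([1, 2], {1, 2, 3}))  # → 1
# Broker 1 fails: broker 2 takes over and the partition stays available.
print(elect_leader([1, 2], {2, 3}))     # → 2
# With replication factor 1, the same failure leaves no leader.
print(elect_leader([1], {2, 3}))        # → None
```

The last case shows why the topic needs a replication factor greater than 1 for the cluster to survive the loss of an instance.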
Note: If you are not able to execute these commands, either go to the root directory or type cd ../.. in your terminal.
If you want to delete the existing topic, execute:
kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic Testing
In short, we now have a basic idea of GCP, Apache Kafka, and how Kafka works on GCP.
For a better understanding, try it hands-on.