Apache Kafka

Apache Kafka: Log Compaction

Reading Time: 3 minutes As we all know, many systems use Kafka for distributed, real-time processing of large volumes of messages. Before starting on this topic, I assume that you are familiar with basic Kafka concepts such as brokers, partitions, topics, producers, and consumers. Here we discuss log compaction. What is Log Compaction Kafka log compaction is a hybrid approach that makes Continue Reading
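To make the idea of log compaction concrete, here is a minimal sketch in Python. It is a toy model, not Kafka's implementation: the `Record` tuple, the `compact` function, and the sample keys and values are all hypothetical. It only illustrates the guarantee that a compacted log retains at least the most recent record for each key, and that surviving records keep their original offsets and order.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for one record in a partition log: (offset, key, value).
Record = Tuple[int, str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the newest record for each key, preserving offsets and order."""
    latest_offset: Dict[str, int] = {}
    for offset, key, _ in log:
        # A later record for the same key overwrites the earlier offset.
        latest_offset[key] = offset
    # A record survives only if it is the latest one seen for its key.
    return [rec for rec in log if latest_offset[rec[1]] == rec[0]]


# Sample data (assumed for illustration only).
log = [
    (0, "user-1", "alice@old.example"),
    (1, "user-2", "bob@old.example"),
    (2, "user-1", "alice@new.example"),
    (3, "user-3", "carol@example"),
    (4, "user-2", "bob@new.example"),
]
compacted = compact(log)
for record in compacted:
    print(record)
```

In a real cluster, compaction runs in the background on older log segments, so a consumer may still see several records for one key until the cleaner has caught up.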

Introduction To Apache Kafka

Reading Time: 6 minutes Introduction Apache Kafka is a framework implementation of a software bus using stream processing. It is an open-source platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka can connect to external systems (for data import/export) via Kafka Connect and provides Kafka Streams, a Java stream processing library. Apache Continue Reading

Kafka Kerberos Authentication

Reading Time: 2 minutes In this article, we will start looking into Kerberos authentication and focus on the client-side configuration required to authenticate with clusters configured to use Kerberos. Kafka supports four different communication protocols between consumers, producers, and brokers. Each protocol addresses different security aspects; PLAINTEXT is the old, insecure communication protocol. PLAINTEXT (non-authenticated, non-encrypted); SSL (SSL authentication, encrypted); PLAINTEXT+SASL (authentication, non-encrypted); SSL+SASL (encrypted authentication, encrypted Continue Reading

Comparing Data Streaming Frameworks | Scala

Reading Time: 4 minutes In this era of technology, the amount of data is growing exponentially and every bit of data holds value. According to some reports, the number of bytes generated and stored in the world so far has already exceeded the number of stars in the sky. Because every bit is useful, it is very important to store the data without losing any of it. When Continue Reading

Apache Kafka Use Cases and Applications

Reading Time: 3 minutes The amount of data generated has multiplied manyfold due to the dominance of the digital age. Thus, enterprises that wish to stay in business and remain relevant, today and in the future, must understand and learn how to manage large volumes of data through a strong, scalable, and flexible platform. Apache Kafka is one of the means to achieve this. Apache Kafka, Continue Reading

Testing Spring Embedded Kafka consumer and producer

Reading Time: 2 minutes In this blog, I’m talking about testing Kafka without a physical installation of Kafka services or a Docker container. For testing, I’m going to use another Spring library called spring-kafka-test. It provides a lot of functionality to ease the testing process and helps verify that a Kafka consumer or producer works as expected. Maven Test Dependencies application.yml props file These are the minimum configuration for Continue Reading

Set-up Kafka Cluster On GCP

Reading Time: 4 minutes In this article, we are going to create a Kafka cluster on GCP. We can do it in various ways, such as uploading the Kafka directory to GCP, creating multiple ZooKeeper instances, or creating multiple copies of the server.properties file. In this article, however, we take a simpler route: creating a Kafka cluster (with replication). Let’s Start… What is GCP? GCP Continue Reading

Fault tolerance and Resiliency in Apache Kafka.

Reading Time: 5 minutes Kafka is known for its performance, resiliency, and fault tolerance. In this article, we’ll see which configuration changes help achieve the fault tolerance and resilience that an architecture needs. Before starting, we need basic knowledge of Kafka, which we can get from the documentation. Apache Kafka is a distributed system, and the term fault tolerance is very Continue Reading

How to delete record from Kafka Topic: Tombstone

Reading Time: 4 minutes Hello Reader, here we will see how we can delete records from a Kafka topic (a compacted topic as well as a non-compacted topic). Problem: GDPR: The General Data Protection Regulation is a regulation that requires businesses to protect the personal data and privacy of EU citizens for transactions that occur within EU member states. CCPA: The California Consumer Privacy Act is a state-wide data privacy law that regulates Continue Reading
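For the compacted-topic case, deletion relies on a tombstone: a record whose value is null for the key to be removed. The Python sketch below is only an illustration of that semantics, not Kafka client code. The `materialize` function and the sample keys are hypothetical, and `None` stands in for a null record value.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical (key, value) pairs read from a compacted topic in offset order.
# A value of None models a tombstone (a record with a null value).
KeyValue = Tuple[str, Optional[str]]


def materialize(records: List[KeyValue]) -> Dict[str, str]:
    """Replay records into a key -> latest value view, honoring tombstones."""
    state: Dict[str, str] = {}
    for key, value in records:
        if value is None:
            # Tombstone: forget the key entirely.
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Sample data (assumed for illustration only).
records = [
    ("user-1", "alice"),
    ("user-2", "bob"),
    ("user-1", None),  # request to delete everything stored under user-1
]
state = materialize(records)
print(state)
```

On a real broker, the tombstone itself is kept for a configurable period so that consumers can observe the delete, and is then removed by the log cleaner.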

A Quick Demo: Kafka to Flink to Cassandra

Reading Time: 3 minutes Hi Folks!! In this blog, we are going to learn how we can integrate Flink with Kafka and Cassandra to build a simple streaming data pipeline. Apache Flink is a framework and distributed processing engine used for stateful computations over unbounded and bounded data streams. Kafka is a scalable, high-performance, low-latency platform that allows reading and writing streams of data like a messaging system. Cassandra: A distributed and wide-column Continue Reading

DevOps Shorts: How to increase the replication factor for a Kafka topic

Reading Time: 2 minutes Have you ever faced a situation where you had to increase the replication factor for a topic? It turns out to be really easy. In this super short blog, let’s do just that. We’ll start by creating a topic, one, with a replication factor of just 1, then create the increase.json file, and finally trigger the plan. Step 1: Create Continue Reading
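As a rough sketch of what an increase.json file could contain, the Python snippet below builds a partition reassignment plan in the JSON layout accepted by Kafka's partition reassignment tool. The `build_reassignment` helper is hypothetical, and the single partition 0 and the broker IDs 0, 1, and 2 are assumptions; the actual values depend on the cluster and on how the topic was created.

```python
import json
from typing import Dict, List


def build_reassignment(topic: str, assignments: Dict[int, List[int]]) -> str:
    """Return a reassignment plan (JSON text) listing the replicas per partition."""
    plan = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": replicas}
            for partition, replicas in sorted(assignments.items())
        ],
    }
    return json.dumps(plan, indent=2)


# Assumed layout: topic "one" has a single partition (0), and brokers 0, 1, 2 exist.
# Listing three brokers for the partition raises its replication factor to 3.
plan_json = build_reassignment("one", {0: [0, 1, 2]})
print(plan_json)
```

The printed text can be saved as increase.json and then passed to the reassignment tool to trigger the plan.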

Creating Data Pipeline with Spark streaming, Kafka and Cassandra

Reading Time: 3 minutes Hi Folks!! In this blog, we are going to learn how we can integrate Spark Structured Streaming with Kafka and Cassandra to build a simple data pipeline. Spark Structured Streaming is a component of the Apache Spark framework that enables scalable, high-throughput, fault-tolerant processing of data streams. Apache Kafka is a scalable, high-performance, low-latency platform that allows reading and writing streams of data Continue Reading