kafka

Interactive Queries in Apache Kafka

Apache Kafka v0.10 introduced the Kafka Streams API, a client library for building applications and microservices whose input and output data are stored in Kafka clusters. Kafka Streams provides state stores, which stream processing applications can use to store and query data. Every task in Kafka Streams uses one or more state Continue Reading

Kafka And Spark Streams: The happily ever after !!

Hi everyone. Today we are going to look at using Spark Streaming to transform and transport data between Kafka topics. The demand for stream processing is increasing every day, because processing big volumes of data is often not enough. We need real-time processing of data, especially when we need to handle continuously increasing volumes of data and also need Continue Reading

Join Semantics in Kafka Streams

Introduction to core concepts: Apache Kafka is a distributed streaming platform that lets you publish and subscribe to streams of records and process those records as they occur. Kafka Streams is a client library used for building applications and microservices, where the input and output data are stored in Kafka clusters. Interface KStream<K, V> is an abstraction of Continue Reading

Basic Example for Spark Structured Streaming & Kafka Integration

The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 Direct Stream approach. It provides simple parallelism, 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the new Kafka consumer API instead of the simple API, there are notable differences in usage. This version of the integration is marked as Continue Reading

RealTimeProcessing of Data using kafka and Spark

Before starting, you should know about Kafka, Spark, and what real-time processing of data is, so let’s begin with a brief introduction. Real-Time Processing: processing data as it arrives, instead of storing the data and then processing it, or processing data that is stored somewhere else. Kafka: Kafka is built for high-throughput movement of data from one end to another. Continue Reading

Unit Testing Of Kafka

Apache Kafka is a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and enables you to pass messages from one end-point to another. Generally, data is published to a topic via the Producer API, and the Consumer API consumes data from subscribed topics. In this blog, we will see how to do unit testing of Kafka. Unit testing your Kafka Continue Reading

Setting It Up: KAFKA Multi-Broker System

In this blog, I am going to cover the leftovers from my last blog, “A Beginners Approach To KAFKA”, in which I tried to explain the details of Kafka, such as its terminology and advantages, and demonstrated how to set up the Kafka environment, get a Single Broker Cluster up, and test that it works. So the main thing that I am going to cover here is How Continue Reading

A Beginners Approach To “KAFKA”

Heavy Data Load? Kafka Is Here For You. In this blog, I am going to get into the details like: What is Kafka? Getting familiar with Kafka. Learning some basics in Kafka. Creating a general Single Broker Cluster. So let’s get started. 1. What is Kafka? In simple terms, KAFKA is a messaging system that is designed to be fast, scalable, and durable. It is Continue Reading

Meetup: Stream processing using Kafka

Knoldus organized a Meetup on Friday, 7th April 2017 at 4:00 PM, which was presented by Himani Arora and me (Prabhat Kashyap). Topics covered in this meetup: what stream processing is; advantages of stream processing; types of stream processing; what KStreams are; use cases of KStreams; an overview of Kafka Connect.

Integrating Kafka With Spark Structure Streaming

Kafka is a message broker system that facilitates the passing of messages between producers and consumers, whereas Spark Structured Streaming consumes static and streaming data from various sources such as Kafka, Flume, Twitter, or any other socket. That data can be processed and analysed using high-level machine learning algorithms, and the result is finally pushed out to an external storage system. The main advantage of structured streaming Continue Reading

Streaming in Spark, Flink and Kafka

There is a lot of buzz about when to use Spark, when to use Flink, and when to use Kafka. Both Spark Streaming and Flink provide an exactly-once guarantee: every record will be processed exactly once, thereby eliminating any duplicates that might otherwise occur. Both provide very high throughput compared to other processing systems like Storm, and the overhead of Continue Reading

Introducing Kafka Streams: Processing made easy

If you are working with a huge amount of data, you might have heard about Kafka. At a very high level, Kafka is a fault-tolerant, distributed publish-subscribe messaging system that is designed for fast processing of data and the ability to handle hundreds of thousands of messages. What is Stream Processing? Stream processing is the real-time processing of data continuously, concurrently, and in a record-by-record Continue Reading

Message Broker in Lagom using Kafka

What is Lagom? Lagom framework helps in simplifying the development of microservices by providing an integrated development environment. This benefits one by allowing them to focus on solving business problems instead of wiring services together. Lagom exposes two APIs, Java and Scala, and provides a framework and development environment as a set of libraries and build tool plugins. The supported build tools with Lagom are Maven Continue Reading
