kafka

Alpakka – Connecting Kafka and ElasticSearch to Akka streams

In our previous blog, we had a look at what Akka streams are and how they are different from the other streaming mechanisms we have. In this blog, we will be taking a little step forward into the world of Akka Streams. In order to work with Akka streams, we need a mechanism to connect Akka Streams to the existing system components. That is where Alpakka Continue Reading

Exactly-Once Semantics with Apache Kafka

Kafka’s exactly once semantics was recently introduced with the version 0.11 which enabled the message being delivered exactly once to the end consumer even if the producer retries to send the messages. This major release raised many eyebrows in the community as people believed that this is not mathematically possible in distributed systems. Jay Kreps, Co-founder on Confluent, and Co-creator of Apache Kafka explained its Continue Reading

Kafka Streams

Interactive Queries in Apache Kafka

Apache Kafka v0.10 introduced a new feature Kafka Streams API – a client library which can be used for building applications and microservices, where the input and output data can be stored in Kafka clusters. Kafka Streams provides state stores, which can be used by stream processing applications to store and query data.  Every task in Kafka Streams uses one or more state stores which Continue Reading

Kafka And Spark Streams: The happily ever after !!

Hi everyone, Today we are going to understand a bit about using the spark streaming to transform and transport data between Kafka topics. The demand for stream processing is increasing every day. The reason is that often, processing big volumes of data is not enough. We need real-time processing of data especially when we need to handle continuously increasing volumes of data and also need Continue Reading

Joins in Kafka

Join Semantics in Kafka Streams

Introduction to core concepts:   Apache Kafka is a distributed streaming platform which enables you to publish and subscribe to a stream of records also letting you process this stream of records as it occurs. Kafka Streams is a client library used for building applications and microservices, where the input and output data are stored in Kafka clusters. Interface KStream<K, V> is an abstraction of Continue Reading

Basic Example for Spark Structured Streaming & Kafka Integration

The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 Direct Stream approach. It provides simple parallelism, 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the new Kafka consumer API instead of the simple API, there are notable differences in usage. This version of the integration is marked as Continue Reading

RealTimeProcessing of Data using kafka and Spark

Before Starting it you should know about kafka, spark and what is Real time processing of Data.so let’s do some brief introduction about it. Real Time Processing – Processing the Data that appears to take place instead of storing the data and then processing it or processing the data that stored somewhere else. Kafka – Kafka is the maximum throughput of data from one end to another . Continue Reading

Unit Testing Of Kafka

Apache Kafka is a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and enables you to pass messages from one end-point to another. Generally, data is published to topic via Producer API and  Consumers API consume data from subscribed topics. In this blog, we will see how to do unit testing of kafka. Unit testing your Kafka Continue Reading

Setting It Up: KAFKA Multi-Broker System

In this blog, I am going to cover up the leftovers of my last blog: “A Beginners Approach To KAFKA” in which I tried to explain the details of Kafka, like its terminologies, advantages and demonstrated like how to set up the Kafka environment and get our Single Broker Cluster up and then test it’s working. So the main thing that I am going to cover up here is How Continue Reading

A Beginners Approach To “KAFKA”

Heavy Data Load? Kafka Is Here For You. In this blog, I am going to get into the details like: What is Kafka? Getting familiar with Kafka. Learning some basics in Kafka. Creating a general Single Broker Cluster. So let’s get started. 1. What is Kafka? In simple terms, KAFKA is a messaging system that is designed to be fast, scalable, and durable. It is Continue Reading

Meetup: Stream processing using Kafka

Knoldus organized a Meetup on Friday, 7th April 2017 at 4:00 PM which was presented by Himani Arora and me(Prabhat Kashyap). Topics which were covered in this meetup: What is Stream processing Advantages of stream processing Type of stream processing What are KStreams Use cases of KStreams Overview of Kafka Connect Slides: Video Recording:

kafka with spark

Integrating Kafka With Spark Structure Streaming

Kafka is a messaging broker system which facilitates the passing of messages between producer and consumer whereas Spark Structure streaming consumes static and streaming data from various sources like kafka, flume, twitter or any other socket which can be processed and analysed using high level algorithm for machine learning and finally pushed the result out to external storage system. The main advantage of structured streaming Continue Reading

Streaming in Spark, Flink and Kafka

There is a lot of buzz going on between when to use use spark, when to use flink, and when to use Kafka. Both spark streaming and flink provides exactly once guarantee that every record will be processed exactly once thereby eliminating any duplicates that might be available. Both provide very high throughput compared to any other processing system like storm, and the overhead of Continue Reading

%d bloggers like this: