Category Archives: Streaming

Integrating Kafka With Spark Structured Streaming


Kafka is a message broker that passes messages between producers and consumers, whereas Spark Structured Streaming consumes static and streaming data from sources such as Kafka, Flume, Twitter, or any other socket, which can be processed … Continue reading

Posted in Apache Kafka, apache spark, Scala, Streaming | 1 Comment

Spark Structured Streaming: A Simple Definition


“Structured Streaming” is a term we hear quite a lot in the Apache Spark ecosystem these days, as it is being promoted as the next big thing in scalable big data. Although we all know that Structured Streaming means a stream having … Continue reading

Posted in Scala, Spark, Streaming | 1 Comment

Exploring Spark Structured Streaming


Hello Spark enthusiasts, streaming apps are growing more complex, and they are getting difficult to build with current distributed streaming engines. Why is streaming hard? Streaming computations don’t run in isolation. Data arriving out of time order is a … Continue reading

Posted in apache spark, Scala, Streaming | Leave a comment

Spark Streaming vs Kafka Streams


The demand for stream processing is growing rapidly. The reason is that processing large volumes of data is often not enough. Data has to be processed fast, so that a firm can react to changing business conditions … Continue reading

Posted in Apache Kafka, apache spark, big data, Scala, Streaming | 1 Comment

Streaming in Spark, Flink and Kafka


There is a lot of debate about when to use Spark, when to use Flink, and when to use Kafka. Both Spark Streaming and Flink provide an exactly-once guarantee, meaning that every record will be processed exactly once … Continue reading

Posted in Apache Flink, Apache Kafka, apache spark, Streaming | Leave a comment

Introducing Kafka Streams: Processing made easy


If you work with huge amounts of data, you have probably heard of Kafka. At a very high level, Kafka is a fault-tolerant, distributed publish-subscribe messaging system that is designed for fast processing of data and the ability … Continue reading

Posted in Java, big data, Streaming | 1 Comment

Introduction to Structured Streaming


Hello! Knoldus organized a half-hour session on Structured Streaming, covering the API changes, how it differs from the earlier stream computation paradigm (DStreams), and an example API demonstration. We hope you enjoy it. Below are the slides and video … Continue reading

Posted in apache spark, Scala, Spark, Streaming | 1 Comment

Twitter’s tweets analysis using Lambda Architecture


Hello folks, in this blog I will explain the analysis of Twitter’s tweets with Lambda architecture. First we need to understand what Lambda architecture is, along with its components and usage. According to Wikipedia, Lambda architecture is a data processing architecture designed to handle … Continue reading

Posted in Akka, akka-http, Apache Kafka, apache spark, Architecture, Batch, big data, Cassandra, Scala, Spark, Streaming | 6 Comments

Lambda Architecture with Spark


Hello folks, Knoldus organized a knolx session on the topic: Lambda Architecture with Spark. The presentation covers Lambda architecture and its implementation with Spark. In the presentation we will discuss the components of Lambda architecture: the batch layer, the speed layer, and the serving layer. We will … Continue reading

Posted in Akka, akka-http, Cassandra, Scala, Spark, Streaming | 2 Comments

Meetup: Stream Processing Using Spark & Kafka


Knoldus organized a meetup on Friday, 9 September 2016. The topics covered in this meetup were: an overview of Spark Streaming; fault-tolerance semantics and performance tuning; and Spark Streaming integration with Kafka. The meetup code sample is available here. Real time stream processing … Continue reading

Posted in Apache Kafka, apache spark, Best Practices, big data, Elasticsearch, Scala, Spark, Streaming | 1 Comment