stateful streaming

Spark Structured Streaming (Part 4) – Handling Late Data

Reading Time: 3 minutes Welcome back folks to this blog series of Spark Structured Streaming. This blog is the continuation of the earlier blog “Understanding Stateful Streaming“. And this blog pertains to Handling Late Arriving Data in Spark Structured Streaming. So let’s get started. Handling Late Data With window aggregates (discussed in the previous blog) Spark automatically takes cares of late data. Every aggregate window is like a bucket Continue Reading

Spark Structured Streaming (Part 3) – Stateful Streaming

Reading Time: 4 minutes Welcome back folks to this blog series of Spark Structured Streaming. This blog is the continuation of the earlier blog “Internals of Structured Streaming“. And this blog pertains to Stateful Streaming in Spark Structured Streaming. So let’s get started. Let’s start from the very basic understanding of what is Stateful Stream Processing. But to understand that, let’s first understand what Stateless Stream Processing is. In Continue Reading

Stateful Streaming in Spark

Reading Time: 4 minutes Apache Spark is a fast and general-purpose cluster computing system. In Spark, we can do the batch processing and stream processing as well. It does near real-time processing. It means that it processes the data in micro-batches. I have discussed more Spark Streaming in my previous blog. Now in this blog, I’ll discuss Stateful Streaming in Spark. So let’s start !! What is Stateful Streaming? Continue Reading