Flink

Reading Avro files using Apache Flink

Reading Time: 2 minutes In this blog, we will see how to read Avro files using Flink. Before reading the files, let’s get an overview of Flink. There are two types of processing: batch and real-time. Batch processing works on data collected over time. Real-time processing works on data as it arrives to produce an immediate result. Real-time processing is in demand and Apache Flink is the Continue Reading

Comparison between different streaming engines

Reading Time: 5 minutes Distributed stream processing engines have been on the rise in the last few years. First Hadoop became popular as a batch processing engine, then the focus shifted towards stream processing engines. Stream processing engines make it easier than ever to process data that arrives as a stream, and by using clustering they can process larger data sets in a timely manner. Continue Reading

Flinkathon: What makes Flink better than Kafka Streams?

Reading Time: 2 minutes Before comparing the frameworks, I would like you all to focus on a few questions: 1. Is there any comparison or similarity between Flink and Kafka? 2. What could be better in Flink than in Kafka? 3. Is it the problem or the system requirements that lead us to use one over the other? Before talking about Flink's advantages and use cases over Kafka, let’s first understand their Continue Reading

Streaming in Spark, Flink and Kafka

Reading Time: 7 minutes There is a lot of buzz about when to use Spark, when to use Flink, and when to use Kafka. Both Spark Streaming and Flink provide an exactly-once guarantee: every record is processed exactly once, which eliminates any duplicates. Both provide very high throughput compared to other processing systems such as Storm, and the overhead of Continue Reading

Is Flink the shiny(err..) toy on the block?

Reading Time: 3 minutes If you are following the Big Data space, especially from a Scala perspective, then you will have noticed a slew of blogs, tweets and more blogs comparing the two, the two being Spark and Flink. That said, you will also find posts comparing these two with Samza and Storm. Incidentally, all of them are top-level Apache projects. For the purpose of this blog, let us stick Continue Reading