Reading Time: < 1 minute Hello folks, Knoldus organized a Knolx session on the topic: Lambda Architecture with Spark. The presentation covers the Lambda Architecture and its implementation with Spark. In the presentation we discuss the components of the Lambda Architecture, namely the batch layer, the speed layer and the serving layer. We also discuss its advantages and the benefits of implementing it with Spark. You can watch the video of the presentation: Here you can check the slides: Thanks !!
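To make the three layers mentioned above concrete, here is a minimal, plain-Scala sketch of the Lambda Architecture's shape. It is illustrative only, not a Spark implementation: the object and method names (`LambdaSketch`, `batchView`, `speedView`, `serve`) are hypothetical, and the views are simple word counts.

```scala
// Illustrative sketch of the Lambda Architecture (not Spark code):
// - batch layer: precomputes a view over all historical events
// - speed layer: maintains an incremental view over recent events
// - serving layer: merges both views to answer a query
object LambdaSketch {
  type View = Map[String, Long] // e.g. event counts per key

  // Batch layer: recomputed periodically over the full history
  def batchView(history: Seq[String]): View =
    history.groupBy(identity).map { case (k, vs) => (k, vs.size.toLong) }

  // Speed layer: covers only events arrived since the last batch run
  def speedView(recent: Seq[String]): View =
    recent.groupBy(identity).map { case (k, vs) => (k, vs.size.toLong) }

  // Serving layer: merge the batch view with the realtime view
  def serve(batch: View, speed: View): View =
    (batch.keySet ++ speed.keySet).map { k =>
      k -> (batch.getOrElse(k, 0L) + speed.getOrElse(k, 0L))
    }.toMap
}
```

In Spark terms, the batch view would typically be a periodic Spark batch job and the speed view a Spark Streaming job, with the serving layer merging their outputs at query time.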
Reading Time: < 1 minute Knoldus organized a Meetup on Friday, 9 September 2016. The topics covered in this meetup were: an overview of Spark Streaming; fault-tolerance semantics and performance tuning; and Spark Streaming integration with Kafka. The meetup code sample is available here, and the real-time stream processing engine application code is available here
Reading Time: 5 minutes In this blog, I will share my experience of building a scalable, distributed and fault-tolerant analytics engine using Scala, Akka, Play, Kafka and ElasticSearch. I would like to take you through the journey of building an analytics engine that was primarily used for text analysis. The inputs were structured, unstructured and semi-structured data, and we were doing a lot of data crunching with it. The Analytics Continue Reading
Reading Time: 6 minutes In the last two blogs on Flink, I hope to have underlined the central role that windows play in Apache Flink's streaming model. I have shared my understanding of two types of windows that can be attached to a stream of events, namely (a) CountWindow and (b) TimeWindow. Variations of these types are offered too; for example, one can put Continue Reading
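The TimeWindow idea mentioned above can be sketched in plain Scala. This is a minimal illustration of tumbling-time-window semantics under the assumption of fixed, non-overlapping windows; it is not the Flink API, and the names (`TimeWindowSketch`, `tumblingWindow`) are hypothetical.

```scala
// Illustrative sketch of tumbling TimeWindow semantics (not Flink's API):
// each (timestampMillis, value) event is assigned to a fixed-length time
// bucket, and an aggregate is computed per bucket.
object TimeWindowSketch {
  def tumblingWindow[B](events: Seq[(Long, Int)], windowMillis: Long)(
      aggregate: Seq[Int] => B): Map[Long, B] =
    events
      .groupBy { case (ts, _) => ts / windowMillis } // bucket index per event
      .map { case (bucket, evs) =>
        // key each result by the window's start time
        (bucket * windowMillis, aggregate(evs.map(_._2)))
      }
}
```

With a one-second window, events at 0 ms and 500 ms land in the window starting at 0, while events at 1000 ms and 1500 ms land in the window starting at 1000.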
Reading Time: 6 minutes In the last blog in this series, we took a look at Flink's CountWindow feature. Here's a quick recap: as a stream of events enters a Flink-based application, we can apply a CountWindow transformation to it (Flink offers many such transformations; we will meet them as we go). CountWindow allows us to create a Continue Reading
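The CountWindow semantics recapped above can be illustrated with a few lines of plain Scala. This is a sketch of the idea only, not Flink code: the names (`CountWindowSketch`, `countWindow`) are hypothetical, and it assumes a CountWindow fires only once it has collected exactly `size` events.

```scala
// Illustrative sketch of CountWindow semantics (not the Flink API):
// split an incoming stream of events into consecutive windows of `size`
// events each, and emit one aggregate per full window.
object CountWindowSketch {
  def countWindow[A, B](events: Seq[A], size: Int)(aggregate: Seq[A] => B): List[B] =
    events
      .grouped(size)              // consecutive windows of `size` events
      .filter(_.size == size)     // a count window fires only when it is full
      .map(aggregate)
      .toList
}
```

For example, summing a stream of 1 to 10 with windows of 3 events emits sums for (1,2,3), (4,5,6) and (7,8,9); the trailing event 10 stays in an unfilled window and produces no output.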
Reading Time: 7 minutes Of late, I have begun to read about Apache Flink. Apache Flink (just Flink hereafter) is an ‘open source platform for distributed stream and batch data processing’, to quote from the homepage. What has caught my interest is Flink’s idea that the ability to operate on individual units of data as they stream in gives one the flexibility to decide what constitutes a batch: a count of events, or events Continue Reading