Tag Archives: Spark Streaming

Basic Example for Spark Structured Streaming & Kafka Integration

The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 Direct Stream approach. It provides simple parallelism, 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration … Continue reading

Posted in Scala, Spark, Streaming | 3 Comments

Spark Streaming vs Kafka Stream

Demand for stream processing is growing rapidly. Processing large volumes of data is often not enough: data has to be processed fast, so that a firm can react to changing business conditions … Continue reading

Posted in Apache Kafka, apache spark, big data, Scala, Streaming | 2 Comments

Getting Started with Apache Spark

Introduction: Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. It was originally developed in 2009 in UC Berkeley’s AMPLab, open sourced in 2010, and later donated to the Apache Software Foundation. Spark … Continue reading

Posted in apache spark, Scala, Spark | 1 Comment

Streaming with Apache Spark Custom Receiver

Hello inquisitor. In the previous blog we looked at Spark's predefined stream receivers. In this blog we discuss Spark's custom receiver, which lets us source data from any source. So if … Continue reading

Posted in apache spark, big data, Scala | 1 Comment

Streaming with Apache Spark 2.0

Hello geeks, we discussed Apache Spark 2.0 with Hive in an earlier blog. Now I am going to describe how we can use Spark to stream data. First, we need to understand this new Spark Streaming architecture … Continue reading

Posted in apache spark, big data, Scala | 2 Comments

MeetUp on “An Overview of Spark DataFrames with Scala”

Knoldus is organizing a one-hour session on 18th Nov 2015 at 6:00 PM. The topic is “An Overview of Spark DataFrames with Scala”. All of you are invited to join this session. Address: 30/29, First Floor, Above UCO Bank, Near Rajendra … Continue reading

Posted in apache spark, Spark

Spark Streaming Gnip: An Apache Spark Utility to pull Tweets from Gnip in realtime

We are all familiar with Gnip, Inc., which provides data from dozens of social media websites via a single API. It is also known as the Grand Central Station of the social media web. One of its popular APIs is PowerTrack … Continue reading

Posted in Agile, Scala, Spark | 4 Comments

Stateful transformation on DStream in Apache Spark with example of wordcount

Sometimes we have a use case in which we need to maintain the state of a paired DStream so that it can be used in the next DStream. So we take an example of a stateful wordcount in socketTextStreaming. As in the wordcount example, if the word “xyz” … Continue reading

Posted in apache spark, Scala, Spark | 3 Comments

Meetup: Introduction to Spark with Scala

Knoldus organized a Meetup on Wednesday, 1 April 2015. In this Meetup, we gave a brief introduction to Spark with Scala. Apache Spark is a fast and general engine for large-scale data processing. A wide range of organizations are using it to process large datasets. Many … Continue reading

Posted in Agile, Scala, Spark | 5 Comments

Play with Spark: Building Apache Spark with Play Framework – (Part – 2)

Last week, we saw how to build a simple Spark application in Play using Scala. In this blog we will see how to add Spark’s Twitter Streaming feature to a Play Scala application. Spark Streaming is a powerful tool … Continue reading

Posted in Agile, Akka, Future, Non-Blocking, Play Framework, Reactive, Scala, Spark, Web | 1 Comment