Category Archives: apache spark

Reading data from different sources using Spark 2.1


Hi all, in this blog we’ll discuss fetching data from different sources such as CSV, JSON, text and Parquet files. First, let’s discuss what’s new in Spark 2.1. In previous versions of Spark, you had to create … Continue reading

Posted in apache spark, sbt, Scala, Spark

Spark Cassandra Connector On Spark-Shell


Hi all, in this blog we will see how to execute our Spark code on the Spark shell using Cassandra. This is very convenient when testing or learning, where we have … Continue reading

Posted in apache spark, big data, Cassandra, Scala, Spark | 2 Comments

Introduction to Structured Streaming


Hello! Knoldus organized a half-hour session on Structured Streaming, covering the API changes, how it differs from the earlier stream computation paradigm (DStreams), and an example API demonstration. We hope you enjoy it. Below are the slides and video … Continue reading

Posted in apache spark, Scala, Spark, Streaming | 1 Comment

Application compatibility for different Spark versions


Spark 2.1 was released recently, and it differs significantly from Spark 1.6. Spark 1.6 is built around DataFrame and SparkContext, while 2.1 centers on Dataset and SparkSession. Now the question arises of how to write code so that both the versions of … Continue reading

Posted in apache spark, Java, Scala, Spark | 3 Comments

Twitter’s tweets analysis using Lambda Architecture


Hello folks, in this blog I will explain the analysis of Twitter’s tweets with the Lambda architecture. First we need to understand what the Lambda architecture is, along with its components and usage. According to Wikipedia, Lambda architecture is a data processing architecture designed to handle … Continue reading

Posted in Akka, akka-http, Apache Kafka, apache spark, Architecture, Batch, big data, Cassandra, Scala, Spark, Streaming | 5 Comments

Short Interview With SMACK Tech Stack !!!


Hello guys, today we conduct a short interview with SMACK about its architecture and its uses. Let’s start with some introductions. Interviewer: How would you describe yourself? SMACK: I am SMACK (Spark, Mesos, Akka, Cassandra and Kafka) and … Continue reading

Posted in Akka, Apache Kafka, apache spark, big data, Cassandra, Scala, Spark

Tableau: Getting into Tableau Public


Big Data visualization and Business Intelligence have become much easier with Tableau. Millions or even billions of records can be analyzed in one go, whether your data comes from Excel, CSV, text or a database. So … Continue reading

Posted in apache spark, big data, Scala, Spark, Tableau

Finding the Impact of a Tweet using Spark GraphX


Social Network Analysis (SNA), the process of investigating social structures using networks and graphs, has become a very hot topic. Using it, we can answer many questions, such as: How many connections does an individual have? What is the ability … Continue reading

Posted in apache spark, big data, graph, Scala, Spark | 3 Comments

Cassandra with Spark 2.0 : Building Rest API !


In this tutorial, we will demonstrate how to build a REST service in Spark using Akka-http as a sidekick 😉 and Cassandra as the data store. We have seen the power of Spark earlier and when it is … Continue reading

Posted in Akka, akka-http, apache spark, Cassandra, Scala, scalatest, Spark | 2 Comments

Spark – LDA : A Complete example of clustering algorithm for topic discovery.


In this blog we will demonstrate applying the full ML pipeline to a set of documents, in this case 10 books from the internet. So let’s start with first things first. What … Continue reading

Posted in apache spark, Scala, Spark | 5 Comments