Author Archives: Himanshu Gupta

Spark Structured Streaming: A Simple Definition


“Structured Streaming” is a term we hear quite a lot in the Apache Spark ecosystem nowadays, as it is being promoted as the next big thing in the scalable big data world. Although we all know that Structured Streaming means a stream having … Continue reading

Posted in Scala, Spark, Streaming | 1 Comment

Apache Spark: 3 Reasons Why You Should Not Use RDDs


Whenever we hear the two words “Apache Spark”, the first thing that comes to mind is RDDs, i.e., Resilient Distributed Datasets. It has now been more than 5 years since Apache Spark came into existence, and after its arrival a lot … Continue reading

Posted in apache spark, big data, Scala, Spark | 1 Comment

Hello React #2: Smallest Example in React


Hello folks! In our previous blog post, Hello React #1: Creating a Single Page Application with React, we saw how to create an SPA with React. But we didn’t go into its details. Here is the link for a … Continue reading

Posted in HTML, JavaScript, React, ReactJS | Leave a comment

Hello React #1: Creating a Single Page Application with React


A few days ago, we were looking for a JavaScript library that is flexible and can be used in a variety of projects. Basically, something with which we can create new apps, or which we can introduce into an existing project without rewriting it. However, we came … Continue reading

Posted in JavaScript, Node.js, React, ReactJS | 1 Comment

Partition-Aware Data Loading in Spark SQL


Data loading, in Spark SQL, means loading data into the memory/cache of Spark worker nodes. For this, we used to write the following code:

    val connectionProperties = new Properties()
    connectionProperties.put("user", "username")
    connectionProperties.put("password", "password")
    val jdbcDF = spark.read
      .jdbc("jdbc:postgresql:dbserver", "schema.table", connectionProperties)

In here we are … Continue reading

Posted in Scala, Spark | 7 Comments

Finding the Impact of a Tweet using Spark GraphX


Social Network Analysis (SNA), the process of investigating social structures using networks and graphs, has become a very hot topic nowadays. Using it, we can answer many questions, such as: How many connections does an individual have? What is the ability … Continue reading

Posted in apache spark, big data, graph, Scala, Spark | 3 Comments

KnolX: Introduction to Apache Spark 2.0


Knoldus organized a KnolX session on Friday, 23 September 2016. In that one-hour session, we got an introduction to Apache Spark 2.0 and its APIs. Spark 2.0 is a major release of Apache Spark. This release has brought many … Continue reading

Posted in Scala, Spark | 1 Comment

Spark Session: New Entry point in Spark 2.0


Finally, after a long wait, Apache Spark 2.0 was released on Tuesday, 26 July 2016. This release is built upon the feedback received from industry over the past two years regarding Spark and its APIs. This means it has all that … Continue reading

Posted in Scala, Spark | 3 Comments

Upgrade your Spark REST Server with Akka HTTP & Spark 2.0


About a year ago, in one of our blog posts, Spark with Spray Starter Kit, we explained how to create REST services with Spark and Spray. But over the past year there has not been much development on Spray, which tells us that … Continue reading

Posted in akka-http, Scala, Spark | Leave a comment

Deploy a Spark Application on Cluster


In one of our previous blog posts, Setup a Apache Spark Cluster in your Single Standalone Machine, we showed how to set up a standalone cluster for running Spark applications. But we never discussed how to deploy our Spark applications on … Continue reading

Posted in Scala, Spark | Leave a comment