Author Archives: Himanshu Gupta

Hello React: Creating a Single Page Application with React


A few days ago we were looking for a JavaScript library that is flexible and can be used in a variety of projects: basically, something with which we can create new apps, or which we can introduce into an existing project without rewriting it. However, we came …

Posted in JavaScript, Node.js, React, ReactJS

Partition-Aware Data Loading in Spark SQL


Data loading, in Spark SQL, means loading data into the memory/cache of Spark worker nodes. For this, we typically write the following code:

    import java.util.Properties

    val connectionProperties = new Properties()
    connectionProperties.put("user", "username")
    connectionProperties.put("password", "password")
    val jdbcDF = spark.read
      .jdbc("jdbc:postgresql:dbserver", "schema.table", connectionProperties)

Here we are …

Posted in Scala, Spark | 2 Comments

Finding the Impact of a Tweet using Spark GraphX


Social Network Analysis (SNA), the process of investigating social structures using networks and graphs, has become a very hot topic. Using it, we can answer many questions, such as: How many connections does an individual have? What is the ability …

Posted in apache spark, big data, graph, Scala, Spark | 3 Comments

KnolX: Introduction to Apache Spark 2.0


Knoldus organized a KnolX session on Friday, 23 September 2016. In that one-hour session we got an introduction to Apache Spark 2.0 and its API(s). Spark 2.0 is a major release of Apache Spark. This release has brought many …

Posted in Scala, Spark | 1 Comment

Spark Session: New Entry point in Spark 2.0


Finally, after a long wait, Apache Spark 2.0 was released on Tuesday, 26 July 2016. This release is built upon the feedback received from industry over the past two years regarding Spark and its APIs. This means it has everything that …

Posted in Scala, Spark | 3 Comments

Upgrade your Spark REST Server with Akka HTTP & Spark 2.0


About a year ago, in one of our blog posts, Spark with Spray Starter Kit, we explained how to create REST services with Spark and Spray. But over the past year there has not been much development on Spray, which tells us that …

Posted in akka-http, Scala, Spark

Deploy a Spark Application on Cluster


In one of our previous blog posts, Setup a Apache Spark Cluster in your Single Standalone Machine, we showed how to set up a standalone cluster for running Spark applications. But we never discussed how to deploy our Spark applications on …

Posted in Scala, Spark

KnolX: Unit Testing of Spark Applications


Knoldus organized a KnolX session on Wednesday, 13 April 2016. In this KnolX session, we explored the different methods of writing unit tests for Spark applications. The session also covered how unit testing of Spark applications is done, as well …

Posted in Scala, Spark

Boost Factorial Calculation with Spark


We all know that Apache Spark is a fast and general engine for large-scale data processing. It can process data up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. But is that the only task …

Posted in Scala, Spark | 5 Comments

Saving Spark DataFrames on Amazon S3 got Easier !!!


In our previous blog post, Congregating Spark Files on S3, we explained how we can upload files (saved in a Spark cluster) to Amazon S3. Well, I agree that the method explained in that post was a little complex and hard to apply. Also, …

Posted in Amazon, Scala, Spark | 2 Comments