Play with Spark: Building Spark SQL in a Play Spark Application

In the last post of our Play with Spark! series, we saw how to integrate Spark Streaming into a Play Scala application. In this post we will see how to add Spark SQL to a Play Scala application.

Spark SQL is a powerful component of Apache Spark. It allows relational queries, expressed in SQL, HiveQL, or Scala, to be executed using Spark. To support queries expressed in SQL, Apache Spark provides a new type of RDD, the SchemaRDD. A SchemaRDD is similar to a table in a traditional relational database.

To add Spark SQL to a Play Scala application, follow these steps:

1). Add the following dependencies to the build.sbt file

The dependency "org.apache.spark" %% "spark-sql" % "1.0.0" is specific to Spark SQL.
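The original snippet is not shown here; a minimal build.sbt sketch might look like the following (the spark-core line and Spark version are assumptions based on the post):

```scala
// build.sbt – library dependencies for a Play Scala application using Spark 1.0.0
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.0.0",  // Spark core (assumed to be present already)
  "org.apache.spark" %% "spark-sql"  % "1.0.0"   // Spark SQL – the dependency this post adds
)
```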

2). Create a file app/utils/SparkSQL.scala and add the following code to it

Like any other Spark component, Spark SQL runs on its own context. Here it is the SQLContext, which runs on top of a SparkContext. So, we first build a sqlContext so that we can use Spark SQL.
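The code from the original post is not reproduced here; a minimal sketch of building the SQLContext (the application name and master URL are assumptions) could look like this:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// The SparkContext is assumed to be the one already built for the
// application (e.g. for the Spark Streaming part of this series).
val sc = new SparkContext("local[2]", "play-spark-sql-demo")

// SQLContext wraps the SparkContext and is the entry point for Spark SQL.
val sqlContext = new SQLContext(sc)
```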

3). In the code above, you may notice that we have built a case class WordCount.

This case class defines the schema of the table in which we are going to store our data in SQL format.
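Since the original case class is not shown, here is a sketch of what it plausibly looks like (the field names are assumptions consistent with the word-count example):

```scala
// Schema of the table: each row holds a word and how often it occurs.
// Spark SQL derives the column names and types from the case class fields.
case class WordCount(word: String, frequency: Int)
```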

4). Next, observe that we have mapped the variable wordCount to the case class WordCount.

Here we convert wordCount from an RDD to a SchemaRDD. Then we register it as a table so that we can run SQL queries against it.
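The conversion step can be sketched as follows, assuming `lines` is an RDD[String] holding the text being analysed and using the Spark 1.0 API (registerAsTable was later renamed registerTempTable):

```scala
// createSchemaRDD implicitly converts an RDD of case-class objects
// into a SchemaRDD.
import sqlContext.createSchemaRDD

// Build the word counts, then wrap each (word, count) pair in the
// WordCount case class so Spark SQL can infer the schema.
val wordCount = lines.flatMap(_.split(" "))
                     .map(word => (word, 1))
                     .reduceByKey(_ + _)
                     .map { case (word, freq) => WordCount(word, freq) }

// Register the SchemaRDD as a table so SQL queries can be run against it.
wordCount.registerAsTable("wordCount")
```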

5). Finally, notice that we have constructed a SQL query in Scala

Here we fetch the words that occur more than 10 times in our text file. We have used the Language-Integrated Relational Queries of Spark SQL, which are available only in Scala. To learn about the other types of SQL queries supported by Spark SQL, click here.
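The query described above can be sketched in two equivalent ways, assuming the wordCount SchemaRDD and the field name frequency from the earlier steps (the language-integrated form uses Spark 1.0's Symbol-based DSL):

```scala
// Language-integrated relational query: words occurring more than 10 times.
val frequentWords = wordCount.where('frequency > 10).select('word)

// Equivalent SQL string query against the registered "wordCount" table:
val frequentWordsSql = sqlContext.sql(
  "SELECT word FROM wordCount WHERE frequency > 10")
```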

To download a Demo Application click here.

Written by 

Himanshu Gupta is a lead consultant with more than 4 years of experience. He is always keen to learn new technologies, and he likes not only programming languages but data analytics too. He has sound knowledge of Machine Learning and Pattern Recognition. He believes that the best results come when everyone works as a team. In his free time he likes coding, listening to music, watching movies, and reading science-fiction books.

