Play-Spark2: A Simple Application


In this blog, we will create a very simple application with Play Framework and Spark.

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Scala, Java, Python, and R that make parallel jobs easy to write, and an optimized engine that supports general computation graphs. It also supports a rich set of higher-level tools, including Spark SQL and Spark Streaming.

Play, on the other hand, is a web framework with a lightweight, stateless, web-friendly architecture. It is built on Akka and provides predictable and minimal resource consumption (CPU, memory, threads) for highly scalable applications.

Now we will start building an application using the above technologies.

We will create a Spark application built on Play 2.5. The same approach can be adapted to other Play versions.

First, download the latest Activator release, then create a new play-scala project as follows:

knoldus@knoldus:~/Desktop/activator/activator-1.3.12-minimal$ activator new
Fetching the latest list of templates...

Browse the list of templates: http://lightbend.com/activator/templates
Choose from these featured templates or enter a template name:
1) minimal-akka-java-seed
2) minimal-akka-scala-seed
3) minimal-java
4) minimal-scala
5) play-java
6) play-scala
(hit tab to see a list of all templates)
> 6
Enter a name for your application (just press enter for 'play-scala')
> play-spark
OK, application "play-spark" is being created using the "play-scala" template

Import the project into IntelliJ, then add the following dependencies to your build.sbt file:

libraryDependencies ++= Seq(
  jdbc,
  cache,
  ws,
  "org.scalatestplus.play" %% "scalatestplus-play" % "1.5.1" % Test,
  "com.fasterxml.jackson.core" % "jackson-core" % "2.8.7",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.8.7",
  "com.fasterxml.jackson.core" % "jackson-annotations" % "2.8.7",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.8.7",
  "org.apache.spark" % "spark-core_2.11" % "2.1.1",
  "org.webjars" %% "webjars-play" % "2.5.0-1",
  "org.webjars" % "bootstrap" % "3.3.6",
  "org.apache.spark" % "spark-sql_2.11" % "2.1.1",
  "com.adrianhurt" %% "play-bootstrap" % "1.0-P25-B3"
)
libraryDependencies ~= { _.map(_.exclude("org.slf4j", "slf4j-log4j12")) }

Add these routes to the routes file in the conf directory.

# Routes
# This file defines all application routes (Higher priority routes first)
# ~~~~

# Home page

# Map static resources from the /public folder to the /assets URL path
GET     /assets/*file               controllers.Assets.versioned(path="/public", file: Asset)

GET     /webjars/*file                    controllers.WebJarAssets.at(file)

GET     /locations/json       controllers.ApplicationController.index

Next, create a bootstrap package inside your app folder and add an object to it. This object is loaded when the application starts, and it initializes the SparkSession.

package bootstrap

import org.apache.spark.sql.SparkSession
import play.api._

object Init extends GlobalSettings {

  var sparkSession: SparkSession = _

  /**
   * On start load the json data from conf/data.json into in-memory Spark
   */
  override def onStart(app: Application): Unit = {
    sparkSession = SparkSession.builder
      .master("local")
      .appName("ApplicationController")
      .getOrCreate()

    val dataFrame = sparkSession.read.json("conf/data.json")
    dataFrame.createOrReplaceTempView("godzilla")
  }

  /**
   * On stop clear the sparksession
   */
  override def onStop(app: Application): Unit = {
    sparkSession.stop()
  }

  // Accessor used by controllers to reach the shared SparkSession
  def getSparkSessionInstance: SparkSession = sparkSession
}
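In Spark 2.1, sparkSession.read.json expects one self-contained JSON object per line rather than a single pretty-printed document. This post does not show conf/data.json, so the sketch below writes a made-up file in that layout. Only the date field is implied by the query used in the controller; the location field, all values, and the SampleData name are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object SampleData {
  // Hypothetical records: one JSON object per line, as Spark's JSON reader expects.
  // Only the "date" field is implied by the post; everything else is made up.
  val records: Seq[String] = Seq(
    """{"date":"2000-02-05","location":"Tokyo"}""",
    """{"date":"2000-02-06","location":"Osaka"}"""
  )

  // Writes the records to the given path, one object per line.
  def write(path: String): Unit = {
    val target = Paths.get(path)
    // Create the parent directory if the path has one
    Option(target.getParent).foreach(dir => Files.createDirectories(dir))
    Files.write(target, records.mkString("\n").getBytes(StandardCharsets.UTF_8))
  }

  def main(args: Array[String]): Unit =
    write(args.headOption.getOrElse("sample-data.json"))
}
```

A file written this way could be copied to conf/data.json to try the application without the original dataset.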

To register this object as the application's Global object, add these lines to your application.conf:

# This is the main configuration file for the application.
# ~~~~~

# Secret key
# ~~~~~
# The secret key is used to secure cryptographic functions.
# If you deploy your application to several instances be sure to use the same key!
application.secret="AuAS5FW52/uX9Uy]TDBWwG6e@A=X2bFtv2q_I>6<t@Y[VtJtTGXQEXoU5BouE]rk"

# The application languages
# ~~~~~
application.langs="en"

# Global object class
# ~~~~~
# Define the Global object class for this application.
# Default to Global in the root package.
application.global= bootstrap.Init
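GlobalSettings still works in Play 2.5 but is deprecated there in favor of dependency injection. As a rough alternative, the same startup and shutdown logic could live in a class bound as an eager singleton. This is only a sketch under that assumption, not the approach used in this post; the names SparkSessionProvider and SparkModule are hypothetical.

```scala
package bootstrap

import javax.inject.Inject

import scala.concurrent.Future

import com.google.inject.AbstractModule
import org.apache.spark.sql.SparkSession
import play.api.inject.ApplicationLifecycle

// Hypothetical replacement for the Init object: builds the session when constructed
class SparkSessionProvider @Inject()(lifecycle: ApplicationLifecycle) {

  val sparkSession: SparkSession = SparkSession.builder
    .master("local")
    .appName("ApplicationController")
    .getOrCreate()

  // Load the JSON data once and expose it to SQL queries as the "godzilla" view
  sparkSession.read.json("conf/data.json").createOrReplaceTempView("godzilla")

  // Stop the session when the Play application shuts down
  lifecycle.addStopHook(() => Future.successful(sparkSession.stop()))
}

// Hypothetical Guice module: the eager binding creates the provider at startup
class SparkModule extends AbstractModule {
  override def configure(): Unit =
    bind(classOf[SparkSessionProvider]).asEagerSingleton()
}
```

With this variant, application.conf would contain play.modules.enabled += "bootstrap.SparkModule" instead of the application.global line, and a controller would inject SparkSessionProvider rather than call Init.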

Inside your controllers package, add a new controller named ApplicationController. This controller uses the SparkSession to query the data loaded from the JSON file, converts the result to JSON, and returns it in the response.

package controllers

import javax.inject.Inject

import scala.concurrent.Future

import org.apache.spark.sql.DataFrame
import play.api.i18n.MessagesApi
import play.api.libs.json.Json
import play.api.mvc.{Action, AnyContent, Controller}
import bootstrap.Init

class ApplicationController @Inject()(implicit webJarAssets: WebJarAssets,
    val messagesApi: MessagesApi) extends Controller {

  def index: Action[AnyContent] = Action.async {
    val query1 = "SELECT * FROM godzilla WHERE date='2000-02-05' LIMIT 8"

    // Reuse the SparkSession created at startup; the "godzilla" view is registered there
    val sparkSession = Init.getSparkSessionInstance
    val result: DataFrame = sparkSession.sql(query1)

    // toJSON yields one JSON object string per row, so join the rows into a JSON array
    val rawJson = result.toJSON.collect().mkString("[", ",", "]")
    Future.successful(Ok(Json.parse(rawJson)))
  }
}
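The call result.toJSON.collect() returns one JSON string per row, so the way those strings are joined decides whether the response body is valid JSON. The small sketch below shows the difference using two made-up row strings in place of real query results; the JoinRows name and the row contents are hypothetical.

```scala
object JoinRows {
  // Hypothetical row strings standing in for result.toJSON.collect()
  val rows: Array[String] = Array(
    """{"date":"2000-02-05","location":"Tokyo"}""",
    """{"date":"2000-02-05","location":"Osaka"}"""
  )

  // A bare mkString concatenates the objects with no separator, which is not valid JSON
  val concatenated: String = rows.mkString

  // Adding brackets and comma separators produces a JSON array of the row objects
  val asArray: String = rows.mkString("[", ",", "]")

  def main(args: Array[String]): Unit = {
    println(concatenated)
    println(asArray)
  }
}
```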

Now we are all set to start our Play application.

Run the application with activator run. To query the data, send a curl request to the /locations/json route, for example curl http://localhost:9000/locations/json if the server is listening on Play's default port 9000.

You can find the working demo available here.


Happy coding! I hope this blog helps.
