Meetup: Reactive Programming using Scala and Akka


In this meetup, which was part of our ongoing Knolx sessions, I talked about reactive programming using Scala and Akka.

Reactive programming is all about developing responsive applications built on top of an event-driven, resilient and scalable architecture.

Below are the Knolx slides.

I have also shown some examples giving a brief introduction to Scala and Akka. Please find the GitHub repository here.


SBT console to debug application


In this blog post, we will see how to debug an application via the sbt console. Suppose we want to run some initialization before debugging the application, for example setting up a database connection, importing packages and so on. The sbt configuration provides a nice way to make the debugging process easier.

There are a few steps to debug a Liftweb application via the sbt console. First we have to initialize the database connection, or otherwise initialize the application, before debugging. For that we have to run the boot function of the Boot class.

Welcome to Scala version 2.10.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import bootstrap.liftweb.Boot
import bootstrap.liftweb.Boot

scala> new Boot().boot
INFO - MongoDB inited: localhost/127.0.0.1:27017/typesafe
scala> import code.model._
import code.model._

scala> import com.foursquare.rogue.LiftRogue._
import com.foursquare.rogue.LiftRogue._

scala> val score = Score.createRecord.examtype("homework").score(83.5)
score: code.model.Score = class code.model.Score={examtype=homework, score=83.5}

scala> val student = StudentInfo("Devid",List(score))
student: code.model.StudentInfo = StudentInfo(Devid,List(class code.model.Score={examtype=homework, score=83.5}))

scala> Student.createBy(student)
res3: code.model.Student = class code.model.Student={name=Devid, age=0, _id=53f04f2688e0d70d73c0fb50, scores=List(class code.model.Score={examtype=homework, score=83.5}), address=}

scala> Student.where(_.name eqs "Devid").fetch
res4: List[code.model.Student] = List(class code.model.Student={name=Devid, age=0, _id=53f04f2688e0d70d73c0fb50, scores=List(class code.model.Score={examtype=homework, score=83.5}), address=})
scala> 

Now we don’t want to import the packages or initialize the database connection manually every time, so we use the sbt setting that defines the initial commands evaluated when entering the Scala REPL. Just define initialCommands in build.sbt:

initialCommands in console := """
    import bootstrap.liftweb._
    import code.model._
    import org.bson.types.ObjectId
    import net.liftweb.common._
    import com.foursquare.rogue.LiftRogue._
    new Boot().boot
     """

Now run the REPL again:

abdhesh@abdhesh-Vostro-3560:~/Documents/projects/knoldus/Rogue_Query$ sbt console
[info] Loading project definition from /home/abdhesh/Documents/projects/knoldus/Rogue_Query/project
[info] Set current project to Rogue_Query (in build file:/home/abdhesh/Documents/projects/knoldus/Rogue_Query/)
[info] Starting scala interpreter...
[info] 
INFO - MongoDB inited: localhost/127.0.0.1:27017/typesafe
import bootstrap.liftweb._
import code.model._
import org.bson.types.ObjectId
import net.liftweb.common._
import com.foursquare.rogue.LiftRogue._
Welcome to Scala version 2.10.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val score = Score.createRecord.examtype("homework").score(83.5)
score: code.model.Score = class code.model.Score={examtype=homework, score=83.5}

scala> val student = StudentInfo("Devid",List(score))
student: code.model.StudentInfo = StudentInfo(Devid,List(class code.model.Score={examtype=homework, score=83.5}))

scala> Student.createBy(student)
res3: code.model.Student = class code.model.Student={name=Devid, age=0, _id=53f04f2688e0d70d73c0fb50, scores=List(class code.model.Score={examtype=homework, score=83.5}), address=}

scala> Student.where(_.name eqs "Devid").fetch
res4: List[code.model.Student] = List(class code.model.Student={name=Devid, age=0, _id=53f04f2688e0d70d73c0fb50, scores=List(class code.model.Score={examtype=homework, score=83.5}), address=})

Now there is no need to run the initialization process manually once you have defined initialCommands in build.sbt.
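In the same way, sbt's cleanupCommands setting is evaluated when you leave the REPL, which is a convenient place for teardown code. A minimal sketch for build.sbt (the println is just a placeholder for your own shutdown logic):

cleanupCommands in console := """
    println("Shutting down...")
    """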


How to flatten nested tuples in Scala


In a project that I’ve been working on, I ran into a situation where I needed to flatten a nested tuple but couldn’t come up with a way to do so. Out of curiosity I started googling it and came to the following conclusion.

As an example, I had a structure similar to the one mentioned below, though not identical:


val structureToOperateOn = List(List("a1","a2","a3"), List("b1","b2","b3") , List("c1","c2","c3"), List(10,1,11))

and suppose I wanted to turn structureToOperateOn into something like this:


"a1", "b1", "c1", 10
"a2", "b2", "c2", 1
"a3", "b3", "c3", 11

So the first thing that came to my mind was to use foldLeft:


val operatedStructure = (structureToOperateOn.tail.foldLeft(structureToOperateOn.head)((a,b) => a zip b)).asInstanceOf[List[(((String,String),String),Int)]]

which resulted in something like this:


List(((("a1","b1"),"c1"),10), ((("a2","b2"),"c2"),1), ((("a3","b3"),"c3"),11))

Next, I thought of flattening the tuples and came across Shapeless. Although I think Scala should have something built in to flatten tuples, the best way to do it as of now is to use the Shapeless library. Anyway, this is how flattening tuples using Shapeless works:


import shapeless._
import shapeless.ops.tuple.FlatMapper
import syntax.std.tuple._

object NestedTuple {

  // Fallback case: wrap any non-tuple value in a Tuple1 so it can be concatenated
  trait LowPriorityFlatten extends Poly1 {
    implicit def default[T] = at[T](Tuple1(_))
  }

  // Tuple case: recursively flatMap over the elements of the tuple
  object flatten extends LowPriorityFlatten {
    implicit def caseTuple[P <: Product](implicit fm: FlatMapper[P, flatten.type]) =
      at[P](_.flatMap(flatten))
  }

  val structureToOperateOn = List(List("a1","a2","a3"), List("b1","b2","b3"), List("c1","c2","c3"), List(10,1,11))
  val operatedStructure = (structureToOperateOn.tail.foldLeft(structureToOperateOn.head)((a,b) => a zip b)).asInstanceOf[List[(((String,String),String),Int)]]

  val flattenedTuples = operatedStructure map (tuple => flatten(tuple))   // This should be List((a1,b1,c1,10), (a2,b2,c2,1), (a3,b3,c3,11))
}
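For instance, with the flatten object above in scope, a nested tuple can be flattened directly. A quick sketch, assuming the imports and the NestedTuple definitions shown above:

import NestedTuple.flatten

val flat = flatten((1, ((2, 3), 4)))   // yields (1, 2, 3, 4)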

After messing around with nested tuples, I finally decided it would be better to find an alternative way to get the required result instead of adding a new library to the project. Regardless, Shapeless could be very helpful in scenarios where you get stuck and ultimately need to flatten a tuple.

This was what I used as an alternative:


val operatedStructure = structureToOperateOn.transpose

which resulted in:


List(List("a1", "b1", "c1", 10), List("a2", "b2", "c2", 1), List("a3", "b3", "c3", 11))

So to conclude, you can use Shapeless in order to flatten complex nested tuples if need be.

 

 


Liftweb: Implement cache


In this blog post, I will explain how to implement a cache on the server.
The Liftweb framework provides a nice way to implement a cache that stores data (objects) on the server so that all users can access that data. Lift uses an LRU cache wrapping org.apache.commons.collections.map.LRUMap.

Create an object for handling cache operations such as creating, getting, updating and deleting data in the in-memory cache.
LRUinMemoryCache.scala

import net.liftweb.util.{ LRU, Props }
import net.liftweb.common._

/**
 * LRU Cache wrapping org.apache.commons.collections.map.LRUMap
 */

object LRUinMemoryCache extends LRUinMemoryCache

class LRUinMemoryCache extends LRUCache[String] with Loggable {

  def size: Int = 10

  def loadFactor: Box[Float] = Empty

  /**
   * Here we set the data in the in-memory cache
   */
  def init: Unit = {
    set("inMemoryData", "here you can put whatever you want")
    logger.info("cache created")
  }
}

// size - the maximum number of elements allowed in the LRU map
trait LRUCache[V] extends Loggable {

  def size: Int

  def loadFactor: Box[Float]

  private val cache: LRU[String, V] = new LRU(size, loadFactor)

  def get(key: String): Box[V] =
    cache.synchronized {
      cache.get(key)
    }

  def set(key: String, data: V): V = cache.synchronized {
    cache(key) = data
    data
  }

  def update(key: String, data: V): V = cache.synchronized {
    cache.update(key, data)
    data
  }

  def has(key: String): Boolean = cache.synchronized {
    cache.contains(key)
  }

  def delete(key: String) = cache.synchronized(cache.remove(key))

}

Create and store the data in the in-memory cache at deployment time:
Boot.scala

package bootstrap.liftweb

import code.lib.LRUinMemoryCache

/**
 * A class that's instantiated early and run.  It allows the application
 * to modify lift's environment
 */
class Boot {
  def boot {
    //Init the in-memory cache
    LRUinMemoryCache.init
  }
}

Now we can access the in-memory cache:


//Get data from Cache:
LRUinMemoryCache.get("inMemoryData")

//Update in-memory cache
LRUinMemoryCache.update("inMemoryData","Updated data has been set")

//Remove data from in-memory cache
LRUinMemoryCache.delete("inMemoryData")
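Since get returns a Box[V] rather than the raw value, it is worth handling the missing-key case explicitly. A small sketch, assuming Full from net.liftweb.common is in scope:

//Handle the Box returned by get
LRUinMemoryCache.get("inMemoryData") match {
  case Full(data) => println("Cached value: " + data)
  case _ => println("No value cached under this key")
}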


Knolx Session : A Brief Introduction on LESS


In this Knolx session, I tried to explain various concepts of LESS (a CSS pre-processor). Here is the presentation:


Remote profiling using SSH Port Forwarding (SSH Tunneling) on Linux


In this blog post I’ll lay out a few steps that are needed for remote profiling over SSH port forwarding (SSH tunneling) with the YourKit profiler.
 
Steps to be followed on remote machine:
1) Download the YourKit profiler from the official YourKit website.
2) Extract the downloaded file anywhere.
3) What we need to do now is find the file named libyjpagent.so in the extracted folder corresponding to the system architecture of your remote machine. In my case it is located in yjp-2013-build-13086/bin/linux-x86-64.
4) Copy the libyjpagent.so file to any convenient location (if required); I copied it to the /tmp/yjp folder.
5) Add the following VM option to the command line of your Java application:

-agentpath:/tmp/yjp/libyjpagent.so=port=7878

Here 7878 is the port that you will port forward to.
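For example, if the application is started with a plain java command, the agent option goes directly on that command line (the jar name below is only a placeholder):

java -agentpath:/tmp/yjp/libyjpagent.so=port=7878 -jar your-application.jar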
6) Your remote machine is all set for profiling.
 
Steps to be followed on local machine:
1) Port forward to remote machine by executing the following command on terminal:

ssh -N -f 1.2.3.4 -L 8085:1.2.3.4:7878

Explanation of what is going on in the above command:
-L 8085:1.2.3.4:7878 forwards local port 8085 to port 7878 on 1.2.3.4.
-N tells ssh not to execute a remote command, and -f sends ssh to the background.
 
2) Download the YourKit profiler from the official YourKit website on your local machine as well.
3) Extract the downloaded file anywhere.
4) From the terminal go to the following folder:
yjp-2013-build-13086/bin
and start the YourKit profiler using sh yjp.sh.
5) Now from the YourKit profiler choose the “Connect to remote application” option and enter localhost:8085 in the pop-up that asks for the link to your remote application.
6) YourKit will now start profiling your Java application on the remote machine.
 
 
 

Scala in Business | Knoldus Newsletter – July 2014


We are back again with the July 2014 newsletter. Here it is: Scala in Business | Knoldus Newsletter – July 2014.

In this newsletter, you will find how industries are adopting the Typesafe Reactive Platform to scale their applications and the benefits they are getting from it.

So, if you haven’t subscribed to the newsletter yet, hurry up and click on Subscribe Monthly Scala Newsletter.



Knolx Session: Gatling – Stress Test Tool



Play with Spark: Building Spark MLLib in a Play Spark Application


In our last post of the Play with Spark! series, we saw how to integrate Spark SQL into a Play Scala application. Now in this blog we will see how to add the Spark MLLib feature to a Play Scala application.

Spark MLLib is a new component under active development. It was first released with Spark 0.8.0. It contains some common machine learning algorithms and utilities, including classification, regression, clustering, collaborative filtering and dimensionality reduction, as well as some optimization primitives. For a detailed list of available algorithms, click here.

To add the Spark MLLib feature to a Play Scala application, follow these steps:

1). Add the following dependencies in the build.sbt file:

libraryDependencies ++= Seq(
"org.apache.spark"  %% "spark-core"              % "1.0.1",
"org.apache.spark"  %% "spark-mllib"             % "1.0.1"
)

The dependency – “org.apache.spark”  %% “spark-mllib” % “1.0.1” is specific to Spark MLLib.

As you can see, we have upgraded to Spark 1.0.1 (the latest release of Apache Spark at the time of writing).

2). Create a file app/utils/SparkMLLibUtility.scala & add the following code to it:

package utils

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.classification.NaiveBayes

object SparkMLLibUtility {

  def SparkMLLibExample {

    val conf = new SparkConf(false) // skip loading external settings
      .setMaster("local[4]") // run locally with enough threads
      .setAppName("firstSparkApp")
      .set("spark.logConf", "true")
      .set("spark.driver.host", "localhost")
    val sc = new SparkContext(conf)

    val data = sc.textFile("public/data/sample_naive_bayes_data.txt") // Sample dataset
    val parsedData = data.map { line =>
      val parts = line.split(',')
      LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    }

    // Split data into training (60%) and test (40%).
    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0)
    val test = splits(1)

    // Train a Naive Bayes model on the training set
    val model = NaiveBayes.train(training, lambda = 1.0)
    val prediction = model.predict(test.map(_.features))

    // Compare predictions against the actual labels to compute accuracy
    val predictionAndLabel = prediction.zip(test.map(_.label))
    val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
    println("Accuracy = " + accuracy * 100 + "%")
  }
}

In the above code we have used the Naive Bayes algorithm as an example.

3). In the above code you can notice that we have parsed the data into Spark's Vectors object.

val parsedData = data.map { line =>
 val parts = line.split(',')
 LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
 }

The reason for using Spark's Vectors object instead of Scala's Vector class is that Spark's Vectors object provides both dense and sparse factory methods, for parsing both dense and sparse data. This allows us to represent the data according to its properties.
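As a quick illustration of the two factory methods, here is a standalone sketch using the same Vectors import as in the code above:

import org.apache.spark.mllib.linalg.Vectors

// Dense vector: every value is stored explicitly
val dense = Vectors.dense(1.0, 0.0, 3.0)

// Sparse vector: only the non-zero entries are stored (size, indices, values)
val sparse = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))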

4). Next we observe that we have split the data into two parts: 60% for training & 40% for testing.

// Split data into training (60%) and test (40%).
val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
val training = splits(0)
val test = splits(1)

5). Then we trained our model using the Naive Bayes algorithm & the training data.

val model = NaiveBayes.train(training, lambda = 1.0)

6). At last we used our model to predict the labels/classes of the test data.

 val prediction = model.predict(test.map(_.features))
 val predictionAndLabel = prediction.zip(test.map(_.label))
 val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
 println("Accuracy = " + accuracy * 100 + "%")

Then, to find out how good our model is, we calculated the accuracy of the predicted labels.
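To trigger this code from the Play application itself, the utility can be called from a controller action. A minimal sketch (the controller name and action are hypothetical; the demo application may wire things up differently):

package controllers

import play.api.mvc._
import utils.SparkMLLibUtility

object SparkMLLibController extends Controller {

  // Hypothetical action that runs the Naive Bayes example; the accuracy is printed to the server log
  def classify = Action {
    SparkMLLibUtility.SparkMLLibExample
    Ok("Spark MLLib example finished, check the server log for the accuracy")
  }
}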

So we see how easy it is to use any algorithm available in Spark MLLib to perform predictive analytics on data. For more information on Spark MLLib, click here.

To download a Demo Application click here.


Play with Spark: Building Spark SQL in a Play Spark Application


In our last post of the Play with Spark! series, we saw how to integrate Spark Streaming into a Play Scala application. Now in this blog we will see how to add the Spark SQL feature to a Play Scala application.

Spark SQL is a powerful tool of Apache Spark. It allows relational queries, expressed in SQL, HiveQL or Scala, to be executed using Spark. Apache Spark has a new type of RDD to support queries expressed in SQL format: the SchemaRDD. A SchemaRDD is similar to a table in a traditional relational database.

To add the Spark SQL feature to a Play Scala application, follow these steps:

1). Add the following dependencies in the build.sbt file:

libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.0.0",
"org.apache.spark" %% "spark-sql"  % "1.0.0"
)

The dependency – “org.apache.spark”  %% “spark-sql” % “1.0.0” is specific to Spark SQL.

2). Create a file app/utils/SparkSQL.scala & add the following code to it:

package utils

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext._
import org.apache.spark.sql.SQLContext

case class WordCount(word: String, count: Int)

object SparkSQL {

  def simpleSparkSQLApp {
    val logFile = "public/README.md" // Should be some file on your system
    val driverHost = "localhost"
    val conf = new SparkConf(false) // skip loading external settings
      .setMaster("local[4]") // run locally with enough threads
      .setAppName("firstSparkApp")
      .set("spark.logConf", "true")
      .set("spark.driver.host", s"$driverHost")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 4).cache()
    val words = logData.flatMap(_.split(" "))

    // Spark SQL runs on its own context, built on top of the SparkContext
    val sqlContext = new SQLContext(sc)

    import sqlContext._

    // Map each (word, count) pair to the WordCount case class and register it as a table
    val wordCount = words.map(word => (word, 1)).reduceByKey(_ + _).map(wc => WordCount(wc._1, wc._2))
    wordCount.registerAsTable("wordCount")

    // Language-integrated query: words that occur more than 10 times
    val moreThanTenCount = wordCount.where('count > 10).select('word)

    println("Words occurring more than 10 times are : ")
    moreThanTenCount.map(mttc => "Word : " + mttc(0)).collect().foreach(println)
  }

}

Like any other Spark component, Spark SQL runs on its own context, the SQLContext, which runs on top of the SparkContext. So first we built the sqlContext, so that we can use Spark SQL.

3). In the above code you can notice that we have defined a case class WordCount.

case class WordCount(word: String, count: Int)

This case class defines the schema of the table in which we are going to store the data in SQL format.

4). Next we observe that we have mapped the variable wordCount to the case class WordCount.

val wordCount = words.map(word => (word,1)).reduceByKey(_+_).map(wc => WordCount(wc._1, wc._2))
wordCount.registerAsTable("wordCount")

Here we are converting wordCount from an RDD to a SchemaRDD. Then we register it as a table so that we can construct SQL queries to fetch data from it.

5). At last we notice that we have constructed a SQL query in Scala:

val moreThanTenCount = wordCount.where('count > 10).select('word)

Here we are fetching the words which occur more than 10 times in our text file. We have used the Language-Integrated Relational Queries of Spark SQL, which are available only in Scala. To know about other types of SQL queries supported by Spark SQL, click here.
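For comparison, the same query can also be written as plain SQL against the registered wordCount table. A small sketch reusing the sqlContext from the code above:

// Plain-SQL version of the language-integrated query shown above
val moreThanTenSql = sqlContext.sql("SELECT word FROM wordCount WHERE count > 10")
moreThanTenSql.map(row => "Word : " + row(0)).collect().foreach(println)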

To download a Demo Application click here.
