Introduction to Chef

Hi all,

Knoldus organized a one-hour session on 10th Feb 2016 at 5:00 PM. The topic was Introduction to Chef. Many people joined and enjoyed the session. I am sharing the slides here. Please let me know if you have any questions about the linked slides.

Posted in Scala

SASS is often preferred as a critically important stylesheet language for styling webpages

SASS is a style sheet language that is interpreted into Cascading Style Sheets (CSS). SassScript is the scripting language itself. SASS offers what its name stands for, “Syntactically Awesome Stylesheets”. It was designed by Hampton Catlin and developed by Natalie Weizenbaum. SASS is compatible with all versions of CSS and can be used with any CSS library. SASS acts as syntactic sugar over CSS and translates its script to CSS at compile time. SASS defines two types of syntax,

  • Indented Syntax or key syntax (SASS)
  • CSS Styling Syntax or rules definition syntax (SCSS)

Indented Syntax or key syntax (SASS)

The indented syntax uses indentation to delimit its rule blocks and newline characters to separate the statements within them. Files in this syntax use the .sass extension.

CSS Styling Syntax or rules definition syntax (SCSS)

The SCSS syntax looks like ordinary CSS: a set of styling properties with the values associated with them. It uses semicolons to separate the statements within a block and curly braces to delimit the blocks. Files in this syntax use the .scss extension.

Numerous datatypes by SASS

SASS supports four basic kinds of datatypes,

  • Numbers, which can include units as well
  • Strings, with or without quotes
  • Colors, given by name or by code
  • Booleans

Numerous features by SASS

As SASS functions as a scripting language, it provides the following features,

Variables

When a piece of information needs to be reused across the stylesheet, we store it in a variable. In SASS, variables are created with the $ symbol.

$errorMessage: red
$errorFont: Helvetica, sans-serif

body
	font: 50% $errorFont
	color: $errorMessage

The code above is written in the SASS (indented) syntax.

$errorMessage: red;
$errorFont: Helvetica, sans-serif;

body {
	font: 50% $errorFont;
	color: $errorMessage;
}

The code above is the same rule written in SCSS.


Nesting

Styling is applied to HTML elements, and HTML elements and tags follow a clear hierarchy of nested blocks. The CSS that styles them, on the other hand, does not follow this hierarchy. SASS offers nested selectors that give the stylesheet the same visual hierarchy as the HTML.

In SASS we can write a hierarchy that styles the rows and cells of a table as,

table
	tr
		font: 50% $headingFont
		color: $headingMessage
	td
		font: 50% $cellFont
		color: $cellMessage

While with flat CSS-style selectors we are required to write the same as,

table tr {
	font: 50% $headingFont;
	color: $headingMessage;
}
table td {
	font: 50% $cellFont;
	color: $cellMessage;
}

Comparing the two code blocks, the SASS version follows the same hierarchy as the markup.

By using these features of SASS, we can keep our styling code well structured and properly formatted.
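To make the nesting idea concrete, here is a small sketch in plain Python (not Sass itself) of how a nested rule tree expands into flat selectors such as table tr and table td. The flatten helper and the dictionary layout are made up for this illustration; a real Sass compiler does far more than this.

```python
# Hypothetical helper for illustration only: expands a nested rule tree
# into flat (selector, declarations) pairs, the way nesting is unrolled.
def flatten(rules, parent=""):
    flat = []
    for selector, body in rules.items():
        full = f"{parent} {selector}".strip()
        # String values are declarations; dict values are nested rules.
        declarations = {k: v for k, v in body.items() if not isinstance(v, dict)}
        nested = {k: v for k, v in body.items() if isinstance(v, dict)}
        if declarations:
            flat.append((full, declarations))
        flat.extend(flatten(nested, full))
    return flat


nested_rules = {
    "table": {
        "tr": {"font": "50% $headingFont", "color": "$headingMessage"},
        "td": {"font": "50% $cellFont", "color": "$cellMessage"},
    }
}

flat_rules = flatten(nested_rules)
for selector, declarations in flat_rules:
    print(selector, declarations)
```

Each nested block contributes its own name to the selector of everything inside it, which is why the nested form needs the word table only once.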

Happy Blogging !!!

Posted in HTML, Scala

A sample ML Pipeline for Clustering in Spark

Often a machine learning task contains several steps such as extracting features out of raw data, creating learning models to train on features and running predictions on trained models, etc.  With the help of the pipeline API provided by Spark, it is easier to combine and tune multiple ML algorithms into a single workflow.

What is in the blog?

We will create a sample ML pipeline to extract features out of raw data and apply K-Means Clustering algorithm to group data points.

The code examples used in the blog can be executed on spark-shell running Spark 1.3 or higher.

Basics of Spark ML pipeline API


DataFrame

A DataFrame is the Spark SQL datatype that is used as the dataset in an ML pipeline. A DataFrame stores structured data in named columns. It can be created from structured data files, Hive tables, external databases, or existing RDDs.

Transformer

A Transformer converts a DataFrame into another DataFrame with one or more features added to it. For example, OneHotEncoder transforms a column with a label index into a column of vectored features. Every Transformer has a method transform() which is called to transform one DataFrame into another.

Estimator

An Estimator is a learning algorithm which learns from the training data. An Estimator accepts a dataset to be trained on and produces a model, which is a Transformer. For example, K-Means is an Estimator which accepts a training DataFrame and produces a K-Means model, which is a Transformer. Every Estimator has a method fit() which is invoked to learn from the data.

Pipeline

Each Pipeline consists of an array of stages, where each stage is either an Estimator or a Transformer operating on DataFrames.
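The relationship between these pieces can be sketched in a few lines of plain Python. This is only a stand-in to show the fit()/transform() contract, not the Spark API: the class names AddLength, MeanCenter, ToyPipeline and the list-of-dicts "DataFrame" are all invented for the illustration.

```python
class AddLength:
    """Toy Transformer: adds a 'length' column derived from 'text'."""
    def transform(self, rows):
        return [{**row, "length": len(row["text"])} for row in rows]


class MeanCenterModel:
    """Toy model produced by MeanCenter; it is itself a Transformer."""
    def __init__(self, mean):
        self.mean = mean

    def transform(self, rows):
        return [{**row, "centered": row["length"] - self.mean} for row in rows]


class MeanCenter:
    """Toy Estimator: learns the mean of 'length' and returns a model."""
    def fit(self, rows):
        mean = sum(row["length"] for row in rows) / len(rows)
        return MeanCenterModel(mean)


class ToyPipelineModel:
    """Fitted pipeline: every stage is now a Transformer."""
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class ToyPipeline:
    """Runs stages in order; Estimators are fitted, then used to transform."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)  # Estimator -> model (a Transformer)
            rows = stage.transform(rows)  # output feeds the next stage
            fitted.append(stage)
        return ToyPipelineModel(fitted)


training = [{"text": "ab"}, {"text": "abcd"}]
model = ToyPipeline([AddLength(), MeanCenter()]).fit(training)
result = model.transform(training)
print(result)
```

Fitting the pipeline returns a model in which each Estimator has been replaced by the Transformer it produced, so the fitted pipeline can be applied to new data with a single transform() call.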

Input Dataset for Pipeline

We will use the following sample Dataframe as our input data. Each row in the Dataframe represents a customer with attributes: email, income and gender.

val input = sqlContext.createDataFrame(Seq(
 ("", 12000,"M"),
 ("", 43000,"M"),
 ("", 5000,"F"),
 ("", 60000,"M")
)).toDF("email", "income","gender")

The aim is to cluster this dataset into similar groups using the K-Means clustering algorithm available in Spark MLlib. The sequence of tasks involves:

  1. Converting categorical attribute labels into label indexes
  2. Converting categorical label indexes into numerical vectors
  3. Combining all numerical features into a single feature vector
  4. Fitting a K-Means model on the extracted features
  5. Predicting using K-Means model to get clusters for each data row
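Before looking at the Spark code, the first three steps can be walked through by hand in plain Python. This is a rough sketch, assuming the label index is assigned by descending label frequency (as StringIndexer does) and that the last category is dropped from the one-hot vector (the OneHotEncoder default); the helper one_hot is made up for the example, and the email column is left out because it is not used as a feature.

```python
from collections import Counter

rows = [
    {"income": 12000, "gender": "M"},
    {"income": 43000, "gender": "M"},
    {"income": 5000, "gender": "F"},
    {"income": 60000, "gender": "M"},
]

# Step 1: label -> index, most frequent label first.
counts = Counter(row["gender"] for row in rows)
label_index = {label: i for i, (label, _) in enumerate(counts.most_common())}


# Step 2: index -> one-hot vector (assumption: last category dropped).
def one_hot(index, size, drop_last=True):
    width = size - 1 if drop_last else size
    return [1.0 if i == index else 0.0 for i in range(width)]


# Step 3: combine all numerical features into a single feature vector.
features = [
    [float(row["income"])] + one_hot(label_index[row["gender"]], len(label_index))
    for row in rows
]

print(label_index)
print(features)
```

Steps 4 and 5 (fitting K-Means and predicting) are left to Spark, since they are where the actual learning happens.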

Creating Pipeline of Tasks:

The following code creates a Pipeline with StringIndexer, OneHotEncoder, VectorAssembler and KMeans as a sequence of stages to accomplish the above mentioned tasks.

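import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer, VectorAssembler}

// Stages 1-3: index the gender label, one-hot encode it, and assemble the feature vector
val indexer = new StringIndexer().setInputCol("gender").setOutputCol("genderIndex")
val encoder = new OneHotEncoder().setInputCol("genderIndex").setOutputCol("genderVec")
val assembler = new VectorAssembler().setInputCols(Array("income","genderVec")).setOutputCol("features")
// Stage 4: K-Means estimator with two clusters
val kmeans = new KMeans().setK(2).setFeaturesCol("features").setPredictionCol("prediction")

val pipeline = new Pipeline().setStages(Array(indexer, encoder, assembler, kmeans))

// Fit the whole pipeline on the input, then use the fitted model to predict clusters
val kMeansPredictionModel = pipeline.fit(input)

val predictionResult = kMeansPredictionModel.transform(input)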


Pipeline stages and Output:

The above code produces the following pipeline stages. At each stage, the DataFrame is transformed and becomes the input to the next stage:

[Figure: Pipeline stages]

Posted in apache spark, big data, Scala, Spark

Saving Spark DataFrames on Amazon S3 got Easier !!!

In our previous blog post, Congregating Spark Files on S3, we explained how to upload files (saved in a Spark cluster) to Amazon S3. I agree that the method explained in that post was a little complex and hard to apply. It also adds a lot of boilerplate to our code.

So, we started working on simplifying it and finding an easier way to provide a wrapper around Spark DataFrames that would help us save them on S3. The solution we found to this problem was a Spark package: spark-s3. It makes saving Spark DataFrames on S3 a piece of cake, which we can see from the code below:


The code itself shows that we no longer have to put any extra effort into saving Spark DataFrames on Amazon S3. All we need to do is include spark-s3 in our project dependencies and we are done.

Right now spark-s3 supports only Scala & Java APIs, but we are working on providing support for Python and R too. So, stay tuned !!!

To know more about it, please read its documentation on GitHub.

Posted in Amazon, Scala, Spark

Essentials and future of web development, HAML, Jade, Emmet & Slim.

Earlier, in my previous blog post, we went through why Haml is taking the lead over plain HTML syntax, but some geeks (especially in the LinkedIn group “HTML5 / CSS3 / Javascript”) shared their thoughts and compared other engines. Some of those engines I was aware of and some were new to me too, so in this post I have tried to cover the major ones with some practical guidance. Hopefully this gives you some insights.

Mentioning some techies whom I have tried to answer in this post:

Michael Mikowski, Dejan Ristic, Patrick de Bie, Rémi Rémino, Jeffrey Gochin, Joshua Barker, Felix Deimling, Alan Reid, Ilton Sequeira, Marcos Méndez Filesi, Luca Cavallin, Nicolai Moraru, Koen Cornelis, Nejc Vukovic, Jory Braun, Eddie Ebeling, Vladimir Parchev, Dong Zhu, Riccardo Ratini, Martin Rios Reynoso, Gabriel Meola, Kartik Jagdale.

In the previous post we saw the syntax and usage of HAML, so here is a short introduction to installing Haml and compiling it in practice.


Haml requires Ruby to be compiled to HTML, so the first step is to ensure that Ruby is installed. Once Ruby is confirmed to be installed, run the gem install haml command from the command line, using Terminal or a similar command line program, to install Haml.

gem install haml

Files written in the Haml markup should be saved with the file extension of .haml. To convert these files from Haml to HTML the haml command below needs to be run to compile each individual file.

haml index.haml index.html

Above, the file index.haml is converted to HTML and saved as index.html within the same directory. This command has to be run within the directory the files reside in. Should the command be run outside this directory, the path where the files reside needs to be included in the command. At any time the command haml --help may be run to see a list of the available options.

Watching a File or Directory

Haml doesn’t provide a way to watch a file or directory for changes, but we can do so by using another dependency.

Inside of a Rails application a Haml dependency may be added in the Gemfile, thus automatically compiling Haml files to HTML upon any changes. There are a few desktop applications available for those not using Rails, one of the more popular being CodeKit.

Something about SLIM:

Like Haml and Jade, Slim also allows you to write very minimal templates which are easy to maintain and which help ensure that you produce well-formed HTML and XML.

Slim’s main selling point is speed, and that differentiates it from other engines.

Slim is a fast, lightweight templating engine with support for Rails 3 and 4.

How to start?

Install Slim as a gem:

gem install slim

Include Slim in your Gemfile with gem 'slim' or require it with require 'slim'. That’s all! Just use the .slim extension and it’s good to go.

Haml & Slim

I got to know how Slim compares favourably with Haml:

Slim is reported to be about 8 times faster than Haml, and Slim supports HTTP streaming.

These points matter when building a full-scale web application, but much less for basic applications.

HTML is perfect if you just need a simple website. But if you’re getting into more complicated projects a templating language is great – especially if you already use SCSS or LESS.

All templating languages require time so expect to set aside practice sessions. Ultimately these are all great choices and your decision should be based on which processor you feel most comfortable using. If you like Ruby then Haml is a great choice. If you like Node.js then Jade is also fantastic.

Analysis of Slim vs. Haml Project Health

Analyze here


An alternative to writing Haml/Jade is to use an Emmet plugin for your text editor, which takes a syntax similar to Haml/Jade and expands it into HTML, which you then save. If you’re writing a lot of HTML but don’t want to learn Haml/Jade, Emmet could be a good option.

Emmet support for ATOM

You can install the latest Emmet version manually from console:

cd ~/.atom/packages
git clone
cd emmet-atom
npm install

Learn Official emmet usage : here


Jade – robust, elegant, feature rich template engine for Node.js

With Jade, you can create config files, “blocks” of code, and inheritance between files using “include”, and, most interesting of all, use JavaScript code seamlessly in HTML pages.

Jade is designed primarily for server side templating in node.js, however it can be used in many other environments. It is only intended to produce XML like documents (HTML, RSS etc.) so don’t use it to create plain text/markdown/CSS/whatever documents.


Go through the Jade Syntax Documentation (30 min). here

Setting up jade in the project: Standalone Jade Usage


I wrote “and” between them; I don’t know why, but I didn’t want to write “vs” in between.

Jade is a high performance template engine heavily influenced by Haml and implemented with JavaScript for node and browsers.




Programming language

  • JavaScript/Node.js
  • Ruby

Scripting language support

  • JavaScript
  • php
  • Scala
  • Java
  • NodeJS
  • Java (any JVM scripting language)
  • Python
  • CoffeeScript
  • JavaScript
  • Ruby
  • php

In this small post it is not possible to cover all the engines in detail, but it gives an introduction to each and how to use them. Your comments and suggestions are welcome.

Hope you liked it!!

Posted in emmet, HAML, HTML, Jade, Scala, slim, Web, Web Designing

How to tokenize your search by N-Grams using Elastic Search in Scala?

N-Grams can be used to search big data with compound words. The German language is famous for combining several small words into one massive compound word in order to capture precise or complex meanings.

N-Grams are the fragments into which a word is broken: the more fragments of a query that are found in the data, the stronger the match. The length of the fragments is set by min_gram and max_gram; a trigram (length of 3) is a good length to start with.
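As a quick illustration of how a word is broken into fragments, here is a minimal sketch in plain Python. The ngrams function is a made-up helper, not part of Elasticsearch, and lowercasing the input is an assumption made so that differently cased words produce the same fragments.

```python
def ngrams(word, min_gram=3, max_gram=3):
    """Return every fragment of the word whose length is between min_gram and max_gram."""
    word = word.lower()  # assumption: fragments are compared case-insensitively
    return [
        word[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(word) - n + 1)
    ]


print(ngrams("Adler"))
print(ngrams("Adler", min_gram=2, max_gram=3))
print(ngrams("er"))
```

A word shorter than min_gram produces no fragments at all, which becomes important when searching with very short terms.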

For example, we have the following words,

Aussprachewörterbuch
	Meaning : Pronunciation dictionary
Weltgesundheitsorganisation
	Meaning : World Health Organization
Weißkopfseeadler
	Meaning : White-headed sea eagle, or bald eagle

Setup the Index

Now we create the index with the command,

curl -XPUT 'localhost:9200/dictionary' -d '{
    "settings": {
        "analysis": {
            "filter": {
                "trigrams_filter": {
                    "type":     "ngram",
                    "min_gram": 3,
                    "max_gram": 3
                }
            },
            "analyzer": {
                "trigrams": {
                    "type":      "custom",
                    "tokenizer": "standard",
                    "filter":   [
                        "lowercase",
                        "trigrams_filter"
                    ]
                }
            }
        }
    },
    "mappings": {
        "germanDictionary": {
            "properties": {
                "text": {
                    "type":     "string",
                    "analyzer": "trigrams"
                }
            }
        }
    }
}'

Insert bulk of Documents

curl -XPOST 'localhost:9200/dictionary/germanDictionary/_bulk?pretty' -d '
{ "index":{"_id":"1"} }
{ "text": "Aussprachewörterbuch" }
{ "index":{"_id":"2"} }
{ "text": "Weltgesundheitsorganisation" }
{ "index":{"_id":"3"} }
{ "text": "Weißkopfseeadler" }'

Now the Index dictionary is created with type germanDictionary and contains three documents as created in bulk.

Applying search with N-Grams

If we search for Wörterbuch, we should get Aussprachewörterbuch as the result, with the command,

curl -XGET 'http://localhost:9200/dictionary/germanDictionary/_search?q=text:Wörterbuch'

Elastic Search will return the following response with the source that matched with token wörterbuch,


Now if we search for Adler, we should get Weißkopfseeadler as the result, with the command,

curl -XGET 'http://localhost:9200/dictionary/germanDictionary/_search?q=text:Adler'

Elastic Search will return the following response with the source that matched with token Adler,


Finally, if we search for er, we should not get any successful result, with the command,

curl -XGET 'http://localhost:9200/dictionary/germanDictionary/_search?q=text:er'

This time Elastic Search will return the following response,


Comparing this result with the results of the two search queries above,

  • The total number of documents found is zero
  • Maximum score(max_score) is null
  • Hits is a blank array

Why did Elastic Search return this response?

The answer is because of this filter,

    "trigrams_filter": {
        "type":     "ngram",
        "min_gram": 3,
        "max_gram": 3
    }

Here a search token must be at least 3 characters long to produce any n-gram, so a shorter query such as er yields no tokens and matches nothing.
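The behaviour of the three searches can be imitated with a toy in-memory index in plain Python. This is only a sketch of the idea, not how Elasticsearch is implemented: the trigrams and search helpers are invented here, text is lowercased by assumption, and a document counts as a hit as soon as it shares at least one trigram with the query.

```python
def trigrams(text):
    """Set of all 3-character fragments of the lowercased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


documents = {
    "1": "Aussprachewörterbuch",
    "2": "Weltgesundheitsorganisation",
    "3": "Weißkopfseeadler",
}

# Index time: store the trigrams of every document.
index = {doc_id: trigrams(text) for doc_id, text in documents.items()}


def search(query):
    """Return documents that share at least one trigram with the query."""
    query_tokens = trigrams(query)
    return [documents[doc_id] for doc_id, tokens in index.items() if query_tokens & tokens]


print(search("Wörterbuch"))
print(search("Adler"))
print(search("er"))
```

The query er is only two characters long, so it produces an empty set of trigrams and nothing can overlap with it, which mirrors the empty hits array.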

This is how we can use n-gram for token based searching.

Happy Blogging !!

Posted in Elasticsearch, Scala

Customized Response Time in Gatling Report

Gatling is a highly capable, high-performance load testing tool. Sometimes, however, we need to customize the response time ranges it reports for our application.

In this blog we will see how to customize the response time ranges shown in the Global Information section of a Gatling report.


In graphical format, this section displays the number of requests captured in the given load test. The requests are grouped according to their response times.

These requests are divided into four different sections:

  • In the first section (green), the requests whose response times are less than the lower bound
  • In the second section (yellow), the requests whose response times are between the lower bound and the higher bound
  • In the third section (orange), the requests whose response times are greater than the higher bound
  • In the fourth section (red), the requests that failed
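The grouping above can be sketched in a few lines of plain Python. This is not Gatling code; the bucket function is a made-up helper, and placing a response time that is exactly equal to a bound in the middle group is an assumption of this sketch.

```python
from collections import Counter


def bucket(response_ms, succeeded, lower_bound=800, higher_bound=1200):
    """Assign one request to one of the four report sections."""
    if not succeeded:
        return "failed"
    if response_ms < lower_bound:
        return "below lower bound"
    if response_ms <= higher_bound:  # assumption: boundary values land here
        return "between bounds"
    return "above higher bound"


# (response time in ms, request succeeded?)
samples = [(350, True), (950, True), (1500, True), (700, False)]
summary = Counter(bucket(ms, ok) for ms, ok in samples)
print(summary)

# The same request moves to another section when the bounds change.
print(bucket(950, True), "->", bucket(950, True, lower_bound=1000, higher_bound=1800))
```

Changing the bounds does not change any measured response time; it only changes which section each request is counted in.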

In a Gatling test report, the default response time bounds are:

  • Lower bound is 800 ms
  • Higher bound is 1200 ms

These values are editable; we can modify them depending on our project requirements.

To do this:

  • Go to the directory where the Gatling bundle was downloaded
  • Inside the Gatling bundle directory, look for the conf folder
  • Find the file named gatling.conf and open it in a text editor

The charting section of the .conf file looks like this:

charting {
  #noReports = false # When set to true, don't generate HTML reports
  #maxPlotPerSeries = 1000 # Number of points per graph in Gatling reports
  #accuracy = 10 # Accuracy, in milliseconds, of the report's stats
  indicators {
    #lowerBound = 800 # Lower bound for the requests' response time to track in the reports and the console summary
    #higherBound = 1200 # Higher bound for the requests' response time to track in the reports and the console summary
    #percentile1 = 50 # Value for the 1st percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile2 = 75 # Value for the 2nd percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile3 = 95 # Value for the 3rd percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile4 = 99 # Value for the 4th percentile to track in the reports, the console summary and GraphiteDataWriter
  }
}

The above code segment lists the default values that come with the Gatling bundle. Inside it, look for lowerBound and higherBound. These two are set to 800 and 1200 respectively. The unit is milliseconds.

Suppose the project requires these two to be set to 1000 and 1800 respectively. Modify these two values, uncomment the two lines by removing the leading #, and save the .conf file.

Like this:

charting {
  #noReports = false # When set to true, don't generate HTML reports
  #maxPlotPerSeries = 1000 # Number of points per graph in Gatling reports
  #accuracy = 10 # Accuracy, in milliseconds, of the report's stats
  indicators {
    lowerBound = 1000 # Lower bound for the requests' response time to track in the reports and the console summary
    higherBound = 1800 # Higher bound for the requests' response time to track in the reports and the console summary
    #percentile1 = 50 # Value for the 1st percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile2 = 75 # Value for the 2nd percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile3 = 95 # Value for the 3rd percentile to track in the reports, the console summary and GraphiteDataWriter
    #percentile4 = 99 # Value for the 4th percentile to track in the reports, the console summary and GraphiteDataWriter
  }
}

After the changes are saved, run the test case and observe that the response time ranges are updated in the chart and also in the information section.
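If the same edit has to be repeated on several machines, it could be scripted. The following is a rough sketch in plain Python that works on the text of the file; set_indicator is a hypothetical helper written for this post, and it assumes each setting sits on its own line in the key = number form shown above.

```python
import re


def set_indicator(conf_text, key, value):
    """Uncomment `key` (if commented) and set it to `value`, keeping the trailing comment."""
    pattern = re.compile(rf"^(\s*)#?\s*{re.escape(key)}\s*=\s*\d+", re.MULTILINE)
    return pattern.sub(rf"\g<1>{key} = {value}", conf_text, count=1)


sample = (
    "indicators {\n"
    "  #lowerBound = 800 # Lower bound\n"
    "  #higherBound = 1200 # Higher bound\n"
    "}\n"
)

updated = set_indicator(sample, "lowerBound", 1000)
updated = set_indicator(updated, "higherBound", 1800)
print(updated)
```

To apply it for real, the content of gatling.conf would be read into a string, passed through the function, and written back; keeping a backup of the original file first is a sensible precaution.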

So this is how we can customize the response time ranges according to our project requirements.

Posted in gatling, LoadTesting, Scala, Test, testing, tests

How to build secure Web Application

We all use web applications every day, whether we consciously know it or not. That is, all of us who browse the web. Nowadays we see a significant surge in the number of web application specific vulnerabilities that are disclosed to the public. No web application technology has shown itself invulnerable, and discoveries are made every day that affect both owners’ and users’ security and privacy.

Security professionals have traditionally focused on network and operating system security. Assessment services have relied heavily on automated tools to help find holes in those layers, so we need some guidelines for building secure web applications beyond networking and operating system concepts.

Software is generally created with functionality first in mind and with security as a distant second or third. This is an unfortunate reality in many development shops. Designing a web application is an exercise in designing a system that meets a business need and not an exercise in building a system that is just secure for the sake of it. However, the application design and development stage is the ideal time to determine security needs and build assurance into the application. Prevention is better than cure, after all!

In this blog we will learn some guidelines for building secure web applications. The following high-level security principles are useful as reference points when building a web application.

  • Validate Input and Output

    User input and output to and from the system is the route for malicious payloads into or out of the system. All user input and user output should be checked to ensure it is both appropriate and expected. The correct strategy for dealing with system input and output is to allow only explicitly defined characteristics and drop all other data. For example, if an input field is for a Social Security Number, then any data that is not a string of nine digits is not valid.

  • Fail Securely (Closed)

    Any security mechanism should be designed in such a way that when it fails, it fails closed. That is to say, it should fail to a state that rejects all subsequent security requests rather than allows them. An example would be a user authentication system. If it is not able to process a request to authenticate a user or entity and the process crashes, further authentication requests should be rejected rather than treated as successful.

  • Keep it Simple

    While it is tempting to build elaborate and complex security controls, the reality is that if a security system is too complex for its user base, it will either not be used or users will try to find ways to bypass it. Often the most effective security is the simplest security. Do not expect users to enter 12 passwords and then have the system ask for a random number password, for instance! This message applies equally to tasks that an administrator must perform in order to secure an application. The application developer should make the security mechanism as simple as possible for users and perform the complex security functions behind the scenes.

  • Use and Reuse Trusted Components

    We should use third-party components from trusted sources, in their stable versions, because in many cases their authors will have improved the components through an iterative process and learned from common mistakes along the way. Using and reusing trusted components makes sense both from a resource stance and from a security stance. When someone else has proven they got it right, take advantage of it.

  • Defense in Depth

    Relying on one component to perform its function 100% of the time is unrealistic. While we hope to build software and hardware that works as planned, predicting the unexpected is difficult. Good systems don’t predict the unexpected, but plan for it. If one component fails to catch a security event, a second one should catch it.

  • Only as Secure as the Weakest Link

    We’ve all seen it, “This system is 100% secure, it uses 128bit SSL”. While it may be true that the data in transit from the user’s browser to the web server has appropriate security controls, more often than not the focus of security mechanisms is at the wrong place. As in the real world where there is no point in placing all of one’s locks on one’s front door to leave the back door swinging in its hinges, careful thought must be given to what one is securing. Attackers are lazy and will find the weakest point and attempt to exploit it.

  • Least Privilege

    Systems should be designed in such a way that they run with the least amount of system privilege they need to do their job. This is the “need to know” approach. If a user account doesn’t need root privileges to operate, don’t assign them in the anticipation they may need them. Giving the pool man an unlimited bank account to buy the chemicals for your pool while you’re on vacation is unlikely to be a positive experience.

  • Compartmentalization (Separation of Privileges)

    Similarly, compartmentalizing users, processes and data helps contain problems if they do occur. Compartmentalization is an important concept widely adopted in the information security realm. Imagine the same pool man scenario. Giving the pool man the keys to the house while you are away so he can get to the pool house may not be a wise move. Granting him access only to the pool house limits the types of problems he could cause.

  • Security By Obscurity Won’t Work

    It’s naive to think that hiding things from prying eyes doesn’t buy some amount of time. Let’s face it, some of the biggest exploits unveiled in software have been obscured for years. But obscuring information is very different from protecting it. You are relying on the premise that no one will stumble onto your obfuscation, and hiding something does not make it secure. This strategy doesn’t work in the long term and has no guarantee of working in the short term.
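Two of the principles above, validating input against an explicit allow-list and failing closed, are easy to show in code. The following is a minimal sketch in plain Python under stated assumptions: the nine-digit format comes from the Social Security Number example above, while is_valid_ssn, authenticate and the verify callback are hypothetical names invented for this illustration.

```python
import re

# Allow-list: exactly nine ASCII digits, nothing else (no separators).
SSN_PATTERN = re.compile(r"[0-9]{9}")


def is_valid_ssn(value):
    """Accept only input that matches the explicitly defined format."""
    return isinstance(value, str) and SSN_PATTERN.fullmatch(value) is not None


def authenticate(username, password, verify):
    """Fail closed: grant access only on an explicit True from the verifier."""
    try:
        return verify(username, password) is True
    except Exception:
        # Any failure inside the verifier is treated as a denial.
        return False


def broken_verifier(username, password):
    raise RuntimeError("user store unavailable")


print(is_valid_ssn("123456789"), is_valid_ssn("123-45-6789"))
print(authenticate("alice", "secret", broken_verifier))
```

The important design choice is that both functions default to "no": anything that is not explicitly recognised as valid, or not explicitly confirmed as authenticated, is rejected.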

In the next blog we will look at a security checklist and related security topics.

You can find out more about web application security at OWASP.

Posted in Scala, Security, Security Audit, Security Checklist, Security Controls, Security Guidelines, web application

Web Components, the Next Generation Web Development Markup

A lot of progress has been made since the introduction of Web Components back in 2011. Web Components are a set of several separate technologies. You can think of Web Components as reusable UI (User Interface) widgets that are created using open Web technology. They are part of the browser, and so they do not need external libraries like jQuery, MooTools and Dojo. Web Components are new and still developing.

Since Web Components are still under development, there are a bunch of polyfills that address browser compatibility issues, such as Polymer, webcomponents.js, X-Tag and Bosonic. We are not discussing these great tools here; I recommend you look at them yourself.

Why Use Web Components

Web Components give developers an easier way to create Web/Mobile sites and reusable widgets on these sites with the HTML, CSS and JavaScript they already know. That means you don’t have to learn a new language to use Web Components.

Four Powerful tools of Web Components

1.  Shadow DOM

Shadow DOM refers to the ability of the browser to include a subtree of DOM elements into the rendering of a document, but not into the main document DOM tree. Consider a simple slider:

<input id="foo" type="range"/>

Put this code into any browser, and see the magic:

Shadow DOM sounds like a nice feature, and it is. Normally a web document has only one DOM. With Shadow DOM, think of DOM hosting DOM, which hosts more DOM. You’ll see something like this in the Chrome inspector (note #shadow-root, which is completely encapsulated DOM)

▾#shadow-root (user-agent)
<div class="profile">
<img src="" class="profile-img">
<div class="profile-name"></div>
<div class="profile-social"></div>
</div>

2.  Templates

Templates hold a bunch of clone-able markup that stays inert and can be activated for later use.

3.  Custom Elements

Custom Elements let us create new HTML elements and expand HTML’s existing vocabulary. With Custom Elements we can define our own element. The name can be anything, but it must contain a dash/hyphen to avoid any naming clashes.

<div class="profile">
<img src="" class="profile-img">
<div class="profile-name"></div>
<div class="profile-social"></div>
</div>

4.  HTML Imports

Import a chunk of HTML code with the import link tag. Importing files into our pages comes in many shapes. For CSS, we have @import; for JavaScript in ES6 modules we have import {Module} from './PATH'; and finally, for HTML, we have HTML Imports. We can import HTML components at the top of our document to define which ones we need to use in our app.

<link rel="import" href="your-element.html">

<!-- <your-element> is now available-->
<link rel="import" href="google-map.html">


You can find out more about Web components here.

Posted in AngularJS, Bootstrap, CSS, HTML, JavaScript

(Code Dissection) Akka Quartz Scheduler Scala’s way of scheduling(Part -2)

I hope you guys are doing well and have caught your breath. Put your mask on again if you found the previous topic smelly, as we are going to finish up the dissection of Akka Quartz Scheduler. I am going to refer to everything from the Quartz Scheduler library with the prefix java. So when I say Java-Quartz-Scheduler, I mean the Quartz library that is written in Java and wrapped by the Akka Quartz Scheduler. One more thing to notice is that QuartzSchedules and QuartzSchedule are different.

Alright, so let’s start then. First let’s revisit some of the parts of Akka-quartz which will make the dissection easier. If you check the repo, you will find that the class we interact with is the QuartzSchedulerExtension. First we initialize it and then we schedule with it. When we initialize it, it sets up some (immutable) state for itself; then the schedule function is called. So let’s revisit these two parts. It initializes like this:

1. config – gets the config from the configuration file for the key “akka.quartz”
2. defaultConfig – hard coded strings parsed for configuration
3. threadCount – reads from the config(point 1) as Int for the key “threadPool.threadCount”
4. threadPriority – reads from the config(point 1) as Int for the key “threadPool.threadPriority”
5. daemonThreads_? – reads from the config(point 1) as boolean for the key “threadPool.daemonThreads”
6. defaultTimeZone – reads from the config(point 1) as Timezone for the key “defaultTimeZone”
7. schedules – builds through the QuartzSchedules by passing config(point 1) and defaultTimezone(point 6) which returns an immutable map of string and QuartzSchedule.
8. runningJobs – just a declaration of mutable map with String and JobKey.

Lazy Declarations

9. threadPool – a new SimpleThreadPool with the help of threadCount(point 3), threadPriority(point 4) and daemonThreads_?(point 5).

10. jobStore – a new RAMJobStore

11. scheduler – here a lot of stuffs happen. It loads the java-Quartz-Scheduler using scheduler name, system name(from the Akka system passed to the QuartzSchedulerExtension) , threadPool(Point 9) and jobStore(point 10). Then it creates a scheduler from the java-Quartz’s DirectSchedulerFactory. Using this scheduler object, the schedule jobs are being shutted down by calling shutdown function, which is hooked into registerOnTermination of Akka system. Then finally the scheduler is being returned.

After these declarations comes the schedule function of QuartzSchedulerExtension. It receives the schedule name, the actor reference and the message to send to that actor, and looks up the QuartzSchedule object in schedules(point 7) by name. When the schedule is found, it passes these values on to the scheduleJob function, which does the following tasks.

F1. creates a jobDataMap with key and value types String and AnyRef, holding the logBus, the receiver (the ActorRef passed in) and the message (also passed in).

F2. converts the Scala jobDataMap to a Java map, creates a java-Quartz JobDataMap object from it through java-Quartz’s JobDataMapSupport, and stores it in a val called jobData.

F3. creates a java-Quartz Job from the jobData(point F2), SimpleActorMessageJob and the description from the schedule(QuartzSchedule), and stores it in a val called job.

F4. adds a key-value pair to runningJobs(point 8): the name (from the parameter) mapped to the key of the job(point F3).

F5. creates a trigger from the schedule(the QuartzSchedule object passed to the function) by calling buildTrigger.

F6. schedules the job(point F3) with the trigger(point F5) on the scheduler(point 11).
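
Steps F1 to F6 can be walked through in a small model. This is only a sketch under stated assumptions: JobKey, JobDetail and Trigger are hypothetical stand-ins for the java-Quartz types, the "_Job" and "_Trigger" suffixes are illustrative, and the scheduled map stands in for the real scheduler. The one real library piece is scala.jdk.CollectionConverters (Scala 2.13 or newer), used for the Scala-to-Java map conversion in F2.

```scala
import scala.collection.mutable
import scala.jdk.CollectionConverters._

object ScheduleJobSketch {
  // Hypothetical stand-ins; the real types come from java-Quartz.
  final case class JobKey(name: String)
  final case class JobDetail(key: JobKey, data: java.util.Map[String, AnyRef], description: Option[String])
  final case class Trigger(name: String, cron: String)

  val runningJobs: mutable.Map[String, JobKey] = mutable.Map.empty // point 8
  val scheduled: mutable.Map[JobKey, Trigger]  = mutable.Map.empty // stand-in for the scheduler

  def scheduleJob(name: String, receiver: AnyRef, msg: AnyRef, cron: String): JobKey = {
    // F1: a Scala map with the three entries the job needs later.
    val jobDataMap = Map[String, AnyRef](
      "logBus"   -> "log-bus-placeholder",
      "receiver" -> receiver,
      "message"  -> msg
    )
    // F2: convert it to a Java map for the Java side.
    val jobData: java.util.Map[String, AnyRef] = jobDataMap.asJava
    // F3: describe the job.
    val job = JobDetail(JobKey(name + "_Job"), jobData, Some("job for " + name))
    // F4: remember the job key under the schedule name.
    runningJobs += (name -> job.key)
    // F5: build the trigger.
    val trigger = Trigger(name + "_Trigger", cron)
    // F6: hand job and trigger over to the (model) scheduler.
    scheduled += (job.key -> trigger)
    job.key
  }
}
```

Keeping the job key in runningJobs under the schedule name means a running job can later be found again by name alone.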

You must have noticed by now that there are a few terms we still need to dissect:
i. QuartzSchedule
ii. QuartzSchedules
iii. SimpleActorMessageJob

And of course there are the Java terms that came up, like JobDataMapSupport, Job and Trigger. So let’s first see how the java-Quartz scheduler works; don’t worry, we are not going deep into the java-Quartz library. To understand the java-Quartz scheduler we need these things: the Scheduler, Job, jobDataMap and Trigger.

[Figure: java-quartz-scheduler]

In short, a Scheduler schedules a Job together with its jobDataMap and runs it when its Trigger fires. Job has a method called execute which takes a JobExecutionContext, from which we can fetch the jobDataMap. So now the question is: how does Akka-Quartz-Scheduler wrap them? Well, if you went through points F1 to F6 you already know how it works. At F2 it creates the jobDataMap, at F3 it creates the job from SimpleActorMessageJob, at F5 it creates the Trigger through the schedule, and at F6 it hands the job and the trigger to the scheduler. So the answer is: Job is wrapped by SimpleActorMessageJob, Trigger is wrapped by QuartzSchedule, and the jobDataMap is created through java-Quartz’s JobDataMapSupport and stored in jobData(point F2). By now I hope you have a fair idea of how the scheduler(point 11) and the jobDataMap(point F2) are created. So what we still have to dissect is the Job, that is SimpleActorMessageJob, and the Trigger, i.e. QuartzSchedule.

Let’s first go through SimpleActorMessageJob, as it is shorter to explain than QuartzSchedule :P. SimpleActorMessageJob extends java-Quartz’s Job, and when you extend Job you have to override the execute function. The execute function takes a JobExecutionContext as its parameter, and through this JobExecutionContext we can fetch the jobDataMap, which in our case is sent from QuartzSchedulerExtension(point F2). So basically this class has three functions: as, getAs and execute, where getAs is not used anywhere :\. The as function, however, is used inside execute to convert the objects fetched from the jobDataMap. So the only question is what happens inside the execute method. The answer is simple: it gets the ActorRef from the jobDataMap through the key receiver, the message through the key message, and the logBus, which it uses for logging, through the key logBus(point F1). Then it just uses Akka’s tell function to send the message to the actor: receiver ! msg.
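
Here is a rough model of that execute flow, assuming nothing from Akka or java-Quartz. JobContext, Job, Receiver and SimpleMessageJob are hypothetical stand-ins for JobExecutionContext, java-Quartz's Job, an ActorRef and SimpleActorMessageJob; only the shape (look up by key, cast, then tell) follows the description above, and logging through the logBus is left out.

```scala
import scala.collection.mutable.ListBuffer

object MessageJobSketch {
  // Hypothetical stand-ins for JobExecutionContext, Job and ActorRef.
  final case class JobContext(dataMap: Map[String, AnyRef])
  trait Job { def execute(context: JobContext): Unit }
  final class Receiver {
    val inbox: ListBuffer[AnyRef] = ListBuffer.empty[AnyRef]
    def !(msg: AnyRef): Unit = { inbox += msg; () } // mimics Akka's tell
  }

  final class SimpleMessageJob extends Job {
    // Fetches the value stored under `key` and casts it to the expected type.
    private def as[T](dataMap: Map[String, AnyRef], key: String): T =
      dataMap.get(key) match {
        case Some(value) => value.asInstanceOf[T]
        case None        => throw new NoSuchElementException("No job data for key '" + key + "'")
      }

    override def execute(context: JobContext): Unit = {
      val receiver = as[Receiver](context.dataMap, "receiver")
      val msg      = as[AnyRef](context.dataMap, "message")
      receiver ! msg // send the stored message to the stored receiver
    }
  }
}
```

The job itself stays completely generic: everything it needs to know about who to notify and with what arrives through the data map.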

Okay, so the next thing left is QuartzSchedule. When we look for QuartzSchedule we find two more things: QuartzSchedules (which we named earlier) and QuartzCronSchedule. QuartzSchedule is a sealed trait in the QuartzSchedules.scala file. QuartzCronSchedule implements it, and QuartzSchedules creates the instances of QuartzCronSchedule.

It goes like this. If you look at point 7 in the declarations, you will find that it sends the config and defaultTimezone to QuartzSchedules to get a map of String to QuartzSchedule. Below are the tasks done in QuartzSchedules.


QS1. The apply function gets the ConfigObject for “schedules” and passes each entry to parseSchedule along with its name. Here the name is the entry’s key.

QS2. parseSchedule fetches the time zone using the key timeZone, along with the calendar and description, and sends them to parseCronSchedule.

QS3. parseCronSchedule fetches the cron expression for the key expression, creates a QuartzCronSchedule and returns it.
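
The three steps QS1 to QS3 can be modelled on plain maps. This is a sketch, not the library's parser: the "schedules" section is assumed to be a Map of schedule name to a Map of settings, CronSchedule is a hypothetical stand-in for QuartzCronSchedule, and falling back to the default time zone when none is configured is my assumption.

```scala
import java.util.TimeZone

object ParseSchedulesSketch {
  // Hypothetical stand-in for QuartzCronSchedule.
  final case class CronSchedule(
      name: String,
      description: Option[String],
      expression: String,
      timezone: TimeZone,
      calendar: Option[String]
  )

  // QS1: walk every entry under "schedules"; the entry's key is the schedule name.
  def apply(schedules: Map[String, Map[String, String]], defaultTimezone: TimeZone): Map[String, CronSchedule] =
    schedules.map { case (name, entry) => name -> parseSchedule(name, entry, defaultTimezone) }

  // QS2: read time zone, calendar and description, then delegate.
  def parseSchedule(name: String, entry: Map[String, String], defaultTimezone: TimeZone): CronSchedule = {
    val timezone    = entry.get("timeZone").map(id => TimeZone.getTimeZone(id)).getOrElse(defaultTimezone)
    val calendar    = entry.get("calendar")
    val description = entry.get("description")
    parseCronSchedule(name, description, entry, timezone, calendar)
  }

  // QS3: read the cron expression and build the schedule object.
  def parseCronSchedule(
      name: String,
      description: Option[String],
      entry: Map[String, String],
      timezone: TimeZone,
      calendar: Option[String]
  ): CronSchedule = {
    val expression = entry.getOrElse(
      "expression",
      throw new IllegalArgumentException("Schedule '" + name + "' has no expression")
    )
    CronSchedule(name, description, expression, timezone, calendar)
  }
}
```

The result is the same kind of name-to-schedule map that point 7 stores in schedules.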

QuartzCronSchedule extends QuartzSchedule and takes the parameters name, description, expression, timezone and calendar(points QS1 to QS3). It declares a value called schedule, which creates a CronScheduleBuilder from the expression and timezone. QuartzSchedule, in turn, declares name, description, schedule and calendar, plus an implemented function buildTrigger, which creates a java-Quartz TriggerBuilder using name, description and schedule. Here schedule is overridden in QuartzCronSchedule with the CronScheduleBuilder. So if you revisit points F1 to F6, you will find that QuartzSchedule and QuartzCronSchedule are both used there. It would be wrong to say that Akka-Quartz-Scheduler works exactly like the diagram below, but to show the similarity with java-Quartz it looks like this.
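
The relationship between the trait and its cron implementation is the part that is easiest to see in code. The sketch below is an assumption-laden model: Schedule, CronSchedule, CronBuilder and Trigger are hypothetical stand-ins for QuartzSchedule, QuartzCronSchedule, CronScheduleBuilder and the java-Quartz trigger, and the "_Trigger" suffix is illustrative. What it tries to show is that buildTrigger is written once in the trait, while schedule is left for the implementation to supply.

```scala
import java.util.TimeZone

object ScheduleTraitSketch {
  // Hypothetical stand-ins for the java-Quartz schedule builder and trigger.
  final case class CronBuilder(expression: String, timezone: TimeZone)
  final case class Trigger(identity: String, description: Option[String], schedule: CronBuilder)

  sealed trait Schedule {
    def name: String
    def description: Option[String]
    def calendar: Option[String]
    def schedule: CronBuilder // abstract: each kind of schedule supplies its own

    // Implemented once here, using whatever `schedule` the implementation provides.
    def buildTrigger(): Trigger =
      Trigger(name + "_Trigger", description, schedule)
  }

  final class CronSchedule(
      val name: String,
      val description: Option[String],
      expression: String,
      timezone: TimeZone,
      val calendar: Option[String]
  ) extends Schedule {
    // The cron-specific part: built from the expression and the time zone.
    val schedule: CronBuilder = CronBuilder(expression, timezone)
  }
}
```

With this shape, adding another kind of schedule would only mean another implementation of schedule; buildTrigger would not need to change.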

[Figure: AkkaQuartzFlow]

Now you may unmask and tell me if you like it ;).
