Getting Started with CockroachDB and Scala: An Introduction

Today we are going to discuss how we can use Scala with CockroachDB. As we all know, CockroachDB is a distributed SQL database built on top of a transactional and consistent key-value store, and now we are going to use it with Scala. But before starting the journey, for those who have caught the train late 😉, this is what has happened till now:

  1. An Introduction to CockroachDB !!

Now, before starting with the code, please set up CockroachDB in your local environment by following these steps:

  • We can download CockroachDB from here and follow the instructions mentioned there.
  • Now run the following commands to start the nodes:
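For example, a three-node local cluster can be started in insecure mode roughly like this (flags vary by CockroachDB version, so verify them against the official docs before running):

```shell
# first node (insecure mode is for local testing only)
cockroach start --insecure --host=localhost --port=26257 --http-port=8080 --store=node1

# two more nodes, joined to the first
cockroach start --insecure --port=26258 --http-port=8081 --store=node2 --join=localhost:26257
cockroach start --insecure --port=26259 --http-port=8082 --store=node3 --join=localhost:26257
```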

Continue reading

Posted in Akka, akka-http, database, Scala, Slick | Tagged | 1 Comment

Self-Learning Kafka Streams with Scala – #1

A few days ago, I came across a situation where I wanted to do a stateful operation on the streaming data. So, I started finding possible solutions for it. I came across many solutions which were using different technologies like Spark Structured Streaming, Apache Flink, Kafka Streams, etc.

All the solutions solved my problem, but I selected Kafka Streams because it met most of my requirements. After that, I started reading its documentation and trying to run its examples. But, as soon as I started learning it, I hit a major roadblock, that was, “Kafka Streams does not provide a Scala API!“. I was shocked to know that.

The reason I was expecting Kafka Streams to have a Scala API was that I am using Scala to build my application and if Kafka Streams provided an API for it then it would have been easy for me to include it in my application. But that didn’t turn out to be the case. Over the top when I searched for its Scala examples, I was able to find only a handful of them.

Continue reading

Posted in Apache Kafka, Scala, Streaming | Tagged | Leave a comment

Kafka Streams: More Than Just Dumb Storage

Whenever we hear the word Kafka, all we think of is a messaging system with a publisher-subscriber model that we use for our streaming applications as a source and a sink.

So we can say that Kafka is just a dumb storage system that stores the data provided by a producer for a long time (configurable), and provides it to a consumer whenever one asks for data (from a topic, of course).

Now, between receiving the data from a producer and sending it to a consumer, we can't do anything with this data in Kafka itself. So we make use of other tools, like Spark or Storm, to process the data between the producer and the consumer. In this way we have to build two separate clusters for our app: one Kafka cluster that stores our data, and another to do stream processing on our data.

So, to save ourselves from this hassle, the Kafka Streams API comes to our rescue. With it, we have a unified Kafka where we can do our stream processing inside the Kafka cluster. And with this tight integration, we get all the support from Kafka (for example, a topic partition becomes a stream partition for parallel processing).


The Kafka Streams API allows you to create real-time applications that power your core business. It is the easiest yet most powerful technology to process data stored in Kafka, and it is a library built on top of the standard Kafka producer and consumer clients.

A unique feature of the Kafka Streams API is that the applications you build with it are normal applications. These applications can be packaged, deployed, and monitored like any other application; there is no need to install separate processing clusters or similar special-purpose and expensive infrastructure!



Continue reading

Posted in Apache Kafka, Scala | Tagged , , , | Leave a comment

Real-Time Processing of Data Using Kafka and Spark

Before starting, you should know about Kafka and Spark, and what real-time processing is. So let's do a brief introduction.

Real-Time Processing – Processing the data as it arrives, instead of storing it first and processing it later, or processing data stored somewhere else.

Kafka – Kafka is a high-throughput system for moving data from one end to another. It uses the concepts of producers and consumers for producing and consuming data: a producer sends data to topics that live on the Kafka cluster, and a consumer subscribes to data from these topics. You can read more about Kafka here.
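As a toy pure-Scala illustration of that producer → topic → consumer flow (this models only the idea; it is not the actual Kafka API, and real Kafka persists messages to a distributed, replicated log):

```scala
import scala.collection.mutable

// A toy in-memory "broker": topics mapped to queues of messages.
val topics = mutable.Map.empty[String, mutable.Queue[String]]

// A producer appends a message to a topic.
def produce(topic: String, message: String): Unit =
  topics.getOrElseUpdate(topic, mutable.Queue.empty[String]) += message

// A consumer reads the next message from a topic, if any.
def consume(topic: String): Option[String] =
  topics.get(topic).collect { case q if q.nonEmpty => q.dequeue() }

produce("scores", "A hits a four")
produce("scores", "A hits a six")
```

Consumers then read messages from the topic in the order they were produced.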

Spark – Spark is an open-source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing, then Spark is the way to go. You can read about Spark here.

Spark Streaming – Spark Streaming is an extension of the core Spark API that enables processing of live data streams. Spark Streaming provides a high-level abstraction called a discretized stream, or DStream, which represents a continuous stream of data.
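A DStream is essentially a sequence of micro-batches, one per batch interval. As a rough pure-Scala analogy of keeping a running aggregate across batches (again, a sketch of the idea, not the Spark API):

```scala
// Simulate a live stream as micro-batches: each inner Seq is one batch interval.
val microBatches: Seq[Seq[Int]] = Seq(Seq(1, 2), Seq(3, 4), Seq(5))

// Carry state from batch to batch, producing one running total per interval
// (conceptually what stateful operations in Spark Streaming do).
val runningSums: Seq[Int] =
  microBatches.scanLeft(0)((acc, batch) => acc + batch.sum).tail
```

Here `runningSums` holds the total seen so far after each batch interval.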

Continue reading

Posted in Scala | Tagged , , , , | 1 Comment

AMPS: Empowering Real-Time Message-Driven Applications


In this blog, we will talk about AMPS, a pub-sub engine which delivers messages in real time based on a subject of interest. AMPS is mainly used by financial institutions as an enterprise message bus. We will also demonstrate, with an example, how we can use AMPS to publish and subscribe to messages. So, let's start by introducing AMPS.

What is AMPS?

Advanced Message Processing System (AMPS) is a publish-and-subscribe engine developed by 60East Technologies. It is highly scalable and allows publishing and subscribing to messages in real time. It comes with built-in support for multiple messaging protocols, such as FIX, NVFIX, JSON, and XML, which are mainly used in financial services, for example in trade processing. It empowers applications to deliver messages in real time, with flexible topic- and content-based routing options.

(Diagram: AMPS message flow)

How does it work?

The above diagram describes what messaging looks like in AMPS. AMPS provides a unique way of delivering messages by topics. A topic is nothing but a subject of interest which is assigned to messages. For example, a cricket score update server would publish a message with the topic "Sixer" whenever a six is hit in a game of cricket. This message would be delivered, in real time, to each subscriber who has subscribed to the topic "Sixer". AMPS also allows content-based subscriptions, where a subscriber is interested only in particular content within the messages. For example, in the diagram, Subscriber 2 is interested in content "A". Here, "A" is a player who has hit a six; whenever something is published about that particular player, Subscriber 2 will receive the message.
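The routing described above can be sketched in plain Scala: a subscriber registers a topic plus an optional content filter, and a published message is delivered only where both match. (This is a toy model of the idea, not the AMPS client API; the names are illustrative.)

```scala
case class Message(topic: String, content: Map[String, String])

// A subscriber has a topic of interest and an optional content filter.
case class Subscriber(name: String, topic: String,
                      filter: Message => Boolean = _ => true)

// Deliver a message to every subscriber whose topic and content filter match.
def route(msg: Message, subs: Seq[Subscriber]): Seq[String] =
  subs.collect { case s if s.topic == msg.topic && s.filter(msg) => s.name }

val subs = Seq(
  Subscriber("sub1", "Sixer"), // receives every "Sixer" message
  Subscriber("sub2", "Sixer",  // content-based: only sixes hit by player "A"
    m => m.content.get("player").contains("A"))
)
```

With these subscriptions, a "Sixer" message about player "A" reaches both subscribers, while one about another player reaches only sub1.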

Writing Pub/Sub clients for AMPS

Writing clients for publishing and subscribing to AMPS messages is super simple. Create a Client with a name and use the connect method to connect to the AMPS server. Look at the following publisher example.

Publishing to a Topic:

import com.crankuptheamps.client.Client

   val clientName = "Publisher"
   val serverUrl  = "tcp://" // complete this with your AMPS server's host, port and message type
   val topic      = "Sixer"
   val publisher  = new Client(clientName)
   publisher.connect(serverUrl) // connect and log on before publishing
   publisher.logon()
   publisher.publish(topic, """{ "message" : "A has hit a massive SIX!" }""")

There is no restriction on a topic name; it can be any text. However, it is recommended that a topic name avoid characters that have special meaning in regular expressions, because a subscription's topic can be matched against a regex.
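To see why regex characters in topic names can surprise you, here is a quick sketch using plain Scala string matching (AMPS's exact matching semantics may differ):

```scala
// A subscription's topic is treated as a regular expression and
// matched against each message's topic name.
def topicMatches(subscription: String, topic: String): Boolean =
  topic.matches(subscription)
```

A subscription like `"Six.*"` matches the topic `"Sixer"`, and a literal dot in a topic name acts as a wildcard, so `"Six.r"` also matches `"Sixer"`.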

Continue reading

Posted in Messages, MessagesAPI, Scala | 1 Comment

What The Heck Is Python??


Python is an easy-to-learn, powerful, object-oriented programming language created by Guido van Rossum. It wasn't named after a dangerous snake 😛. Rossum was a fan of a comedy series from the late seventies, and the name "Python" was adopted from that series, "Monty Python's Flying Circus".

Everything in Python is an object. Sites like Mozilla, Reddit and Instagram are written in Python.

3 Reasons to Choose Python as a First Language

  • Simple, Elegant Syntax – Python code is easy to understand and write.
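For instance, counting word frequencies takes only a few readable lines (a tiny illustrative snippet):

```python
# Count word frequencies in one readable pass -- no class or main() required.
def word_counts(text):
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

print(word_counts("the quick the lazy"))
```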

Continue reading

Posted in Scala | Tagged , , , , , | 4 Comments

Getting Started with Ansible

Introduction: Ansible is a configuration management and provisioning tool, similar to Chef, Puppet, or Salt. Configuration management systems are designed to make controlling large numbers of servers easy for administrators and operations teams. They allow you to control many different systems in an automated way from one central location.

There are many popular configuration management systems available for Linux, such as Chef and Puppet, but these are often more complex than many people want or need. Ansible is written in Python, uses SSH to execute commands on different machines, and uses YAML to describe its work.

Install and Configure Ansible on Ubuntu: Run the command below to install Ansible on Ubuntu.

sudo apt-get install ansible

We’ll assume you are using SSH keys for authentication. To set up SSH agent to avoid retyping passwords, you can run the below command.

ssh-agent bash
ssh-add ~/.ssh/id_rsa

Configuring Ansible Hosts: Ansible keeps track of all of the servers that it knows about through a "hosts" file. We need to set up this file before we can begin to communicate with our other computers. Open the file with root privileges like this:

sudo vi /etc/ansible/hosts
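The hosts file is an INI-style inventory that groups servers under bracketed group names. A minimal sketch (the group names and addresses here are hypothetical):

```
[webservers]
192.0.2.10
192.0.2.11

[dbservers]
192.0.2.20
```

Once hosts are defined, you can check connectivity to a group with an ad-hoc command such as `ansible webservers -m ping`.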

Continue reading

Posted in Scala | 1 Comment

Like Java 7 ? Then You Are Going to Love Java 8 !!

Java 8 (aka JDK 1.8) is a major release of the Java programming language. With the Java 8 release, Java provided support for functional programming, a new JavaScript engine, new APIs for date and time manipulation, a new Stream API, etc., all of which will be discussed in detail.
In this blog, we will focus on what's new in Java 8 and its usage, in a simple and intuitive way. We assume that you are already familiar with Java 7.

If you want to run programs in Java 8, you will have to set up a Java 8 environment by following these steps:

  1. Download JDK 8 and install it. Installation is as simple as for other Java versions. A JDK installation is required to write, compile, and run programs in Java.
  2. Download the latest Eclipse IDE or IntelliJ IDEA; both provide support for Java 8 now. Make sure your project's build path is using the Java 8 library.


New Features

Dozens of features were added in Java 8; the most significant ones are mentioned below. Let's begin discussing each in detail:
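As a quick taste of two of those features, lambda expressions and the Stream API, here is a minimal example (the class name is just for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class Java8Taste {
    // Sum of squares of the even numbers, using lambdas and the Stream API
    // instead of an explicit loop.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0) // keep even numbers
                .mapToInt(n -> n * n)    // square them
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4, 5)));
    }
}
```

The pipeline reads top to bottom, and there is no mutable accumulator in sight.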

Continue reading

Posted in Functional Programming, Java | Tagged , , , , , , | 2 Comments

Internationalization with Play Framework (2.6.x)

In this blog, I will demonstrate how your application can support different languages using Play Framework 2.6.0.
What is Application/Website Internationalization?
Application/website internationalization can be defined as the process of developing and designing an application that supports not just a single language but several, so that it can be easily adapted by users from any language, region, or geography. It ensures that the code base of your application is flexible enough to serve a new audience without rewriting the complete code, and keeps text separate from the code base.
Let us start the implementation step by step:
1. Specifying Languages for your application  
In order to specify languages for your application, you need language tags: specially formatted strings that indicate specific languages, such as "en" for English or "fr" for French, or a specific regional dialect of a language, such as "en-AU" for English as used in Australia.
First, you need to specify the languages in the conf/application.conf file. These language tags will be used to create play.api.i18n.Lang instances.
         play.i18n.langs = ["en", "fr"]
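Play then looks up translations in per-language message files under conf/. A minimal sketch (the `greeting` key is hypothetical):

```
# conf/messages (default language, English)
greeting = Hello!

# conf/messages.fr (French)
greeting = Bonjour !
```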

Continue reading

Posted in github, HTML, Internationalization, Play Framework, Scala | Tagged , , , , | 2 Comments

How To Use Vectorized Reader In Hive

The reason for writing this blog is that I tried to use the vectorized reader in Hive but faced some problems with its documentation; that's why I decided to write this blog.


Vectorized query execution is a Hive feature that greatly reduces the CPU usage for typical query operations like scans, filters, aggregates, and joins. A standard query execution system processes one row at a time. This involves long code paths and significant metadata interpretation in the inner loop of execution. Vectorized query execution streamlines operations by processing a block of 1024 rows at a time. Within the block, each column is stored as a vector (an array of a primitive data type). Simple operations like arithmetic and comparisons are done by quickly iterating through the vectors in a tight loop, with no or very few function calls or conditional branches inside the loop.

Enabling vectorized execution

To use vectorized query execution, you must store your data in ORC format and set:

set hive.vectorized.execution.enabled = true;
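Vectorized execution on the reduce side is controlled by a separate property (verify the property names against your Hive version's documentation):

```
set hive.vectorized.execution.reduce.enabled = true;
```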

How To Query

Just follow the steps below.

    • Start the Hive CLI and create an ORC table with some data:
hive> create table vectortable(id int) stored as orc;
Time taken: 0.487 seconds
hive> set hive.vectorized.execution.enabled = true;

hive> insert into vectortable values(1);

Query ID = hduser_20170713203731_09db3954-246b-4b23-8d34-1d9d7b62965c
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2017-07-13 20:37:33,237 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local722393542_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://localhost:54310/user/hive/warehouse/vectortable/.hive-staging_hive_2017-07-13_20-37-31_172_3262390557269287245-1/-ext-10000
Loading data to table default.vectortable
Table default.vectortable stats: [numFiles=1, numRows=1, totalSize=199, rawDataSize=4]
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 321 HDFS Write: 545 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 2.672 seconds

Continue reading

Posted in Scala | Leave a comment