Getting started with TensorFlow: A Brief Introduction


TensorFlow is an open-source software library, provided by Google, mainly for deep learning, machine learning, and numerical computation using data flow graphs.

Looking at their website, the first definition they give for TensorFlow goes like this:

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.

At first glance, it may look very confusing to the reader. But don't worry; just keep reading, and I promise that by the end of this blog you will understand each and every word of this definition.

Continue reading

Posted in machine learning

What’s new in Neo4j 3.3.0


Neo4j has released a new version, Neo4j 3.3.0. There are a lot of improvements and new features, along with a new UI. You can download it from here.

Some of the changes in the new version of Neo4j are:

  • Cypher and Write Performance & Scalability
  • Data Import, Integration & ETL
  • Upgrades for Graph Analytics
  • Security & Operations
  • Neo4j Desktop

Cypher and Write Performance & Scalability

In the latest release, write performance has increased by 50% compared to the previous version of Neo4j, which means we can now write more data in less time. Transactional writes benefit from new native indexes that replace the Lucene-based schema indexes. Bulk writes get a 40% boost by reducing the memory footprint and leveraging virtual memory in RAM-constrained environments.

Cypher performance has also increased, by up to 70% as per internal testing.

Continue reading

Posted in Scala

Scala Coding Style Guide: In Short


We have all been using Scala for a very long time now, and sometimes we miss the guidelines for writing Scala code. This blog guides you through some common Scala coding styles. Let's get started.

  • Indentation: Scala follows two-space indentation instead of four spaces; well, I guess there will be no fight over tabs versus four spaces.
    // WRONG
    class Foo {
        def bar = ...
    }

    // RIGHT
    class Foo {
      def bar = ...
    }
  • Line Wrapping: There are times when a single expression reaches a length where it becomes unreadable to keep it confined to a single line. Scala coding style prefers that when the length of a line crosses 80 characters, it be split across multiple lines, i.e.
    val result = 1 + 2 + 3 + 4 + 5 + 6 +
      7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 +
      15 + 16 + 17 + 18 + 19 + 20
  • Methods with Numerous Arguments: If a function has a long or complex parameter list, follow these rules (see the sketch after this list):
    1. Put the first parameter on the same line as the function name.
    2. Put the rest of the parameters each on a new line, aligned with the first parameter.
    3. If the function has multiple parameter lists, align the opening parenthesis with the previous one and align the parameters the same as in rule 2.
    Continue reading
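For illustration, a minimal sketch of these rules; the method names and types below are hypothetical, not from the original post:

def processOrder(orderId: Long,
                 customer: Customer,
                 items: Seq[LineItem]): Receipt = ...

// Multiple parameter lists: align each opening parenthesis with the previous one.
def foldItems[B](initial: B)
                (combine: (B, LineItem) => B): B = ...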
Posted in Scala

Let’s dive deep into Apache Ignite


Let's first get a basic understanding of Apache Ignite, and then we will look into its life cycle with the help of a demo application.

Apache Ignite is an in-memory computing platform that is durable, strongly consistent, and highly available, with powerful SQL, key-value, and processing APIs.

Features:

  • Durable Memory: Apache Ignite is based on the Durable Memory architecture, which allows storing and processing data and indexes both in memory and on disk when the Ignite Native Persistence feature is enabled.
  • Ignite Persistence: Ignite native persistence is a distributed, ACID- and SQL-compliant disk store that transparently integrates with Ignite's durable memory. Ignite persistence is optional and can be turned on and off. When turned off, Ignite becomes a pure in-memory store. With native persistence enabled, Ignite always stores a superset of data on disk, and as much as possible in RAM (see the sketch after this list).
  • ACID Compliance: Data stored in Ignite is ACID-compliant both in memory and on disk, making Ignite a strongly consistent system. Ignite transactions work across the network and can span multiple servers. This makes Ignite stand out from eventually consistent NoSQL systems, which hardly support any type of transaction.
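As a rough sketch of toggling native persistence, assuming the Ignite 2.x Java API used from Scala (DataStorageConfiguration exists as of Ignite 2.3; this is not the article's demo code):

import org.apache.ignite.Ignition
import org.apache.ignite.configuration.{DataStorageConfiguration, IgniteConfiguration}

object NativePersistenceSketch extends App {
  // Enable Ignite Native Persistence on the default data region;
  // with this flag left off, Ignite acts as a pure in-memory store.
  val storageCfg = new DataStorageConfiguration
  storageCfg.getDefaultDataRegionConfiguration.setPersistenceEnabled(true)

  val cfg = new IgniteConfiguration
  cfg.setDataStorageConfiguration(storageCfg)

  val ignite = Ignition.start(cfg)
  // A cluster with persistence enabled starts inactive and must be activated.
  ignite.cluster().active(true)
}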

Continue reading

Posted in Functional Programming, Java

DATA PERSISTENCE IN LAGOM


Are you finding it difficult to understand Lagom persistence? Don't worry, because help is right here.
In this blog, we will learn about Lagom's persistence with the help of a simple application and also discuss its theoretical aspects.
Before we begin, make sure you know about Event Sourcing and CQRS. You can read about them in detail from this link.

Choosing a database

When we create any microservice, or in general any service, one of the biggest tasks is to manage data persistence. Lagom supports various databases for this task. By default, Lagom uses Cassandra to persist data, and the tables required to store the data are saved in Cassandra keyspaces.
So, for now, we will be using Cassandra for storing our data. Our service creates a user on request and stores the corresponding details in the database.

To use Cassandra, you need to add the following in your project’s build.sbt:

libraryDependencies += lagomScaladslPersistenceCassandra

Lagom requires keyspace configuration for three internal components: journal, snapshot, and offset.
The journal stores serialized events, snapshots are saved automatically after a configured number of persisted events for faster recovery, and the offset store provides read-side support.

Continue reading
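As a rough sketch, the keyspace for each of these components goes in application.conf; the keys below follow the Lagom 1.x documentation, and userservice is a hypothetical keyspace name:

cassandra-journal.keyspace = userservice
cassandra-snapshot-store.keyspace = userservice
lagom.persistence.read-side.cassandra.keyspace = userservice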

Posted in Cassandra, database, Microservices, NoSql, Scala

Zeppelin with Spark


Let us start with the very first question: what is Zeppelin?

It is a web-based notebook that enables interactive data analytics, built around the concept of an interpreter that can be bound to any language or data-processing backend.

This notebook is where you can do your data analysis. It is a web UI REPL with pluggable interpreters, including Scala, Python, Angular, SparkSQL, etc. You can make beautiful data-driven, interactive, and collaborative documents with SQL, Scala, and more.

In this blog, I will discuss using Zeppelin with Spark.
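Before diving in, here is a hypothetical note paragraph using the %spark interpreter, just to give a flavor of the REPL experience (sc is the SparkContext that Zeppelin pre-binds in Spark paragraphs):

%spark
val numbers = sc.parallelize(1 to 100)
println(numbers.filter(_ % 2 == 0).count())   // prints 50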

Continue reading

Posted in big data, Scala, Spark, Tutorial

Apache Storm: Architecture


Apache Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, is used by many companies, and is a lot of fun to use!

Components of a Storm cluster

An Apache Storm cluster is superficially similar to a Hadoop cluster. Whereas on Hadoop you run “MapReduce jobs”, on Storm you run “topologies”. “Jobs” and “topologies” are themselves very different; one key difference is that a MapReduce job eventually finishes, whereas a topology processes messages forever (or until you kill it).
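To make “running a topology” concrete, here is a hedged Scala sketch against the storm-core 1.x Java API; SentenceSpout and WordCountBolt are hypothetical spout and bolt classes, not part of this article:

import org.apache.storm.{Config, StormSubmitter}
import org.apache.storm.topology.TopologyBuilder

object WordCountMain extends App {
  // Wire a hypothetical spout to a hypothetical bolt; unlike a MapReduce job,
  // the submitted topology runs until it is explicitly killed.
  val builder = new TopologyBuilder
  builder.setSpout("sentences", new SentenceSpout, 2)
  builder.setBolt("count", new WordCountBolt, 4).shuffleGrouping("sentences")

  val conf = new Config
  conf.setNumWorkers(3)
  StormSubmitter.submitTopology("word-count", conf, builder.createTopology())
}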

There are two kinds of nodes in a Storm cluster:

  • Master node (Nimbus)

The master node runs a daemon called “Nimbus” that is similar to Hadoop’s “JobTracker”. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures.

Continue reading

Posted in big data, Clojure, Scala, Streaming

PJAX: Loading your website faster!


We all hate waiting for websites to load before we can start using or surfing them. The Internet has come a long way, with great speeds to decrease this time, but this has just made users more impatient. Research over the past decade has found that, on average, a user will not wait more than 4 seconds for a web page to load. With such short time requirements, even a decrease of milliseconds in loading time can greatly affect a website's business. While on the client side improving the Internet connection is obviously a way to load websites faster, web developers can also build websites with numerous techniques that help their sites load faster. We are here to talk about one such technique: Pjax.

Pjax is made up of two terms, pushState and ajax. Before diving into pjax, let’s have a brief look at what ajax and pushState are.

Continue reading

Posted in AJAX, JavaScript

Creating GraphQl API with Sangria


Sangria is a library that processes the GraphQL queries coming to the server and passes the resulting objects on to the business layer of the application, which usually queries the database for the information that Sangria then returns as the response to the query. GraphQL is a query language for servers in which a single route can be used to respond to any query defined by our GraphQL schema. If you're more interested in knowing about GraphQL, go to graphql.org/learn; it's really good documentation that explains GraphQL very well.

[Figure: Sangria architecture. Sangria processes the GraphQL queries and sends back the requested data.]

For the explanation, let's assume our only route is /graphql, and we are going to receive all our requests on that route. Every request, whether it's to get, update, create, or delete data, is sent to the server as POST data. A Sangria query should have three fields in the POST data:

{
    "query" : "",
    "variables" : "",
    "operations" : ""
}
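On the server side, Sangria executes such queries against a schema. As a minimal hedged sketch (the hello field is hypothetical, not part of this article's application):

import sangria.schema._

// A schema with a single query field; Sangria resolves "query { hello }"
// against it and sends back the asked data.
val QueryType = ObjectType("Query", fields[Unit, Unit](
  Field("hello", StringType, resolve = _ => "Hello, GraphQL!")
))
val schema = Schema(QueryType)
// sangria.execution.Executor.execute(schema, parsedQuery) then runs incoming queries.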

Continue reading

Posted in Scala

Are you using Scala Collections efficiently?


In this blog, we are going to explore how we can use Scala collections efficiently.

Though we are taking care of immutability, something more can still be done to make your code more readable and faster.

List vs Vector:

Vector is a collection with good random access. List is indeed a linked list with a very fast prepend operation (::), but its links are unidirectional.

Scala's Vector has a very effective iteration implementation and, compared to List, is faster at linear traversal. If you are planning to do quite some appending, Vector is the better choice; List also doesn't work well with parallel algorithms.
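A quick sketch of the trade-off (a hypothetical million-element collection):

val vec = Vector.tabulate(1000000)(identity)
val lst = vec.toList

vec(999999)                    // effectively constant-time random access
lst(999999)                    // O(n): walks the links one by one

val prepended = 0 :: lst       // O(1): where List shines
val appended  = vec :+ 1000000 // effectively constant time for Vector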

Most efficient way to create ListBuffer:

Ordered from most to least efficient:

val list = new mutable.ListBuffer[String]
val list = mutable.ListBuffer.empty[String]
val list = mutable.ListBuffer[String]()

The first approach creates only one object, the ListBuffer itself, but the other two approaches first create an instance of scala.collection.mutable.AddingBuilder, which is then asked for a new instance of ListBuffer. Therefore, the first approach is the most efficient.

Prefer TrieMap over any other mutable Map:

// before
val mutableMap = collection.mutable.Map[T,T]()

// after
val concurrentMap = collection.concurrent.TrieMap[T,T]()

If there is a need to use a mutable Map in a multi-threaded environment, use TrieMap. A Scala TrieMap is a trie-based, concurrent, scalable map implementation. Unlike normal trie maps, a Scala TrieMap has an efficient, non-blocking, O(1)-time snapshot operation (and a slightly optimized readOnlySnapshot operation).

Creating empty collections explicitly:

List[T]()   //before

List.empty[T]  //after

Some immutable collection classes provide singleton “empty” implementations; however, not all of the factory methods check the length of the created collections. Thus, by making collection emptiness apparent at compile time, we can save either heap space (by reusing empty collection instances) or CPU cycles (otherwise wasted on runtime length checks).

Never negate emptiness-related properties:

//before
!list.isEmpty
!list.nonEmpty

//after
list.nonEmpty
list.isEmpty

Simple built-in methods add less visual clutter than a compound expression. This also applies to Set, Option, Map, and Iterator.

Never compute length for emptiness check:

// before
list.length > 0
list.length != 0
list.length == 0

// after
list.nonEmpty
list.nonEmpty
list.isEmpty

A simple property is easier to perceive than a compound expression. Moreover, descendants of LinearSeq take O(n) time to compute length, instead of the O(1) time of IndexedSeq, so we can speed up our code by avoiding unnecessary computation when the exact value is not required. Besides that, we can never call the length method on an infinite stream, but we can verify stream emptiness directly.

Never compute full length of collection for length matching:

// before
list.length > n
list.length < n
list.length == n
list.length != n

// after
list.lengthCompare(n) > 0
list.lengthCompare(n) < 0
list.lengthCompare(n) == 0
list.lengthCompare(n) != 0

Computing length might be very “costly” for some collection classes, and we can reduce the comparison time from O(length) to O(length min n) for descendants of LinearSeq. This approach won't work with infinite streams, though.

Prefer length to size for arrays:

// before
array.size

// after
array.length

While size and length are synonyms, in Scala 2.11 Array.size calls are still implemented via implicit conversion, so intermediate wrapper objects are created for every method call. Unless you enable escape analysis in the JVM, those temporary objects will burden the GC and can potentially degrade code performance (especially within loops).

Don’t retrieve first element by index:

// before
list(0)

// after
list.head

The second approach might be slightly faster for some collection classes. Additionally, property access is much simpler (both syntactically and semantically) than calling a method with an argument.

Don’t retrieve last element by index:

// before
list(list.length-1)

// after
list.last

The second approach is more obvious and avoids unnecessary length computation. Besides, some collection classes can retrieve the last element more efficiently than by-index access.

Don’t check index bound explicitly:

// before
if (i < list.length) Some(list(i)) else None

// after
list.lift(i)

The second expression is more concise and semantically equivalent.

Don’t imitate headOption:

// before
if (list.nonEmpty) Some(list.head) else None
list.lift(0)

// after
list.headOption

The optimized expression is more concise.

Don’t imitate lastOption :

// before
if (list.nonEmpty) Some(list.last) else None
list.lift(list.length - 1)

// after
list.lastOption

The optimized expression is more concise and faster.

Be careful with indexOf and lastIndexOf argument types:

// before
List(1, 2, 3, 4, 5).indexOf("1") // compilable
List(1, 2, 3, 4, 5).lastIndexOf("2") // compilable

// after
List(1, 2, 3, 4, 5).indexOf(1)
List(1, 2, 3, 4, 5).lastIndexOf(2)

indexOf and lastIndexOf accept arguments of type Any. In practice, that might lead to hard-to-find bugs which are not discoverable at compile time. That's where auxiliary IDE inspections come in handy.

Don’t use equality predicate to check element presence:

// before
list.exists(_ == x)

// after
list.contains(x)

The second expression is more concise and semantically equivalent.

Be careful with contains argument type:

// before
List(1, 2, 3, 4, 5).contains("1") // compilable

// after
List(1, 2, 3, 4, 5).contains(1)

Just like the indexOf and lastIndexOf methods, contains also accepts an argument of type Any, which might lead to hard-to-find bugs that are not discoverable at compile time. Be careful with the method arguments.

Continue reading

Posted in Functional Programming, Scala, scaladays