programming

Getting Lazy With Scala

Reading Time: 4 minutes In this blog, we will talk about lazy evaluation in Scala. How can we add efficiency to our application? Efficiency is achieved not just by running things faster, but by avoiding things that shouldn’t be done in the first place. In functional programming, lazy evaluation means efficiency. Laziness lets us separate the description of an expression from the evaluation of that expression. This gives us Continue Reading

Knolx: The Hidden Mystery Behind Scala Functional Programming

Reading Time: < 1 minute Hello everyone, Knoldus organized a session on 25th January 2018. The topic was “The Hidden Mystery Behind Scala Functional Programming”. Many people attended and enjoyed the session. In this blog post, I am going to share the slides & video of the session. Slides:

Exploring JEST: Java HTTP REST Client

Reading Time: 2 minutes Elasticsearch is a real-time, distributed, open-source full-text search and analytics engine. To integrate Elasticsearch into our application, we need to use an API. Elasticsearch gives us two ways to do this: REST APIs and native clients. It’s easy to get confused about all the different ways to connect to Elasticsearch and why one of them should be preferred over the other. Available Elasticsearch clients are: Node Continue Reading

Java High-Level REST Client – Elasticsearch

Reading Time: 3 minutes Elasticsearch is an open-source, highly scalable full-text search and analytics engine. Using it, you can easily store, search, and analyze large amounts of data in real time. The Java REST client is the official client for Elasticsearch and comes in two flavors: Java Low-Level REST client – It allows communicating with an Elasticsearch cluster through HTTP and leaves request marshalling and response un-marshalling to users. Continue Reading

Let’s dive deep into Apache Ignite

Reading Time: 4 minutes Let’s first get a basic understanding of Apache Ignite, and then we will look more closely at its life cycle with the help of a demo application. Apache Ignite is an in-memory computing platform that is durable, strongly consistent, and highly available, with powerful SQL, key-value, and processing APIs. Features: Durable Memory: Apache Ignite is based on the Durable Memory architecture that allows Continue Reading

Introduction to GraphQL – A Query Language for APIs

Reading Time: 6 minutes GraphQL is an API standard that provides a more efficient, powerful, and flexible alternative to REST. It was created by Facebook in 2012 and open-sourced in 2015, and it is now maintained by a large community of companies and individuals from all over the world. GraphQL is just a specification, meaning it’s just a set of rules defined in a document. GraphQL is a query language designed Continue Reading

Having Issue How To Order Streamed Dataframe?

Reading Time: 3 minutes A few days ago, I had to perform an aggregation on a streaming DataFrame. The moment I apply groupBy for the aggregation, the data gets shuffled. So the question arises: how do we maintain order? Yes, I can use orderBy on a streaming DataFrame in Spark Structured Streaming, but only in complete mode. There is no way to order streaming data in append mode or update mode. I Continue Reading

Scala Best Practices

Reading Time: 6 minutes The central drive behind Scala is to make life easier and more productive for the developer — and that includes me. Scala does this with three principal techniques: It cuts down on boilerplate, so programmers can concentrate on the logic of their problems. It adds expressiveness, by tightly fusing object-oriented and functional programming concepts in one language. And it protects existing investments by running on Continue Reading

Simple Java program to Append to a file in Hdfs

Reading Time: 2 minutes In this blog, I will present a Java program that appends to a file in HDFS. I will be using Maven as the build tool. To start with, we need to add the Maven dependencies to pom.xml. Continue Reading

R-The Statistical Programming Language

Reading Time: 5 minutes R is a powerful language used widely for data analysis and statistical computing. It was developed in the early 1990s. It is one of the most popular languages used by statisticians, data analysts, researchers, and marketers to retrieve, clean, analyze, visualize, and present data. It is open source and free. It supports cross-platform interoperability, i.e., R code written on one platform can easily be ported Continue Reading

BigData Specifications – Part 1 : Configuring MySql Metastore in Apache Hive

Reading Time: 2 minutes Apache Hive is used as a data warehouse over Hadoop to give users a way to load, analyze, and query data from various resources. Data is stored in databases or file systems like HDFS (Hadoop Distributed File System). Hive can use Spark SQL or HiveQL for the implementation of queries. Hive uses its metastore, which contains the following information: IDs of tables, IDs Continue Reading

Effective Programming In Scala – Part 3 : Powering Up your code implicitly in Scala

Reading Time: 5 minutes Hi folks, in this series we talk about the concepts that give better definition to code written in Scala. We provide methods with definitions that help them perform a task in a better way. Let’s have a look at what we have done in the series so far: Effective Programming in Scala – Part 1 : Standardizing code in better way Continue Reading

Intercepting Nutch Crawl Flow with a Scala Plugin

Reading Time: 4 minutes Apache Nutch is an open-source web search project. One of the interesting things it can be used for is crawling. Another interesting thing about Nutch is that it provides several extension points through which we can plug in our custom functionality. Some of the existing extension points can be found here. It supports a plugin system which is used in Eclipse as well. Continue Reading