Understanding HDFS Federation


In this blog, we will discuss Hadoop federation, compare the standard Hadoop architecture with the federated architecture, and look at the issues that HDFS Federation solves. Let us first see why it is gaining so much popularity. To address this…

Zeppelin with Spark


Let us start with the very first question: what is Zeppelin? It is a web-based notebook that enables interactive data analytics. Based on the concept of an interpreter that can be bound to any language or data processing backend,…

Deep Dive into Spark Cluster Managers


This blog digs into the different cluster management modes in which you can run your Spark application. Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program which…

Play around with Microservices


What is microservice architecture? Microservice architecture is a method of developing software applications as a suite of independently deployable, modular services in which each service runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business…

Introducing Kafka Streams: Processing made easy


If you are working with huge amounts of data, you might have heard about Kafka. At a very high level, Kafka is a fault-tolerant, distributed publish-subscribe messaging system that is designed for fast processing of data and the ability…

Working with Hadoop Filesystem Api


Reading data from and writing data to the Hadoop Distributed File System (HDFS) can be done in a number of ways. Let us look at how this can be done using the FileSystem API, to create and write to…

Jenkins – Integrating Email Service


Jenkins is an open source tool for continuous integration and build automation. Using it, all development work can be integrated as early as possible. The resulting artifacts are automatically created and tested, and as a result the process of…

Jenkins Build Jobs


Continuing from my previous blogs, Introduction to Jenkins and Jenkins – Manage Security, I will now talk about creating build jobs with Jenkins. It is easy to create a new build job in Jenkins. Follow the…

Jenkins – Manage Security


Jenkins is a powerful continuous integration tool with a great community. It is an open source tool and hence can be easily used by anyone. So why not get to know a tool like this? To read about the…

Hive Database : A basic Introduction


What is Hive? Hive is a data warehouse infrastructure tool that processes structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy. Why use Hive? 1) Most of the…