Author Archives: sangeetagulia

Understanding HDFS Federation


In this blog, we will discuss Hadoop federation, compare the Hadoop architecture with the Hadoop federated architecture, and talk about the various issues solved by HDFS Federation. So let us first see why it is gaining so much popularity. To address this … Continue reading

Posted in Scala | 1 Comment

Zeppelin with Spark


Let us start with the very first question: what is Zeppelin? It is a web-based notebook that enables interactive data analytics. Based on the concept of an interpreter that can be bound to any language or data processing backend, … Continue reading

Posted in big data, Scala, Spark, Tutorial | 1 Comment

Deep Dive into Spark Cluster Managers


This blog aims to dig into the different cluster management modes in which you can run your Spark application. Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program which … Continue reading

Posted in apache spark, big data, Scala, Spark | 1 Comment

Play around with Microservices


What is microservice architecture? Microservice architecture is a method of developing software applications as a suite of independently deployable, modular services in which each service runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business … Continue reading

Posted in Microservices, Play Framework, Scala | Leave a comment

Introducing Kafka Streams: Processing made easy


If you are working with huge amounts of data, you might have heard about Kafka. At a very high level, Kafka is a fault-tolerant, distributed publish-subscribe messaging system that is designed for fast processing of data and the ability … Continue reading

Posted in big data, Java, Streaming | 1 Comment

Working with Hadoop FileSystem API


Reading data from and writing data to the Hadoop Distributed File System (HDFS) can be done in a number of ways. Let us see how this can be done using the FileSystem API to create and write to … Continue reading

Posted in Java | Leave a comment

Jenkins – Integrating Email Service


Jenkins is an open source tool for continuous integration and build automation. Using it, all development work can be integrated as early as possible. The resulting artifacts are automatically created and tested and as a result the process of … Continue reading

Posted in testing, Tutorial | Leave a comment

Jenkins Build Jobs


In continuation of my previous blogs, Introduction to Jenkins and Jenkins – Manage Security, I will now talk about creating build jobs with Jenkins. It is easy and simple to create a new build job in Jenkins. Follow … Continue reading

Posted in integration, Performance Testing, testing, Tutorial | 2 Comments

Jenkins – Manage Security


Jenkins is a powerful continuous integration tool with a great community. It is an open source tool and hence can be easily used by anyone. So why not get to know a tool like this? To read about the … Continue reading

Posted in Scala | 3 Comments

Hive Database: A Basic Introduction


What is Hive? Hive is a data warehouse infrastructure tool that processes structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy. Why use Hive? 1) Most of the … Continue reading

Posted in Scala | 2 Comments