MapReduce

Hadoop Word Count Program in Scala

Reading Time: 2 minutes You have probably seen the Hadoop word count program in Java, Python, or C/C++, but probably not in Scala. So, let's learn how to build a Hadoop word count program in Scala. Submitting a job written in Scala to Hadoop is not that easy: Hadoop runs on Java, so it does not understand the functional aspects of Scala. For writing Word Count Program ...

Introduction To Hadoop Map Reduce

Reading Time: 4 minutes In this blog we will read about Hadoop MapReduce. As we all know, to process data faster we need to process it in parallel. That is what Hadoop MapReduce provides. MapReduce: MapReduce is a programming model for data processing. MapReduce programs are inherently parallel, thus putting very large-scale data analysis into the hands of anyone with enough machines at their disposal. MapReduce works ...
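To make the programming model in this excerpt concrete, here is a minimal single-process sketch of word count split into map, shuffle, and reduce steps. It uses plain Python rather than the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for illustration. On a real cluster the map and reduce calls would run in parallel on different machines; here they run one after another.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of counts for one word into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    # Each line is mapped independently, which is what makes the model
    # easy to parallelize: no map call depends on another.
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["the ox pulls", "the log"])` counts `the` twice and every other word once. Because each line is mapped on its own and each word is reduced on its own, both steps could be spread across many machines without changing the result.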

Apache PIG : Installation and Connect with Hadoop Cluster

Reading Time: 4 minutes Apache PIG is a scripting platform for analyzing large datasets. PIG is a high-level scripting language that works with Apache Hadoop. It enables workers to write complex transformations in simple scripts with the help of PIG Latin. Apache PIG interacts directly with the data in the Hadoop cluster. Apache PIG transforms PIG scripts into MapReduce jobs so they can execute with the ...

Let Us Grid Compute

Reading Time: 3 minutes Since early times, oxen have been used for heavy pulling. Sometimes the logs were huge and a single ox could not pull them. The smart people of earlier times did not build a bigger ox. Instead, they used two or three together. Simple, isn’t it? The same concept lies behind the use of multiple pieces of commodity hardware linked together to provide super processing ...