Tuning spark on yarn


In this blog we will learn how to tune YARN with Spark in both yarn-client and yarn-cluster modes. The only requirement to get started is a Hadoop-based YARN cluster running Spark. In case you want to…

A Step-by-step guide for setting MultiNode Mesos Cluster with Spark and Hdfs on EC2


Apache Mesos is an open source project for managing computer clusters, originally developed at the University of California, Berkeley. It sits between the application layer and the operating system to help applications run efficiently in large-scale distributed environments. In this blog,…

How Does Spark Use MapReduce?


In this post we will talk about an interesting question: does Spark use MapReduce or not? The answer is yes, but it uses only the idea of MapReduce, not the exact implementation. Let's take the example of reading a text…

How To Use Hive Without Hadoop


The reason for writing this blog is to answer a very common question: can we use Hive without Hadoop? The answer is yes. Starting with release 0.7, Hive also supports a mode to run map-reduce jobs in local-mode…

How to query external hive Metastore From Spark


In this blog we will learn how to access tables from the Hive metastore in Spark. To get started, start your Hive metastore as a service with the following command: hive --service metastore. By default it will start the metastore…

Spark On Mesos(Installation)


In this article we will learn how to use Spark on Mesos. The only prerequisite is Spark on your machine. Here are the steps to configure it: 1. Download the latest Mesos version from here. 2. Extract the jar…

Why Dataset Over DataFrame?


In this blog we will learn what advantage the Dataset API in Spark 2 really has over the DataFrame API. A DataFrame is weakly typed, so developers don't get the benefits of the type system. That's why the Dataset API…

Create Your Own MetastoreEvent Listeners in Hive With Scala


Hive metastore event listeners are used to detect every single event that is executed in Hive. In case you want some action to take place for an event, you can override MetaStorePreEventListener and provide it your own…

How To Use Vectorized Reader In Hive


The reason for writing this blog is that I tried to use the vectorized reader in Hive but faced some problems with its documentation, which is why I decided to write this blog. Introduction: Vectorized query execution is a Hive feature that greatly reduces the…

Play-Spark2 A simple Application


In this blog we will create a very simple application with Play Framework and Spark. Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Scala, Java, and Python that make parallel jobs easy to…