Ganglia is a cluster monitoring tool for tracking the health of distributed clusters such as Spark and Hadoop. You may be wondering: we already have an Application UI (http://masternode:4040) and a Cluster UI (http://masternode:8080), so why do we need Ganglia? The answer is that the Spark cluster UI and application UI do not give us all the information about our cluster, such as network I/O and the health of every node. With Spark's default monitoring we cannot monitor the hardware health of the whole cluster or see per-node metrics such as CPU usage, IP addresses, and memory. So Ganglia is what we use for advanced monitoring of a cluster.
Now let's see how Ganglia works and look at its internal architecture.
Ganglia has three main components:
- gmond: the monitoring daemon, which collects data on each node in the cluster and sends it to a specific host.
- gmetad: the metadata daemon, which collects data from the monitoring daemons and stores it in an RRD (round-robin database).
- Ganglia web interface: a web front end that aggregates the data from all nodes, presents it as metrics, and lets us monitor a specific node of the cluster.
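To make the "round-robin" part of RRD concrete, here is a minimal Python sketch of the idea: a fixed-size store where each new sample overwrites the oldest one, so the storage never grows. This is only an illustration, not how rrdtool is implemented; real RRD files also consolidate samples into coarser archives. The class name RoundRobinSeries and the sample values are made up for this example.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-capacity series: once full, each new sample evicts the oldest one."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest item when it is full.
        self.samples = deque(maxlen=capacity)

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))

    def values(self):
        return [value for _, value in self.samples]


# Hypothetical CPU readings; capacity 3 means only the last 3 are kept.
series = RoundRobinSeries(capacity=3)
for t, cpu in enumerate([10.0, 20.0, 30.0, 40.0]):
    series.add(t, cpu)

print(series.values())  # the first sample (10.0) has been overwritten
```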
From the components above we can draw the architecture below.
In the diagram above, gmond collects data on each node and sends it to gmetad, and the user can monitor it through the web interface.
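If you want to see what travels between the components, gmond writes an XML description of the cluster state to any client that connects to its TCP port (8649 in this post). The Python sketch below reads that dump and groups metric values by host. Treat it as a rough sketch: the function names are my own, and the sample XML is hand-written and trimmed, with a hypothetical host name, IP, and values.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a TCP client until gmond closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Return bytes so the XML parser can honor the encoding declared in the dump.
    return b"".join(chunks)


def metrics_by_host(xml_data):
    """Return {host name: {metric name: value string}} from a gmond XML dump."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


# Hand-written sample in the general shape of a gmond dump (hypothetical values).
SAMPLE = """<GANGLIA_XML VERSION="3.6.0" SOURCE="gmond">
  <CLUSTER NAME="myCluster" OWNER="clusterOwner">
    <HOST NAME="worker1" IP="10.0.0.11">
      <METRIC NAME="cpu_user" VAL="12.5" TYPE="float" UNITS="%"/>
      <METRIC NAME="mem_free" VAL="2048000" TYPE="float" UNITS="KB"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

print(metrics_by_host(SAMPLE))
```

Against a running gmond you would call metrics_by_host(fetch_gmond_xml("masternode")) instead of using the sample string.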
Now let's see how to install and configure Ganglia.
To install Ganglia on the master node, run the following command:
sudo apt-get install ganglia-monitor rrdtool gmetad ganglia-webfrontend
The command above installs the Ganglia monitor, rrdtool, gmetad, and ganglia-webfrontend packages.
To serve Ganglia from the apache2 server, we need to copy the Ganglia Apache configuration to /etc/apache2/sites-enabled/ganglia.conf, so run the following command:
sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf
On the other nodes, that is, the worker nodes, we only need to install the Ganglia monitor, which is the gmond daemon:
sudo apt-get install ganglia-monitor
Configure Ganglia for a Spark cluster
For this demo we set up our Spark cluster on a single machine. Once the Spark cluster is set up, we need to configure Ganglia.
We treat the Spark master as the gmetad host, so configure /etc/ganglia/gmetad.conf on the master machine as follows:
data_source "myCluster" 30 masternode:8649
In the config above, the data_source line gives the name of the cluster, the polling interval (here 30 seconds), and the host and port from which gmetad collects the data.
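To double-check how such a line breaks down, here is a small Python sketch that splits a data_source line into its three parts. It is not gmetad's own parser. The function name is made up, and the fallback values (a 15 second interval and port 8649 when they are omitted) reflect my understanding of gmetad's defaults, so treat them as assumptions.

```python
import shlex


def parse_data_source(line, default_interval=15, default_port=8649):
    """Split a gmetad data_source line into cluster name, interval, and hosts."""
    tokens = shlex.split(line)  # handles the quoted cluster name
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("expected: data_source \"name\" [interval] host[:port] ...")
    name, rest = tokens[1], tokens[2:]

    # The polling interval is optional; assume the default when it is missing.
    interval = default_interval
    if rest[0].isdigit():
        interval, rest = int(rest[0]), rest[1:]
    if not rest:
        raise ValueError("at least one host is required")

    hosts = []
    for item in rest:
        host, _, port = item.partition(":")
        hosts.append((host, int(port) if port else default_port))
    return {"name": name, "interval": interval, "hosts": hosts}


print(parse_data_source('data_source "myCluster" 30 masternode:8649'))
```

Note that the quotes around the cluster name must be plain ASCII double quotes. Curly quotes pasted from a web page would become part of the name.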
Now we need to configure /etc/ganglia/gmond.conf on each worker machine of the Spark cluster as follows:
globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0
  cleanup_threshold = 400
  gexec = no
  send_metadata_interval = 20
}
/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
 * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
 * NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
  name = "myCluster"
  owner = "clusterOwner"
  latlong = "unspecified"
  url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {
  host = localhost
  port = 8649
  ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  port = 8649
}
/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}
Right now we have a single-machine cluster, so we set host to localhost. On a multi-node cluster, set it to the master machine's hostname instead. Configure gmond.conf this way on each worker machine.
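Since only the host value of the udp_send_channel changes between a single-machine demo and a real cluster, you could generate that block from a small template rather than editing it by hand on every worker. The Python helper below is a hypothetical sketch for that (the function name is made up, and it only prints the block; it does not write to /etc/ganglia/gmond.conf).

```python
def render_udp_send_channel(host, port=8649, ttl=1):
    """Return a udp_send_channel block that sends metrics to the given host."""
    return (
        "udp_send_channel {\n"
        f"  host = {host}\n"
        f"  port = {port}\n"
        f"  ttl = {ttl}\n"
        "}\n"
    )


# On a multi-node cluster the workers would point at the master's hostname.
print(render_udp_send_channel("masternode"))
```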
Ganglia uses RRD, so on the master machine you need to create a directory where gmetad can store its RRD data. In the chown command, xyz is a placeholder for the user that should own the directory:
sudo mkdir -p /var/lib/ganglia/rrds
sudo chown -R xyz /var/lib/ganglia/
Now start the services as follows:
sudo service ganglia-monitor start
sudo service gmetad start
sudo service apache2 start
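After starting the services, a quick way to confirm they are listening is to try a TCP connection to each one. The Python sketch below does that. The port numbers are assumptions: 8649 is the gmond port configured in this post, 8651 is what I understand to be gmetad's default XML port, and 80 is the usual apache2 port, so adjust them if your setup differs.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed ports for this setup; change them to match your configuration.
CHECKS = {"gmond": 8649, "gmetad": 8651, "apache2": 80}

if __name__ == "__main__":
    for service, port in CHECKS.items():
        status = "up" if port_open("localhost", port) else "DOWN"
        print(f"{service:8} port {port}: {status}")
```

An open port only shows that something is listening there. The web interface is still the real test that data is flowing.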
Ganglia now starts collecting data from your cluster. To monitor the cluster, go to localhost/ganglia and explore the information about the Spark cluster in the web interface, as shown below.
The steps above cover basic monitoring with Ganglia. Through configuration you can do much more, such as choosing unicast or multicast transport, changing time intervals, and adding metrics.