Running Multinode Cassandra Cluster on a Single Machine

Hey folks, in this blog I am going to discuss how you can set up a Cassandra cluster on a single machine. This setup is not ideal: Cassandra is a scalable, distributed NoSQL database, and a Cassandra cluster normally runs on several machines (nodes), often spread across different geographical locations.

But if you have just started studying Cassandra and want to get a feel for how a cluster is set up, or want to demonstrate a Cassandra cluster without any extra hardware, then this blog may be helpful for you.

In this blog, I will discuss two ways of setting up a multi-node Cassandra cluster on a single machine:

  1. Using Cassandra Cluster Manager (CCM): In our first approach, we will take advantage of a tool called the Cassandra Cluster Manager, or ccm, built by Sylvain Lebresne and several other contributors. This tool is a set of Python scripts that allows you to run a multi-node cluster on a single machine. It is available on GitHub, and a quick way to get started is to clone the repository using Git. Open a terminal and run the following command:

    $ git clone https://github.com/pcmanus/ccm.git

    Then change into the cloned directory and run the installation script with administrative privileges:

    $ cd ccm
    $ sudo ./setup.py install

    Once you’ve installed ccm, it should be on the system path.
    Now, let’s create a cluster using ccm:

    $ ccm create -v 3.0.0 -n 3 demo_cluster1 --vnodes

    This command creates a cluster based on the version of Cassandra we selected, in this case 3.0.0. The cluster is named demo_cluster1 and has three nodes. We specify that we want to use virtual nodes because ccm defaults to creating single-token nodes.
    Once you have created the cluster, you can see that it is the only cluster in the list of clusters (and is marked as the default), and you can check its status:

    $ ccm list

    [Screenshot: output of ccm list]

    $ ccm status

    [Screenshot: output of ccm status before the nodes are started]

    At this point, we have only created the cluster and have not started any nodes. To start the nodes, run the following command in the terminal:

    $ ccm start

    This is equivalent to starting each individual node using the bin/cassandra script (or service cassandra start for package installations). To confirm that the nodes are now up, check the cluster status again:

    $ ccm status

    [Screenshot: output of ccm status after the nodes are started]

    To dig deeper into the status of an individual node, enter the following command:

    $ ccm node1 status

    You should see something like this on the terminal:

    [Screenshot: output of ccm node1 status]

    This is equivalent to running the command nodetool status on the individual node.
    The output shows that all of the nodes are up and reporting normal status (UN). Each of the nodes has 256 tokens and owns no data, as we haven’t inserted any data yet.

    We can run the nodetool ring command to get a list of the tokens owned by each node. To do this in ccm, we enter the command:

    $ ccm node1 ring

    The command requires us to specify a node. This doesn’t affect the output; it just indicates which node nodetool connects to in order to get the ring information.

    Now you can play around with this cluster on your machine 🙂
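
The ccm steps from this section can also be collected into one small script. This is only a sketch: the run helper and the DRY_RUN switch are my own additions and are not part of ccm. By default the script only prints the commands used in this post; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch: replay the ccm commands from this post.
# DRY_RUN and run() are hypothetical helpers, not part of ccm.
CASSANDRA_VERSION="3.0.0"
NODE_COUNT=3
CLUSTER_NAME="demo_cluster1"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ccm_demo() {
    # Stop at the first command that fails.
    run ccm create -v "$CASSANDRA_VERSION" -n "$NODE_COUNT" "$CLUSTER_NAME" --vnodes &&
    run ccm list &&
    run ccm start &&
    run ccm status &&
    run ccm node1 status &&
    run ccm node1 ring
}

ccm_demo
```

Keeping the version, node count and cluster name in variables makes it easy to try the same steps with another cluster.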

  2. Using configuration files: In this approach, we will create three Cassandra node instances on a single local machine to form the Cassandra cluster. First of all, if you don’t have Cassandra, download Apache Cassandra from the Apache website and unpack the archive. You can download Cassandra 3.11.2 from this link:

    http://www.apache.org/dyn/closer.lua/cassandra/3.11.2/apache-cassandra-3.11.2-bin.tar.gz

    Now, in the terminal, go to the directory where you want to keep the Cassandra files and enter this command:

    tar -xvf apache-cassandra-3.xx.x-bin.tar.gz

    Now, go inside the extracted Cassandra folder and make two copies of the conf folder, named conf2 and conf3.

    Now, we will go inside the conf, conf2 and conf3 folders and make changes to the cassandra.yaml file in each one so that all the nodes of our cluster come up and work together. cassandra.yaml is the main configuration file for Cassandra.

    Inside each cassandra.yaml file we have to make the following changes:

    1. Name of the cluster (cluster_name) – All three nodes must have the same cluster name to be part of the same cluster.
    2. Data file directories (data_file_directories) – Give each of the three nodes a different path so that every node saves its data in its own directory.
      [Screenshots: data_file_directories for node1, node2 and node3]
    3. Commit log directory (commitlog_directory) – Give each node a different path for commitlog_directory as well.
      [Screenshots: commitlog_directory for node1, node2 and node3]
    4. Saved caches directory (saved_caches_directory) – Change the path so that each node uses its own directory.
      [Screenshots: saved_caches_directory for node1, node2 and node3]
    5. Listen address (listen_address) – Give each node its own address.
      [Screenshots: listen_address for node1, node2 and node3]
    6. RPC address (rpc_address) – Give each node its own address here as well.
      [Screenshots: rpc_address for node1, node2 and node3]
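
To make the list above more concrete, here is a small sketch that prints one possible set of values per node. The base directory, the cluster name and the 127.0.0.x addresses are example values that I picked for illustration; they are not the ones from the original screenshots.

```shell
#!/bin/sh
# Sketch: print the per-node cassandra.yaml values that must differ.
# BASE_DIR, the cluster name and the 127.0.0.x addresses are example values only.
BASE_DIR="${BASE_DIR:-/tmp/cassandra-demo}"

node_settings() {
    n="$1"
    cat <<EOF
cluster_name: 'Demo Cluster'
data_file_directories:
    - $BASE_DIR/node$n/data
commitlog_directory: $BASE_DIR/node$n/commitlog
saved_caches_directory: $BASE_DIR/node$n/saved_caches
listen_address: 127.0.0.$n
rpc_address: 127.0.0.$n
EOF
}

for n in 1 2 3; do
    echo "# node$n"
    node_settings "$n"
done
```

These keys already exist in the stock cassandra.yaml, so edit them in place in conf, conf2 and conf3 rather than appending the printed lines. On Linux the 127.0.0.x addresses are all loopback addresses; other systems may need them added as aliases first.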

    Next, we will change JMX_PORT in the cassandra-env.sh file in the conf2 and conf3 folders.

    JMX_PORT specifies the port on which Cassandra accepts JMX connections (7199 by default). Each node needs its own port.

    Sample JMX_PORT values for the three nodes:

    [Screenshots: JMX_PORT settings for the three nodes]
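
The JMX port change can be scripted as well. The sketch below assumes that cassandra-env.sh contains a line starting with JMX_PORT= (as the stock 3.x file does, to my knowledge). The set_jmx_port helper is hypothetical, and 7299 and 7399 are example ports, not values taken from the screenshots.

```shell
#!/bin/sh
# Sketch: give conf2 and conf3 their own JMX port.
# 7299 and 7399 are example ports, not values taken from this post.
set_jmx_port() {
    file="$1"
    port="$2"
    tmp="$file.tmp"
    # Rewrite the JMX_PORT="..." assignment, keep everything else as is.
    sed "s/^JMX_PORT=.*/JMX_PORT=\"$port\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Run from the Cassandra folder; missing files are skipped.
for pair in "conf2:7299" "conf3:7399"; do
    dir="${pair%%:*}"
    port="${pair##*:}"
    if [ -f "$dir/cassandra-env.sh" ]; then
        set_jmx_port "$dir/cassandra-env.sh" "$port"
    fi
done
```

With these example ports, nodetool would reach the second node with -p 7299 and the third with -p 7399.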

    Now, in the bin folder there is a cassandra.in.sh file. Make two copies of it and name them cassandra2.in.sh and cassandra3.in.sh.

    Open cassandra2.in.sh and change the CASSANDRA_CONF variable so that it points to the conf2 folder.

    Similarly, change cassandra3.in.sh so that it points to conf3.

    Finally, in the bin folder there is a cassandra script. Make two copies of it, name them cassandra2 and cassandra3, and edit each copy so that it loads its own include file (cassandra2.in.sh or cassandra3.in.sh) and therefore uses the matching config folder.
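
The copies in the bin folder can be created with a short script too. This is a sketch under two assumptions about the stock 3.x files: that cassandra.in.sh assigns CASSANDRA_CONF on a line of its own, and that the cassandra script refers to its include file by the name cassandra.in.sh. The make_node_launcher helper is hypothetical; check the generated files by hand before using them.

```shell
#!/bin/sh
# Sketch: create cassandraN.in.sh and cassandraN for one extra node.
# Assumes the stock files: cassandra.in.sh sets CASSANDRA_CONF, and the
# cassandra script looks for an include file named cassandra.in.sh.
make_node_launcher() {
    bin_dir="$1"
    n="$2"
    # Copy the include file, pointing CASSANDRA_CONF at confN.
    sed "s|^\([[:space:]]*CASSANDRA_CONF=\).*|\1\"\$CASSANDRA_HOME/conf$n\"|" \
        "$bin_dir/cassandra.in.sh" > "$bin_dir/cassandra$n.in.sh"
    # Copy the launcher, making it load cassandraN.in.sh instead.
    sed "s|/cassandra\.in\.sh|/cassandra$n.in.sh|g" \
        "$bin_dir/cassandra" > "$bin_dir/cassandra$n"
    chmod +x "$bin_dir/cassandra$n"
}

# Run from the Cassandra folder; does nothing if bin/ is not there.
if [ -f bin/cassandra.in.sh ] && [ -f bin/cassandra ]; then
    make_node_launcher bin 2
    make_node_launcher bin 3
fi
```

The original cassandra and cassandra.in.sh files are left untouched, so the first node keeps using the conf folder.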

    After making all the changes, run all the instances of Cassandra from the bin folder, each in its own terminal, using these commands:

    ./cassandra -f

    ./cassandra2 -f

    ./cassandra3 -f

    Now open another terminal and enter the following command:

    ./nodetool -h localhost -p 7199 status

    You should see something like this in the terminal:

    [Screenshot: output of nodetool status]
    This shows that there are three nodes with status Up and state Normal (UN).
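
If you want to check this from a script instead of reading the output by eye, a minimal sketch could look like the following. It assumes that each node line in the nodetool status output starts with the two-letter state code (UN for Up and Normal); the count_un_nodes helper is hypothetical.

```shell
#!/bin/sh
# Sketch: count nodes reported as Up/Normal in nodetool status output.
# Assumes each node line starts with its two-letter state code, e.g. "UN".
count_un_nodes() {
    # grep -c prints the number of matching lines (0 if none).
    grep -c '^UN[[:space:]]'
}

# Only query the cluster when nodetool is available in the current folder.
if [ -x ./nodetool ]; then
    up="$(./nodetool -h localhost -p 7199 status | count_un_nodes)"
    echo "nodes in UN state: $up"
fi
```

For the cluster built in this post, the count should be 3 once all three instances have started.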

References:

  1. Jeff Carpenter and Eben Hewitt, Cassandra: The Definitive Guide, 2nd Edition, O’Reilly, Chapter 7.
  2. https://www.youtube.com/watch?v=oHMJrhMtv3c

Written by 

Manjot Kaur is a software consultant with more than 0.5 years of experience. She likes to explore new technologies and trends in the IT world. Her hobbies include travelling, watching movies and running. She is currently working on technologies like Scala with Maven, DynamoDB and Akka HTTP.

2 thoughts on “Running Multinode Cassandra Cluster on a Single Machine”

  1. Hi Manjot ,

    Actually we have limited no of nodes but can get more CPU cores and RAM on same machine.
    Please suggest if we can use this step to setup multiple nodes on production.
    Eg-We have 5 Nodes ,can we setup 10 rings of cassandra by running 2 process of cassandra on each node.

    Help would be appreciated

  2. Similar situation
    i had 12 hosts x 4 different mounts on each host, i need to setup 4 nodes on each host making it 4×12 = 48 node cassandra cluster, i have 244gb ram, 32cpu and 1.5Tb -disks(4) on each host. Also, I have 4 IP’s on each host. Any recommendations how to design this cluster ?
