Akka Cluster in use (Part 9): Effectively Resolving Split Brain Problem


We can deal with cluster failures manually, as described in our blog post Manually Healing an Akka Cluster. However, that requires a DevOps engineer to be available 24x7, which is expensive and, at times, frustrating.

Hence, we need a way to resolve cluster failures automatically. This is where a Split Brain Resolver (SBR) comes into the picture.

What is a Split Brain Resolver?

In our previous blog post, What is a Split Brain?, we experienced the Split Brain problem first-hand. An SBR is a set of customizable strategies that automatically down unreachable members in order to avoid a Split Brain failure. This allows the cluster to maintain data consistency when a network partition occurs.

How to configure it?

Configuring an SBR is quite easy. All we have to do is set the downing-provider-class in the Akka cluster configuration:

akka.cluster.downing-provider-class = "com.swissborg.lithium.DowningProviderImpl"

Here we are using Lithium – an open-source split-brain resolver for Akka Cluster that also works with older Akka versions. To use it, we have to add the following dependency to our build.sbt file as well:

libraryDependencies += "com.swissborg" %% "lithium" % "0.11.2"

Lightbend has also open sourced its own Akka Cluster SBR, but it is available only from Akka 2.6.6 onwards. So you can use the built-in Akka Cluster SBR only if you are on Akka version 2.6.6+.
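If you are on Akka 2.6.6+ and want to use the built-in resolver instead of Lithium, the configuration looks roughly like this (a minimal sketch; keep-majority is just an example choice of strategy):

akka.cluster.downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
akka.cluster.split-brain-resolver.active-strategy = keep-majority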

SBR Strategies

1. static-quorum

static-quorum is one of the SBR strategies. It uses a fixed-size quorum of nodes. If a problem occurs in the cluster, each partition checks whether it can still reach at least quorum-size nodes: the partition that can reach the quorum downs the Unreachable nodes, while the partitions that cannot reach the quorum terminate themselves. The quorum-size should always be greater than half the number of nodes in the cluster; otherwise more than one partition could satisfy the quorum. For example, with quorum-size = 1, each node (Node 1, 2, and 3) could form a "healthy" cluster of its own.

For example, in the above diagram, Cluster 1 can communicate with 2 nodes, i.e., it can maintain a quorum of 2, so it will keep functioning properly. Cluster 2, on the other hand, cannot maintain the quorum of 2, so Node 2 will terminate (Down) itself.

Although static-quorum is a good strategy for a cluster with a fixed (static) number of nodes, it needs manual settings (port & host) for forming the cluster.

We can configure static-quorum in the following way:

com.swissborg.lithium {
  stable-after = 5s
  active-strategy = "static-quorum"
  static-quorum {
    # Minimum number of nodes in the surviving partition.
    quorum-size = 2
    
    # Only take into account nodes with this role.
    role = ""
  }
}
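
As a quick sizing sketch (for a hypothetical 5-node cluster), the quorum must be a strict majority of the planned cluster size:

com.swissborg.lithium {
  active-strategy = "static-quorum"
  static-quorum {
    # Hypothetical 5-node cluster: quorum-size = (5 / 2) + 1 = 3
    quorum-size = 3
  }
}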

2. keep-majority

Another strategy is keep-majority. It is similar to static-quorum in that it also decides what to do based on how many nodes each partition can reach. But whereas static-quorum uses a pre-configured quorum size, keep-majority uses the current size of the cluster, i.e., it determines the majority dynamically.

In this strategy, if the nodes can communicate with a majority of the cluster, they will Down any Unreachable node(s), similar to static-quorum. If the nodes cannot fulfil the majority criterion, they will terminate themselves.

For example, in the above diagram, Cluster 1 can achieve a majority of 2, hence Nodes 1 & 2 will form a healthy cluster. Cluster 2, however, cannot achieve a majority, so it will be terminated (Down).

Dynamically determining the majority is an advantage, since we do not have to pre-configure it as in static-quorum, but it means the keep-majority strategy has to re-calculate the majority whenever a node is added to or removed from the cluster. Also, the number of nodes in the cluster should be kept odd; otherwise the cluster can split into two equal halves with neither side holding a clear majority, which can either bring down more of the cluster than necessary or, in the worst case, cause data inconsistency.

We can configure keep-majority in the following way:

com.swissborg.lithium {
  stable-after = 5s
  active-strategy = "keep-majority"
  keep-majority {
    # Only take into account nodes with this role.
    role = ""
  }
}

3. keep-oldest

keep-oldest is another SBR strategy. It tracks the oldest node in the cluster, i.e., the node that joined the cluster first. Nodes that can communicate with the oldest node form the healthy cluster; nodes that cannot are marked as Down and terminate themselves.

For example, in the above diagram, Node 2 joined the cluster first, hence it is the oldest node. Since neither Node 1 nor Node 3 can communicate with Node 2, both will be terminated under this strategy.

Although the keep-oldest strategy keeps a deterministic portion of the cluster alive, i.e., the one containing the oldest node, it can lead to downing a large portion of the cluster, as in the above diagram. It also has a single point of failure: if the oldest node crashes, the whole cluster will be terminated. That, however, can be contained via configuration (down-if-alone), i.e., if only the oldest node crashes, the rest of the cluster is not terminated.

We can configure keep-oldest in the following way:

com.swissborg.lithium {
  stable-after = 5s
  active-strategy = "keep-oldest"
  keep-oldest {
    # Down the oldest member when alone.
    down-if-alone = no
    
    # Only take into account nodes with this role.
    role = ""
  }
}

4. keep-referee

Another strategy, similar to keep-oldest, is keep-referee. The only difference is that keep-referee designates a specific node as the referee (based on its address). Hence, unlike keep-oldest, this strategy is statically configured: we decide which node acts as the referee. Otherwise it behaves like keep-oldest: nodes that can communicate with the referee node form the healthy cluster, while nodes that cannot are marked as Down and terminate themselves.

For example, in the above diagram, Node 2 is the referee node. Since neither Node 1 nor Node 3 can communicate with Node 2, both will be terminated under this strategy.

Like keep-oldest, the keep-referee strategy keeps a deterministic portion of the cluster alive, i.e., the one containing the referee node, but it can lead to downing a large portion of the cluster, as in the above diagram. It also has a single point of failure: if the referee node crashes, the whole cluster will be terminated.

We can configure keep-referee in the following way:

com.swissborg.lithium {
  stable-after = 5s
  active-strategy = "keep-referee"
  keep-referee {
    # Address of the member in the format "akka://system@host:port"
    referee = null
    
    # Minimum number of nodes in the surviving partition.
    down-all-if-less-than-nodes = 1
  }
}
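
As a usage sketch, the referee value follows the address format shown in the comment above; the actor-system name, host, and port here are hypothetical placeholders for your own cluster:

com.swissborg.lithium.keep-referee {
  # Hypothetical actor-system name, host, and port - replace with your own node's address.
  referee = "akka://ClusterSystem@127.0.0.1:2551"
}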

5. down-all

A very conservative strategy is down-all. It makes the simplest decision possible: if any node becomes unreachable, all nodes terminate themselves.

For example, in the above diagram, if Node 2 becomes Unreachable for Nodes 1 & 3 then all the nodes will terminate themselves.

Although this strategy is the simplest of all, it results in frequent cluster shutdowns, especially in large clusters (>10 nodes), since the probability of some node becoming unreachable grows with cluster size. Hence this strategy is not recommended for clusters having more than 10 nodes.

We can configure down-all in the following way:

com.swissborg.lithium {
  stable-after = 20s
  down-all-when-unstable = on
}

Note: down-all-when-unstable is, by default, derived from stable-after as stable-after + 3/4 * stable-after. For example, with stable-after = 20s as above, the derived value is 20s + 15s = 35s. However, if you want to override it explicitly, the value must be less than 2 * stable-after (here, 40s).

Indirectly Connected Nodes

The Split Brain problem can occur not only because of nodes that are marked as Down but are still functional, but also because of indirectly connected node(s). An indirectly connected node is one that has been detected as unreachable by some member (or has detected another member as unreachable) but is still connected to the rest of the cluster via other nodes.

For example, in the above diagram, Node 4 is connected only to Node 3, not to Node 1 or Node 2. Hence Node 4 is reachable only from Node 3, whereas it is unreachable from Nodes 1 & 2.

These nodes will always be downed in combination with the ones downed by the configured strategy.

Experiencing Split Brain Resolver

In our last blog post, What is a Split Brain?, we experienced a split brain scenario. Now we will see an SBR in action. For that, first form a cluster by following the steps mentioned in the blog Setup a Local Akka Cluster. Once the cluster is formed, add a third node so that a quorum of at least 2 nodes can be maintained, since we will be using the static-quorum strategy:

$ sbt 'run -Dakka.http.server.default-http-port=8002 -Dakka.remote.artery.canonical.port=2553 -Dakka.management.http.port=8560'

Next, to create a network partition scenario, let us pause the second node of the cluster:

$ ./pauseNode.sh 2552

Now, let's check on the first node (http://localhost:8558/cluster/members) to verify that the second node has become unreachable:
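
If you prefer the command line, the same Akka Management endpoint can be queried directly (assuming Cluster HTTP Management is enabled, as in the earlier setup post):

$ curl http://localhost:8558/cluster/members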

Now the remaining two nodes, i.e., Nodes 1 & 3, form a quorum. They will decide to down the unreachable node, i.e., the second node, and remove it from the cluster. This can be confirmed from the first node (http://localhost:8558/cluster/members).

Since Nodes 1 & 3 have formed a healthy cluster of their own, let's un-pause the second node:

$ ./resumeNode.sh 2552

The resumed second node will start responding again; however, it will realize that it has been orphaned from the other two nodes, 1 & 3. Because it cannot reach the quorum, it will shut itself down.

This means our cluster remains safe and does not enter a Split Brain scenario.

Conclusion

So, now we know how to resolve a Split Brain scenario using an SBR. In our future post(s) we will learn about Akka Cluster Sharding. So stay tuned 🙂
