Akka Cluster Formation Fundamentals

Akka Cluster Formation

Every actor in Akka has an address. An actor may be local or remote, and communication with a remote actor goes over the network.

Each actor system in a cluster is called a member or node. A node is identified by a combination of hostname, port, and UID (regenerated whenever the actor system restarts). An actor system joins the cluster under this identity, and this is how an Akka cluster is formed.

Figure: Akka actor system

A joining node needs a seed node as its initial contact point. Any member of the cluster can act as a seed node. Once a node has joined and been accepted, it no longer needs the seed nodes.
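Seed nodes are usually listed in configuration. The fragment below is a minimal sketch, assuming Akka 2.6 with Artery remoting; the system name ClusterSystem, the addresses, and the ports are placeholders rather than values from this post.

```hocon
akka {
  actor.provider = "cluster"

  # Address this node binds to (placeholder values)
  remote.artery.canonical {
    hostname = "127.0.0.1"
    port = 2551
  }

  # Initial contact points; the system name must match on every node
  cluster.seed-nodes = [
    "akka://ClusterSystem@127.0.0.1:2551",
    "akka://ClusterSystem@127.0.0.1:2552"
  ]
}
```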

Figure: Joining node

Akka Management

Akka Management helps maintain and monitor a running cluster. It can be used to:

  • Remove a member from the cluster.
  • Check the health of members in the cluster.

Start Akka Management with: AkkaManagement(actorSystem).start()
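The start() call binds an HTTP endpoint whose host and port come from configuration. A minimal sketch, assuming the akka-management module is on the classpath; the hostname is a placeholder, and 8558 is the module's default port.

```hocon
akka.management.http {
  # Interface and port the management endpoint binds to
  hostname = "127.0.0.1"
  port = 8558
}
```

If the akka-management-cluster-http module is also on the classpath, cluster membership is exposed under /cluster/members on this endpoint.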

Cluster Communication

Cluster state is communicated between members via a gossip protocol. This includes information about:

  • The status of each member in the cluster, e.g. Joining, Up, etc.
  • Whether each member has seen this version of the cluster state (true/false).

In a stable cluster every member sees the same state. This is called Convergence.

The leader is the first eligible node in the sorted set of members. It is responsible for moving a Joining node to Up.
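The ideas of convergence and leader selection can be sketched with a small dependency-free model. This is not Akka's internal implementation: the types, the ordering (host, then port), and the eligibility rule (reachable members that are Up or Leaving) are simplifying assumptions made for illustration.

```scala
// Simplified model of gossip state; these are not Akka's own classes.
object GossipModel {
  sealed trait Status
  case object Joining extends Status
  case object Up extends Status
  case object Leaving extends Status

  // Hypothetical node identity: hostname, port, and per-incarnation UID.
  final case class Node(host: String, port: Int, uid: Long)

  final case class Member(node: Node, status: Status, reachable: Boolean = true)

  // One version of the cluster state plus the set of nodes that have seen it.
  final case class Gossip(members: Set[Member], seenBy: Set[Node]) {

    // Assumed rule: convergence needs every member to be reachable
    // and to have seen this version of the state.
    def converged: Boolean =
      members.forall(m => m.reachable && seenBy.contains(m.node))

    // Assumed rule: the leader is the first eligible node when the
    // members are sorted by host and then by port.
    def leader: Option[Node] =
      members.toList
        .filter(m => m.reachable && (m.status == Up || m.status == Leaving))
        .map(_.node)
        .sortBy(n => (n.host, n.port))
        .headOption
  }
}
```

In this model a node that sorts first but is still Joining is skipped when choosing the leader, and the state only counts as converged once every member appears in seenBy.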

Cluster Failure

Node Lifecycle
  • A node that gets disconnected, for whatever reason, is marked as Unreachable.
  • If the disconnection persists, the node can be marked as Down.
  • The downed node is permanently Removed from the cluster and cannot rejoin unless its actor system is restarted, which gives it a new UID.

A new member cannot join until every Unreachable member either becomes reachable again or is marked as Down.

Convergence is not possible while a member is Unreachable, so the leader cannot move a joining member to Up. Instead, the joining member is marked as WeaklyUp, and in that state it cannot host cluster shards. It is marked as Up once Convergence is reached.
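The promotion rule described above can be sketched as a single pass of leader actions over a member list. This is a simplified model rather than Akka's implementation; in particular, treating "no unreachable members" as equivalent to convergence is an assumption, and all names are hypothetical.

```scala
// Simplified model of how a joining member becomes WeaklyUp or Up.
object WeaklyUpModel {
  sealed trait Status
  case object Joining extends Status
  case object WeaklyUp extends Status
  case object Up extends Status

  final case class Member(name: String, status: Status, reachable: Boolean = true)

  // Assumption: convergence is possible only when no member is unreachable.
  def convergencePossible(members: List[Member]): Boolean =
    members.forall(_.reachable)

  // One pass of the leader's promotion logic over all members.
  def leaderActions(members: List[Member]): List[Member] = {
    val converged = convergencePossible(members)
    members.map {
      // With convergence, Joining and WeaklyUp members are promoted to Up.
      case m @ Member(_, Joining | WeaklyUp, true) if converged => m.copy(status = Up)
      // Without convergence, a Joining member only gets as far as WeaklyUp.
      case m @ Member(_, Joining, true) => m.copy(status = WeaklyUp)
      case m => m
    }
  }
}
```

Running leaderActions on a list that contains an unreachable member leaves a joining member at WeaklyUp; running it again after that member becomes reachable promotes it to Up.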

Figure: Weakly Up

It is important to terminate an Unreachable node. If it is not terminated properly, it can continue to operate and may form its own independent cluster, unaware that it has been removed from the original one.

Improper downing of a node can lead to the split-brain problem.

Split Brain Problem

Figure: Split brain

The most common cause of split-brain is auto-downing, which can be enabled with:

akka.cluster.auto-down-unreachable-after = 5 seconds

Auto-downing marks the Unreachable node as Down but does not make it terminate, which in many cases leads to split-brain. This causes data inconsistency and data corruption.

The Lightbend Split Brain Resolver, part of the Lightbend ecosystem, is the solution to this. It provides several customizable strategies for deciding which members to terminate in order to avoid a split-brain problem.
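As a rough illustration of what enabling a resolver looks like, the fragment below assumes Akka 2.6.6 or later, where the Split Brain Resolver ships with akka-cluster. The earlier commercial Lightbend module used a different provider class, and the auto-down setting shown earlier was removed in Akka 2.6. The strategy and timeout are example values, not recommendations.

```hocon
akka.cluster {
  # Replaces auto-downing with a strategy-based downing provider
  downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"

  split-brain-resolver {
    # Keep the side of the partition that holds the majority of members
    active-strategy = keep-majority
    # How long membership must be stable before a decision is taken
    stable-after = 20s
  }
}
```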

Please read the blog written by Piyush Rana for a tour of the various strategies that the Lightbend Split Brain Resolver provides. Until then, stay tuned!



Written by 

Charmy is a Software Consultant having experience of more than 1.5 years. She is familiar with Object Oriented Programming Paradigms and has familiarity with Technical languages such as Scala, Lagom, Java, Apache Solr, Apache Spark, Apache Kafka, Apigee. She is always eager to learn new concepts in order to expand her horizon. Her hobbies include playing guitar and Sketching.