Supervising Actors in Akka

After going through the previous blogs, we are now familiar with Akka Actors, their implementation and the Ask pattern. In this blog, we are going to discuss supervision and the various supervision strategies. So, let’s begin.

What is Supervision?

In case of failure, rather than forcing it back on the caller (customer), we prefer to handle it internally. Within Akka, this is done using a technique called Supervision.

Supervision describes a dependency relationship between actors: the parent and child relationship. The parent is unique because it created the child actor, so the parent is responsible for reacting when a failure happens in its child.

Depending on the nature of the failure, the parent chooses one of the following directives:

  • Resume: Simply resume message processing.
  • Restart: Transparently replace affected actor(s) with new instance(s) and then resume message processing.
  • Stop: Stop affected Actor(s) permanently.
  • Escalate: Escalate the failure by failing itself and propagate failure to its parent.

Failure in Akka

  • Akka deals with failure at the level of individual Actors.
  • An Actor fails when it throws an exception (NonFatal throwable).
  • Failure can occur –
    • during message processing
    • during initialization
    • within a lifecycle hook, e.g. preStart().

So, when a failure occurs in an Actor, it remains isolated to that particular Actor. It does not propagate or take down the entire system.
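To make the three failure points listed above concrete, here is a minimal sketch using the classic actor API. The FragileWorker class and its two flags are hypothetical names invented for this illustration; they are not part of Akka.

```scala
import akka.actor.Actor

// Hypothetical actor that can fail at each of the three points listed above.
class FragileWorker(failOnInit: Boolean, failOnStart: Boolean) extends Actor {

  // 1. During initialization: the constructor body runs when the actor is created.
  if (failOnInit) throw new IllegalStateException("failed during initialization")

  // 2. Within a lifecycle hook.
  override def preStart(): Unit =
    if (failOnStart) throw new IllegalStateException("failed in preStart")

  // 3. During message processing.
  def receive: Receive = {
    case divisor: Int =>
      // Throws ArithmeticException when divisor is 0.
      sender() ! 100 / divisor
  }
}
```

In each case only this one actor fails. The rest of the system keeps running while the supervisor decides what to do with it.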

Fault Tolerance

  • Akka’s fault tolerance is implemented through Parental Supervision
    • If an Actor fails, its message processing is suspended.
    • Its children are suspended recursively, i.e., all descendants.
    • Its parent has to handle the failure.

Each Actor has a supervisor strategy for handling the failure of its child Actors, and a default supervisor strategy is in place. When we define our own, we should in most cases override supervisorStrategy with a val.


class Test extends Actor {
  // A val is evaluated once per actor instance; the strategy itself is elided here.
  override val supervisorStrategy: SupervisorStrategy = ...
}

Supervision Strategies

There are two types of supervision strategies that we can follow to supervise any actor:

  1. One-For-One – Only the faulty child is affected when it fails.
  2. All-For-One – All children are affected when one child fails.

Below is an example of the one-for-one strategy.

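import akka.actor.OneForOneStrategy
import akka.actor.SupervisorStrategy.{Escalate, Restart, Resume, Stop}
import scala.concurrent.duration._

case object ResumeException extends Exception
case object StopException extends Exception
case object RestartException extends Exception

// Inside the supervising (parent) actor:
override val supervisorStrategy =
  OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.second) {
    case ResumeException  => Resume
    case RestartException => Restart
    case StopException    => Stop
    case _: Exception     => Escalate
  }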

Both strategies are configured with a Decider. A decider maps a specific failure to one of the possible directives. If the decider is not defined for some failure, the supervisor itself is considered faulty and the failure is escalated.
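For comparison, here is a minimal sketch of the second strategy, which Akka's classic API exposes as AllForOneStrategy. The decider is defined as a standalone value to show that the same mapping could be passed to either strategy. The names Deciders, sample and AllForOneParent, as well as the choice of exceptions, are assumptions made for this illustration.

```scala
import akka.actor.{Actor, AllForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Decider, Restart, Resume, Stop}
import scala.concurrent.duration._

object Deciders {
  // Hypothetical decider: a partial function from Throwable to a directive.
  val sample: Decider = {
    case _: ArithmeticException      => Resume
    case _: IllegalStateException    => Restart
    case _: IllegalArgumentException => Stop
  }
}

// Hypothetical parent: the chosen directive is applied to all of its children,
// not only to the child that failed.
class AllForOneParent extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    AllForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute)(Deciders.sample)

  def receive: Receive = {
    // Create a child from the given Props and hand its reference back.
    case props: Props => sender() ! context.actorOf(props)
  }
}
```

Note that this decider is not defined for, say, a plain RuntimeException, so such a failure would be escalated to the parent's own supervisor.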

Restarting vs Resuming

Like stopping, these are also recursive operations. In both cases, no messages get lost, except for the faulty message if any.

Resuming simply resumes message processing for the faulty actor and its descendants. The actor state remains unchanged. Resume is used if the state is still valid.

Restarting replaces the affected actor(s) with new instance(s). Actor state and behavior get reinitialized. Restart is used if the state is considered corrupted by the failure. By default, all children of the restarted actor get stopped.
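The difference is easiest to see with a stateful child. The sketch below is modeled on the classic actor API; Counter and CounterParent are hypothetical names, and throwing an exception that arrives as a message is only a trick to trigger failures on demand.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Restart, Resume}

// Hypothetical child that keeps a running total as its state.
class Counter extends Actor {
  private var total = 0

  def receive: Receive = {
    case n: Int       => total += n
    case "get"        => sender() ! total
    case e: Exception => throw e // fail on demand, for demonstration only
  }
}

// Hypothetical parent: resumes on ArithmeticException, restarts on IllegalStateException.
class CounterParent extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy() {
      case _: ArithmeticException   => Resume
      case _: IllegalStateException => Restart
    }

  def receive: Receive = {
    // Create a child from the given Props and hand its reference back.
    case props: Props => sender() ! context.actorOf(props)
  }
}
```

If we send 42 to the counter and then an ArithmeticException, a following "get" should still return 42, because the same instance resumes. After an IllegalStateException, "get" should return 0, because a fresh instance has taken over the mailbox.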

Default Supervisor Strategy

If we don’t override supervisorStrategy, a OneForOneStrategy with the following decider is used by default:

  • ActorInitializationException -> Stop
  • ActorKilledException -> Stop
  • DeathPactException -> Stop
  • Other Exceptions -> Restart
  • Other Throwables -> Escalate to its parent

Therefore, in many cases the Actor will be restarted by default.
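When we only want to change the reaction to one kind of failure, one option is to put our own cases in front of the default decider, which Akka exposes as SupervisorStrategy.defaultDecider. This is a sketch under that assumption; ExtendedDefaults and DefaultingParent are hypothetical names.

```scala
import akka.actor.{Actor, OneForOneStrategy, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Decider, Resume}

object ExtendedDefaults {
  // Hypothetical custom case: resume on ArithmeticException.
  private val custom: Decider = {
    case _: ArithmeticException => Resume
  }

  // Everything the custom case does not cover falls through to the default decider.
  val decider: Decider = custom.orElse(SupervisorStrategy.defaultDecider)
}

class DefaultingParent extends Actor {
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy()(ExtendedDefaults.decider)

  // No messages handled; this actor exists only to supervise its children.
  def receive: Receive = Actor.emptyBehavior
}
```

With this decider, an ArithmeticException resumes the child, other exceptions are still restarted, and a Throwable that neither part covers, such as an Error, is escalated.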

Self Healing

Failure could easily stop a system from working properly, e.g., because messages or actor state get lost. Therefore, it is essential to build a self-healing system. If the supervisor has enough information, it can reconstruct all state and resend all messages. If not, then we need other ways to heal.

This is all about supervision and the strategies. Stay tuned for more upcoming blogs on Akka.