Understanding laws of scalability and the effects on a distributed system

Reading Time: 4 minutes

A reactive system is built around four core traits: responsiveness, elasticity, resilience, and a message-driven architecture.


Elasticity is one of the main tenets of the Reactive Manifesto. An elastic system can scale up or down as demand increases or decreases, while remaining responsive.

When a system needs higher throughput or must handle more concurrent users, we tend to add concurrency by scaling the system up and hope for improvements. But does scaling a system up really help?
In this blog, we’ll look at what happens when we scale up a system (often termed scalability) and how the laws of scalability limit the gains.

Performance vs Scalability

Let’s begin by understanding why scalability is preferred over raw performance in reactive systems.

To estimate the throughput of a system, we use metrics like requests per second, which gives an idea of how many requests the system can handle at a time.

But this measurement depends on two factors:

  1. Response time: the time taken to handle a single request.
  2. Load: the number of requests that can be handled concurrently.

Throughput = load / response time


To improve the overall throughput, we can either reduce the response time or increase the load the system can handle.

This is where performance and scalability come into the picture: performance is about optimizing the response time for a single request, while scalability is about increasing the number of requests that can be handled at a time.

Let’s understand this with an example.

Problem Statement

A system takes a second to process a request, i.e., requests per second = 1. The objective is to increase the throughput to 2 requests per second.


To increase the throughput, we can either-

  • Improve response time: optimize the processing so that one request takes half the initial time, i.e., 0.5 seconds, allowing 2 requests to be processed in 1 second.
  • Improve the load: scale up the system to handle 2 requests in parallel and achieve the desired throughput.
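The two options can be sketched numerically. This is a minimal illustration, assuming throughput is the number of parallel requests divided by the response time:

```python
# Throughput (requests/second) = load handled in parallel / response time.
def throughput(parallel_requests, response_time_s):
    return parallel_requests / response_time_s

# Baseline: 1 request at a time, 1 second each.
baseline = throughput(1, 1.0)   # 1.0 req/s

# Option 1 (performance): halve the response time.
faster = throughput(1, 0.5)     # 2.0 req/s

# Option 2 (scalability): handle 2 requests in parallel.
scaled = throughput(2, 1.0)     # 2.0 req/s

print(baseline, faster, scaled)
```

Both routes reach the target of 2 requests per second; the question is which route has more headroom.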

Theoretically, improving response time has a hard limit: we eventually reach a point where further improvements are physically unrealistic, whereas scalability can, in theory, be pushed indefinitely. In practice, however, scalability has limits too.

Because scalability has far more theoretical headroom, reactive systems tend to focus on improving scalability rather than performance.

But does that mean scaling up will always improve a system’s ability to handle load?

To understand this, let’s have a look at the laws of scalability, which explain the after-effects of scaling up a distributed system.

Laws of Scalability

Amdahl’s Law

Amdahl’s law defines the maximum improvement that can be gained through parallel processing, and shows that the speedup is limited by the portion of the work that cannot be parallelized.

If a system has singular, shared resources, contention will build up around them: multiple processes compete or wait to use the same limited resource.

So, no matter how much we scale up the system, the processes will always contend for this single resource; contention limits parallelization and reduces the gains.


The graph shows that as we try to scale up (adding more machines or processors) to achieve more throughput, we see diminishing returns due to contention, and ideal linear scalability is never achieved.
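The diminishing returns can be seen directly from Amdahl’s formula. A small sketch in Python, where the 5% serial fraction is an illustrative value, not a measurement:

```python
def amdahl_speedup(n, parallel_fraction):
    """Maximum speedup on n processors when only a fraction of the
    work can be parallelized (Amdahl's Law)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n)

# With just 5% of the work serialized (e.g. contention on one shared
# resource), the speedup flattens out well below linear:
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(n, 0.95), 1))
```

Even with unlimited processors, the speedup here can never exceed 1 / 0.05 = 20x, because the serial 5% is always executed by one processor at a time.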

Gunther’s Universal Scalability Law

In addition to the contention, coherency delay is another factor which affects the improvements.

In a distributed environment, the state of the system is shared across multiple nodes using crosstalk, or gossip (each node shares its state by communicating with every other node), so that they all agree on the same state of the system. The time required for this synchronization is called the coherency delay.


So, for 2 nodes, the communication channel would be 1.
And for 3 nodes, communication channels would be 3.

So as we increase the number of nodes in the system, the number of channels grows quadratically, and with it the coherency delay.
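The channel counts above follow the standard pairwise-connections formula; a quick sketch:

```python
def channels(nodes):
    # Every node gossips with every other node, so the number of
    # pairwise communication channels is n * (n - 1) / 2.
    return nodes * (nodes - 1) // 2

print([channels(n) for n in (2, 3, 4, 10)])  # [1, 3, 6, 45]
```

Going from 3 nodes to 10 roughly triples the node count but multiplies the channels by fifteen, which is why coordination cost grows so quickly.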


Gunther’s law shows that once coherency delay is present, we not only get diminishing returns but eventually negative returns: as we keep scaling the system up, the communication and coordination cost grows until it exceeds any expected benefit.
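The Universal Scalability Law expresses this with two coefficients: one for contention and one for coherency delay. A minimal sketch, where the alpha and beta values are illustrative rather than measured from any real system:

```python
def usl_throughput(n, alpha, beta):
    """Relative throughput on n nodes under Gunther's Universal
    Scalability Law: alpha models contention, beta coherency delay."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

# With even small contention (alpha) and coherency (beta) costs,
# throughput peaks and then declines as nodes are added:
for n in (1, 10, 50, 100, 200):
    print(n, round(usl_throughput(n, 0.03, 0.0005), 1))
```

With beta set to zero the formula reduces to Amdahl-style diminishing returns; the quadratic beta term is what turns the curve downward and produces the negative returns.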


It’s not always the case that scaling up leads to better performance: sometimes we get diminishing returns, and sometimes even negative ones.


The laws of scalability, Amdahl’s Law and Gunther’s Law, show us that linear scalability is almost never achievable. To build reactive systems, we need to minimize their impact.

Contention can be reduced by isolating locks and avoiding blocking operations. The impact of Gunther’s Universal Scalability Law can be reduced by building autonomous systems that minimize cross-node communication and, with it, coherency delay.
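One common way to apply both ideas is a shared-nothing design: each worker owns a private partition of the state, so there is nothing to lock during processing and no coherency traffic between workers. A minimal sketch (the word-count task here is just a hypothetical workload):

```python
from collections import Counter

def process_partition(items):
    # Private to this worker: no shared state, so no lock contention
    # and no gossip with other workers while processing.
    local = Counter()
    for item in items:
        local[item] += 1
    return local

# Each partition is processed independently; results are merged once
# at the end, replacing per-item locking with a single merge step.
partitions = [["a", "b"], ["b", "c"], ["a", "a"]]
merged = Counter()
for result in map(process_partition, partitions):
    merged.update(result)

print(dict(merged))  # {'a': 3, 'b': 2, 'c': 1}
```

The coordination cost is paid once per partition instead of once per item, which keeps both the contention and the coherency terms small.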



Written by 

Neha is a Senior Software Consultant with more than 3 years of experience. She is a big data enthusiast and knows various programming languages, including Scala and Java. She is always eager to learn new and advanced concepts to expand her horizons and apply them in project development.