Understanding laws of scalability and the effects on a distributed system

Reading Time: 4 minutes

A reactive system is built around four core traits: it is responsive, elastic, message-driven, and resilient.

[Image: The Reactive Manifesto]

Elasticity is one of the main components of the Reactive Manifesto. An elastic system can scale up or down as demand increases or decreases, while remaining responsive.

When a system needs higher throughput or must handle more concurrent users, we tend to add concurrency by scaling the system up and look for improvements there. But does scaling a system up really improve it?
In this blog, we’ll try to understand what happens when we scale up a system and what impact the laws of scalability have on the gains.

Performance vs Scalability

Let’s begin by understanding why scalability is preferred in reactive systems.

To estimate the throughput of a system, we use metrics like requests per second, which tell us how many requests the system can handle in a given time.

But this measurement depends on two factors:

  1. Response time: the time taken to handle a single request.
  2. Load: the number of requests that can be handled at a time.

Throughput = load / response time

[Image: throughput as a function of response time and load]

To improve the overall throughput, we can either reduce the response time or increase the load the system can handle.

This is where performance and scalability come into the picture: performance is about optimizing the response time the system takes to handle a single request, while scalability is about improving the number of requests that can be handled at a time.

Let’s understand this with an example.

Problem Statement

A system takes one second to process a request, i.e. it handles 1 request per second. The objective is to increase the throughput to 2 requests per second.

Solution

To increase the throughput, we can either:

  • Improve response time: optimize the processing so that one request takes half the initial time, i.e. 0.5 seconds, allowing 2 requests to be processed per second.
  • Improve the load: scale the system up to handle 2 requests in parallel, achieving the desired throughput.
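The two options above can be sketched with the throughput formula (load divided by response time); the numbers here are just the ones from the problem statement:

```python
# Two ways to double throughput from 1 request per second,
# using: throughput = load / response time.

def throughput(load, response_time_s):
    """Requests per second: concurrent requests divided by time per request."""
    return load / response_time_s

baseline = throughput(load=1, response_time_s=1.0)  # 1 req/s

# Option 1 (performance): halve the response time.
faster = throughput(load=1, response_time_s=0.5)

# Option 2 (scalability): handle two requests in parallel.
scaled = throughput(load=2, response_time_s=1.0)

print(baseline, faster, scaled)  # → 1.0 2.0 2.0
```

Both routes reach 2 requests per second, but as the next section shows, they hit very different limits.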

Improving response time has a hard limit: we eventually reach a point where further improvement is physically unrealistic. Scalability, in theory, can be pushed forever, although in practice it too has limits.

Since scalability can, in principle, be pushed much further than response time, reactive systems tend to focus on improving scalability rather than performance.

But does scaling up always improve a system’s ability to handle load? Does scaling up any system really help?

To understand this, let’s look at the laws of scalability, which explain the side effects of scaling up a distributed system.

Laws of Scalability

Amdahl’s Law

Amdahl’s law defines the maximum improvement achievable through parallel processing and shows that gains from parallelization are limited to the portion of the work that can actually be parallelized.

If a system has shared, single-instance resources, there will be contention around them: multiple processes compete or wait to use the same limited resource.

So, no matter how much we scale up the system, the processes will always contend for this single resource, and hence contention limits parallelization and reduces improvements.

[Image: Amdahl’s Law — speedup vs number of processors]

The graph shows that as we scale up (adding more machines or processors) to achieve more throughput, contention causes diminishing returns, and the ideal linear scalability is never achieved.
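The diminishing returns in the graph fall out of Amdahl’s formula directly; a minimal sketch, where the 95% parallelizable fraction is just an illustrative assumption:

```python
# Amdahl's Law: speedup(n) = 1 / ((1 - p) + p / n),
# where p is the fraction of work that can be parallelized
# and n is the number of processors.

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% parallelizable work, returns diminish quickly:
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.95, n), 2))

# No matter how many processors we add, the speedup can never
# exceed 1 / (1 - p), here 1 / 0.05 = 20x.
```

The serial 5% (the contended resource) caps the speedup at 20x even with a thousand processors.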

Gunther’s Universal Scalability Law

In addition to contention, coherency delay is another factor that limits the gains.

In a distributed environment, the system’s state is shared across multiple nodes via crosstalk or gossip (each node communicates its state to every other node) so that they all agree on the same state of the system. The time required for this synchronization is the coherency delay.

[Image: communication channels between nodes]

So, for 2 nodes, the communication channel would be 1.
And for 3 nodes, communication channels would be 3.

So as we increase the number of nodes in the system, the number of channels, and with it the coherency delay, grows.

[Image: Gunther’s Universal Scalability Law — throughput vs number of nodes]

Gunther’s law shows that once coherency delay is present, we get not only diminishing returns but eventually negative returns: as we scale the system up, the communication and coordination cost keeps growing until it exceeds any benefit we expected.
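The retrograde curve can be reproduced with the USL formula; the alpha (contention) and beta (coherency) coefficients below are illustrative assumptions, not measurements from a real system:

```python
# Gunther's Universal Scalability Law:
#   C(n) = n / (1 + alpha*(n - 1) + beta*n*(n - 1))
# alpha models contention, beta models coherency delay.

def usl_capacity(n, alpha=0.05, beta=0.001):
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

# With beta > 0, capacity rises, peaks, then declines:
for n in (1, 8, 16, 32, 64, 128):
    print(n, round(usl_capacity(n), 1))
```

With these coefficients capacity peaks around 32 nodes and then falls; with beta = 0 the formula reduces to Amdahl-style diminishing returns, and only the coherency term produces truly negative returns.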

Conclusion

Scaling up does not always lead to better performance: sometimes we get diminishing returns, and sometimes the returns turn negative.

[Image: laws of scalability — linear, Amdahl, and Gunther curves compared]

The laws of scalability, Amdahl’s Law and Gunther’s Law, show us that linear scalability is almost always unachievable. To build reactive systems, we need to minimize their impact.

Contention can be reduced by isolating locks or by avoiding blocking operations. The impact of Gunther’s Universal Scalability Law can be reduced by building autonomous systems that minimize communication and, with it, coherency delay.


Written by Neha, a Software Consultant with more than 1.5 years of experience. She is a Java enthusiast with knowledge of various programming languages and is familiar with object-oriented programming paradigms. She is always eager to learn new and advanced concepts to expand her horizons and apply them, alongside her existing knowledge, in project development. Her hobbies include coding, writing technical blogs, and watching reality shows in her spare time.
