Blockchain Nuggets – Distributed Systems

The first law of distribution is “Don’t Distribute“. If everything can be done on a single machine then nothing beats that. But, lets cut to reality.

Today’s computing and information systems are inherently distributed. Starting from large data centers to cloud storage to your mobile phone, everything is a part of the distributed infrastructure. Distribution, just like distributed development teams is here to stay because of

  • Geography – We don’t live in the same location
  • Parallelism – speed up computation
  • Reliability – replication, and redundancy to avoid loss
  • Availability – replication leads to faster access, fault tolerance, and low latency

Even though distributed systems lead to multiple benefits, there are a large number of issues that need to be tackled. Even though every single node of a distributed system will only fail once every few years, with millions of nodes, you can expect a failure every minute. Did not see that coming!

The basis of this failure is that the participants or their communication medium may experience failures.

  • Processors are prone to failures because, each participants processor would work at a different speed, may crash, may rejoin after the crash
  • The network might result in message loss, delays, lost, reordered, duplicated, there may be a network partition with n/2 nodes forming a separate network.

What are the issues that plague a distributed system?

I) A simple example, One client and One server

Screenshot from 2018-03-01 20-49-31

A simple scenario with just a single client and a server are fraught with challenges such as Message Loss. The client has no way to know if the message reached the server. Even if it reached the server, was it corrupted or in the original order. For the client to know that the message reached, the server can send out an ACK but what if the ACK is lost?

II) The problem becomes more involved when there are more nodes involved.

Screenshot from 2018-03-01 20-53-12

Here, Client 1 and Client 2 want to change the value of some variable, say score, on the server which is initially 0(zero)

Client 1 would like to do score=score+10

Client 2 would like to do score=score*5

Depending on Message Delay and the network between the client and server, either the message from Client 1 reaches the server first or second. Hence the value of score is either

score = (0+10)*5 = 50 or score = (0*5)+10 = 10

III) The complexity increases multifold when there are more servers involved and we are looking at State Replication. State replication is a fundamental property for distributed systems.

Screenshot from 2018-03-01 23-37-29

A set of nodes achieves state replication if all nodes execute a (potentially infinite) sequence of commands c 1 , c 2 , c 3 , . . . , in the same order.

State Replication is synonymous with Blockchain!

In order to achieve State Replication, there are a few ways

Two-Phase Protocol – One way is to have the client get a lock on all the servers. It asks for a lock. If all servers return a lock that they are only tuned in to this client then the client sends the command and releases the lock. Else, it waits for all locks to be available and keeps trying. Here, in the first phase the transaction is prepared, and then in the second, either it is committed or rolled-back.

Screenshot from 2018-03-01 23-47-14

The pitfall here is that all the servers must be responsive since the client has to get a lock on all of them before issuing the command. Also, if one of the servers crashes then we have an issue since we would be able to get a lock on that!

What if we do not need all the servers to be responsive, what if we were able to execute our command from the client on the basis of getting a lock on a majority (not all) the servers?

This lays the foundation for the Paxos protocol which is used to solve consensus in a network of unreliable processors. We discuss Paxos in the next post.



Written by 

Vikas is the CEO and Co-Founder of Knoldus Inc. Knoldus does niche Reactive and Big Data product development on Scala, Spark, and Functional Java. Knoldus has a strong focus on software craftsmanship which ensures high-quality software development. It partners with the best in the industry like Lightbend (Scala Ecosystem), Databricks (Spark Ecosystem), Confluent (Kafka) and Datastax (Cassandra). Vikas has been working in the cutting edge tech industry for 20+ years. He was an ardent fan of Java with multiple high load enterprise systems to boast of till he met Scala. His current passions include utilizing the power of Scala, Akka and Play to make Reactive and Big Data systems for niche startups and enterprises who would like to change the way software is developed. To know more, send a mail to or visit

Leave a Reply

%d bloggers like this: