Understanding Failover Scenarios in the Cloud

Hi readers. The IT industry now carries much of the responsibility for keeping businesses running. As companies gain more customers and users on their platforms, they also take on the responsibility of giving those users a good experience.

More users also means heavier load on the servers, and a loaded server is more likely to fail. Failover is something every company should plan for in that situation. In this blog we will look at several failover scenarios and one way to implement failover.

What does Failover mean in the Cloud?

  • When an application is deployed to the cloud, it runs on a server. There are several scenarios to keep in mind at that point:
    • What if the server goes down?
    • Similarly, what if the zone hosting the server has an issue?
  • Failover is the automatic switching of the application load to a backup system. This backup system is usually located in a secondary zone.

Why do we need to take care of Failover?

  • Every business wants to provide a quality service to its users. For that, the application should be highly available and highly resilient.

High Availability

  • The main goal of high availability is to keep downtime to a minimum when either the zone or the instance becomes unavailable.
  • It relies on a primary and a secondary zone within the region where the instance is set up.
  • The primary zone is the zone where our VM is initially deployed.
  • If this zone becomes unavailable or the VM crashes, and failover is set up, the standby instance becomes the primary instance and users are routed to it.
  • Rerouting user traffic to another zone in this way is what keeps the application highly available.

Resilience

  • To make an application resilient, we need to make sure that it continues to function despite a system failure.
  • Resilience needs planning at every level: the network and infrastructure must be designed to support both the application and its storage.

What can cause Failover?

Several events can trigger a failover. Two common ones are:

  • Zone failure
  • VM crashes

How can we implement Failover?

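Normal Scenario – No Failover

In the normal scenario, user traffic is served by the VM running in the primary zone.

Failover Scenario

In the failover scenario, the standby instance in the secondary zone takes over and user traffic is routed to it.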

Environment variables

  • We can define an environment variable in our infrastructure configuration to trigger failover. It stays false during normal operation, and we set it to true only when a failover is needed.
TRIGGER_FAILOVER = true/false
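For the flag to be usable in Terraform, it has to be declared as an input variable. The following is a minimal sketch of such a declaration. Only the variable name comes from this blog; the type, default, and description are assumptions chosen for illustration.

```hcl
# Hypothetical declaration of the failover flag.
# The name TRIGGER_FAILOVER is from this post; type, default,
# and description are illustrative assumptions.
variable "TRIGGER_FAILOVER" {
  description = "Set to true to move the workload to the standby zone."
  type        = bool
  default     = false # normal operation: no failover
}
```

Terraform reads input variable values from environment variables named TF_VAR_ followed by the variable name, so a pipeline could supply this flag by exporting TF_VAR_TRIGGER_FAILOVER with the value true or false.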

Terraform Code

In our instance.tf file, we reference this variable inside the labels block of the instance.

labels = {
  "app-name" = "test-failover"
  "failover" = var.TRIGGER_FAILOVER
}

Trigger Failover in Production

  • There are many ways to handle a failure, but one of the simplest is to change the value of a secret.
  • For example, set TRIGGER_FAILOVER = true and re-deploy the Terraform configuration from the pipeline. This destroys the existing instance in the primary zone, and the standby instance takes its place.
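This blog does not show how the flag is wired to the zone, so the following is only one possible sketch, not necessarily the setup used here. It assumes the Google Cloud provider (the blog names no provider) with the project already configured, and it assumes var.TRIGGER_FAILOVER is declared as a boolean input variable. The zone names, machine type, image, network, and the resource name "app" are all hypothetical placeholders.

```hcl
# Sketch only: pick the zone from the failover flag.
# Zone names below are placeholders, not values from this post.
locals {
  primary_zone   = "us-central1-a"
  secondary_zone = "us-central1-b"

  # When the flag is true, the instance is placed in the secondary zone.
  active_zone = var.TRIGGER_FAILOVER ? local.secondary_zone : local.primary_zone
}

resource "google_compute_instance" "app" {
  name         = "test-failover"
  machine_type = "e2-medium" # assumed machine type
  zone         = local.active_zone

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12" # assumed image
    }
  }

  network_interface {
    network = "default" # assumed network
  }

  labels = {
    "app-name" = "test-failover"
    # Label values are strings, so the boolean is converted explicitly.
    "failover" = tostring(var.TRIGGER_FAILOVER)
  }
}
```

In this sketch, flipping the flag changes the zone argument, and a change of zone normally makes Terraform plan a replacement of the instance rather than an in-place update. Running terraform plan before applying shows whether the instance will be destroyed and recreated in the other zone.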

Conclusion

In conclusion, a failover setup lets us achieve high availability for our applications and makes them more reliable for users.

That’s all for this blog. I hope you found it useful. If you have any questions, you can reach me by email at nitin.mishra@knoldus.com.

Written by

Nitin Mishra is a Software Consultant at Knoldus Software LLP. He holds an MCA from GGSIPU and a Bachelor of Science in Computer Science from Delhi University. He is a tech enthusiast with a good knowledge of Java, and his main focus is DevOps practice. On the personal front, he loves travelling in the mountains and writing poetry.