SRE

Incident metrics

Understanding a few common incident metrics

Reading Time: 3 minutes Hi Readers, In the current market, most businesses are moving online, and customer traffic to their applications grows day by day. This has clearly increased their profits. However, along with these benefits, businesses also face issues such as outages, engineering incidents, service glitches, application downtime, and more. As these also lead to missed … Continue Reading

gcp operation suite

Dashboards and alerts in GCP Cloud Operations

Reading Time: 3 minutes Hi Readers, In the previous blog we covered the need for and role of GCP Cloud monitoring, logging, error debugging, and so on. In this blog we will see how to work with GCP Cloud Operations to create dashboards and alerts, so that we are notified whenever an event matches our alert policy. Step 1: Create a simple instance with firewall rules that allow HTTP traffic … Continue Reading

Introduction to Google Cloud Operations Suite

Reading Time: 2 minutes Hi Readers, Once we have deployed our code to production and the application is in use by a large number of users, we want to make sure that it is reliable and highly available. However, whatever we try, there will always be some chance of failure. We can’t remove failures entirely, but we can follow several practices to minimise them. In … Continue Reading

gcp operation suite

Google Cloud Operations Suite

Reading Time: 6 minutes Introduction: Google Cloud’s operations suite (formerly Stackdriver) is a set of tools to help you monitor, debug, and trace your applications and infrastructure running in Google Cloud Platform (GCP) to ensure good performance and availability. What is the operations suite? Google Cloud’s operations suite is made up of products to monitor, troubleshoot and operate your services at scale, enabling your DevOps, SREs, or ITOps … Continue Reading

SRE – Service Level Terminology

Reading Time: 3 minutes Before discussing the different service level terms, let’s take a very brief look at what SRE is. SRE is the discipline that results when a software engineer is tasked with solving operations problems. Service Level Terminology: Billions of people use different services on a daily basis. These services may be paid or unpaid. Continue Reading

SRE: Eliminating toil

Reading Time: 3 minutes Hello everyone, We all know the meaning of “toil”. In this blog, we will see what exactly it means in SRE and how to calculate it. In the daily routine of our organization, we have to do some work that we don’t like at all, e.g. paperwork, attending meetings, sending emails, and more. We call this “toil” … Continue Reading

Docker Networking (Bridge Network)

Reading Time: 3 minutes In this blog we are going to see how two containers can communicate with each other using Docker networking. One of the main reasons Docker containers and services are so powerful is that you can connect them together. Imagine you have two applications, a front end and a back end, and you have created one container for each. By … Continue Reading

SRE: Service Reliability

Reading Time: 3 minutes Hi guys, In this blog, we will look at what a reliable service is and how we can bring reliability to our service. Reliability is one of the qualities that is hard to build into a service. It is important to make sure everyone on the team knows what reliability really means, as that will help them bring proper reliability to our service. Continue Reading

The battle of DevOps and SRE

Reading Time: 3 minutes Are you a DevOps Engineer or a Site Reliability Engineer? Let us try to tear the question apart 🙂 DevOps is a set of practices that automates the processes between software development and IT teams. This enables them to build, test, and release software faster and more reliably. It builds a culture of collaboration between teams that historically functioned in relative silos. The … Continue Reading