Hi everyone! In this article, we will talk about what a service mesh is and why we need it in a microservices application.
Simply speaking, a service mesh is a popular solution for managing communication between the services of a microservices application. A simple question that comes to mind here is why we actually need a dedicated tool for this. To understand that, let’s first discuss the challenges that we face in a microservices application.
Challenges of a microservices application
Let’s take the example of an online shopping application that is made up of several microservices: a web server that receives the UI requests, a payment microservice that handles payment requests, a database that persists data, and so on. We deploy our microservices in a Kubernetes cluster. Now, let’s discuss what we need for this microservices setup, i.e. what the configuration requirements are.
– Business Logic
First of all, each microservice implements its own business logic. And obviously these services need to talk to each other. For example, when a user puts something in the shopping cart, the request goes to the web server, which hands it over to the respective microservice, which in turn writes the required data to the database. So, how do these services know how to communicate with each other and what the endpoints of the other services are?
– Service Endpoints
Second, all the service endpoints that the web server talks to must be configured for the web server. So when we add a new microservice, we need to add the endpoint of that new service to every microservice that needs to talk to it. That information therefore lives in the application’s deployment code. Another question: what about security?
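To make this concrete, here is a minimal sketch of how such endpoints often end up hard-coded in the deployment code. All names (the `webserver` Deployment, the image, the service URLs) are hypothetical, and the manifest is abbreviated to the relevant part:

```yaml
# Abbreviated, hypothetical Deployment for the web server.
# Every downstream endpoint it depends on is baked in as an
# environment variable and must be updated whenever a new
# service is added.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
spec:
  selector:
    matchLabels:
      app: webserver
  template:
    metadata:
      labels:
        app: webserver
    spec:
      containers:
        - name: webserver
          image: shop/webserver:1.0        # hypothetical image
          env:
            - name: PAYMENT_SERVICE_URL
              value: "http://payment-service:8080"
            - name: DATABASE_URL
              value: "mongodb://shop-db:27017"
```

Every service that talks to the payment service would need a similar entry, which is exactly the duplication a service mesh helps avoid.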
Generally, a common environment in many projects has firewall rules set up around the Kubernetes cluster, or a proxy as the entry point that receives requests first so that the cluster cannot be accessed directly. So, we have security around the cluster. However, once a request gets inside the cluster, the communication is insecure.
Microservices talk to each other over HTTP or some other insecure protocol, and every service inside the cluster can freely talk to any other service without restriction. From a security perspective, this means that once attackers get inside the cluster, they can do anything, because we do not have any additional security inside. That may be acceptable for small applications that don’t hold sensitive user data. But for more critical applications, such as online banking or apps that store lots of personal data, a higher level of security is essential. So, we want everything to be as secure as possible, and for that we need additional configuration inside each application to secure the communication between services within the cluster.
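As a preview of how a service mesh solves this without touching application code: in Istio, one popular service mesh implementation, a single policy resource can enforce mutual TLS between all sidecar proxies in a namespace. A minimal sketch (the `shop` namespace is a hypothetical example):

```yaml
# Istio PeerAuthentication: require mutual TLS for all
# service-to-service traffic in the "shop" namespace.
# The sidecar proxies encrypt and authenticate the traffic;
# the application containers keep speaking plain HTTP locally.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: shop          # hypothetical namespace
spec:
  mtls:
    mode: STRICT           # reject any non-mTLS traffic
```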
Also, we need retry logic in each microservice to make the whole application more robust. If one microservice is unreachable, we want to retry the connection, so developers would add this retry logic to each service as well.
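With a service mesh, this retry logic can instead be declared once as routing configuration and executed by the proxies. A hedged sketch using Istio’s VirtualService resource (the `payment-service` host and the specific values are illustrative):

```yaml
# Istio VirtualService: the sidecar proxy retries failed calls
# to the payment service, so no microservice needs its own
# retry code.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service              # hypothetical service name
  http:
    - route:
        - destination:
            host: payment-service
      retries:
        attempts: 3                # retry up to 3 times
        perTryTimeout: 2s          # give each attempt 2 seconds
        retryOn: 5xx,connect-failure
```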
Lastly, we want to be able to monitor how the services are performing: for example, which HTTP errors occur, how many requests a microservice receives or sends, or how long a request takes, so that we can identify the bottlenecks in the application. For this, the development team may add monitoring logic, for example with Prometheus.
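Without a mesh, each service typically has to expose its own metrics endpoint and advertise it for scraping. One widely used (but purely conventional) pattern is annotating the pod template for annotation-based Prometheus discovery; the port and path below are hypothetical and only work if the Prometheus scrape configuration looks for these annotations:

```yaml
# Pod template fragment: annotation-based Prometheus discovery.
# The application itself must implement the /metrics endpoint.
template:
  metadata:
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "9090"
      prometheus.io/path: "/metrics"
```

With a service mesh, the sidecar proxies emit request counts, error rates, and latencies for every service automatically, so this per-service instrumentation largely disappears.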
In conclusion, the development team of each microservice needs to add all this logic to their service, and perhaps configure some additional things in the cluster, to handle all these important challenges. This means that the developers are not working on the actual service logic but are busy adding network logic for metrics, security, communication, and so on to each microservice. It also adds complexity to the services instead of keeping them simple and lightweight.
Solution: Service Mesh
It would make much more sense to extract all the non-business logic out of the microservices into its own small sidecar application that handles all this logic and acts as a proxy. This sidecar is a third-party application that cluster operators can easily configure through a simple API, without worrying about how the logic is implemented, so developers can focus on the actual business logic. Also, we don’t have to add this sidecar configuration to our microservice deployment YAML files, because a service mesh has a control plane that automatically injects the proxy into every microservice pod. The microservices then talk to each other through these proxies, and this network layer for service-to-service communication, consisting of the control plane and the proxies, is the service mesh.
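To illustrate the automatic injection, here is how it looks in Istio as one concrete implementation: labeling a namespace is enough for the control plane to inject the proxy sidecar into every pod created there (the `shop` namespace is a hypothetical example):

```yaml
# Labeling a namespace for Istio's automatic sidecar injection.
# Any pod scheduled in this namespace gets the proxy container
# added by the control plane; the Deployment YAML stays unchanged.
apiVersion: v1
kind: Namespace
metadata:
  name: shop                    # hypothetical namespace
  labels:
    istio-injection: enabled
```

After this, each pod in the namespace runs two containers: the application and the injected proxy.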
Traffic Split Configuration
In addition to the above features, one of the most important features of a service mesh is traffic split configuration. When changes are made to the payment microservice, for example, a new version is built, tested, and deployed to the production environment. Of course, we can rely on tests to validate the new version. But what if the new version has a bug that the tests couldn’t catch? Depending on the test coverage, this happens quite often.
In that case we don’t want to end up with a new version of the payment service in production that doesn’t work, which could cost the company a lot of money. Instead, we want to send maybe only 1% or 10% of the traffic to the new version for a period of time to make sure it really works. With a service mesh we can easily configure the web server microservice to direct 90% of the traffic to payment service version 2.0 and only 10% to the new version 3.0. This is also known as a canary deployment.
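The 90/10 split described above can be sketched with Istio’s traffic-management resources; the service name, subset labels, and weights are illustrative. A VirtualService assigns the weights, and a companion DestinationRule defines which pods belong to each version:

```yaml
# Canary traffic split: 90% of requests go to version 2.0,
# 10% to the new version 3.0 of the payment service.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service            # hypothetical service name
  http:
    - route:
        - destination:
            host: payment-service
            subset: v2
          weight: 90
        - destination:
            host: payment-service
            subset: v3
          weight: 10
---
# The subsets map to pod labels, e.g. version: v2 / version: v3.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
spec:
  host: payment-service
  subsets:
    - name: v2
      labels:
        version: v2
    - name: v3
      labels:
        version: v3
```

Once the new version proves stable, the weights can be shifted gradually to 0/100 without redeploying any service.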
For further reading, please refer here.