Among all the buzzwords flying around the microservice development space, one of the hottest is JWT (JSON Web Token). We often encounter JWTs in established services, or may be asked to authenticate a user with one. So if you’re new to this concept, hang tight! In this blog, I’ll try to explain what use case JWT fulfills and when you can use it.
JWT stands for JSON Web Token. It is a compact, digitally signed string commonly used to authenticate a client. But why do we need this authentication at all?
HTTP – The stateless protocol
HTTP, as we all know, is a stateless protocol. Stateless simply means that the server retains no state about the user who is requesting a resource from it. It is oblivious to any previous details or interactions, and each request is treated as if it were the first. Being stateless has its advantages, but it also has a few disadvantages.
Not having any state makes it difficult to authenticate a user on each request. Imagine a web service like an HR management system. It gives special rights to Administrators, Financial Office-bearers and Team Leads, and limits the scope of action for normal employees. So when a team lead opens the “Leaves” tab, they can see their own details and the details of their teammates. But a regular employee can see only their own details. How does a server ensure this kind of special access? By being able to identify who is requesting the resource and what rights they have. In short, by user authentication.
The most common way to authenticate users has been to maintain sessions. Each user performs a login, at which point the server creates an entry in a session store and assigns the user a session token. Whenever a later request comes in, the user is authenticated by looking up that token in the session store. This is a great way to authenticate a user, and it works fine for a monolith, but it doesn’t scale well, particularly in the case of a microservice.
Now imagine this were a microservice hosted on a multi-node cluster. Each node runs the application and every node has its own cache. How does one node know that a request is coming from a user whose session token exists on another node in the cluster? Do we keep syncing the caches every time a new user is authorized? Does this seem like a scalable solution?
Another commonly used approach is to have a common cache for the whole cluster. Each time a user is authorized, an entry is made in the common cache. This again might not be a very effective idea, as the common cache now becomes a single point of failure.
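To make the problem concrete, here is a minimal sketch of two nodes that each keep their own in-memory session store. The `Node` class, its method names and the response strings are hypothetical, made up purely for illustration, and credential checking is left out.

```python
import secrets


class Node:
    """One application instance with its own private, in-memory session cache."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session token -> username

    def login(self, username):
        # Credential checking is omitted; we only model session creation.
        token = secrets.token_hex(16)
        self.sessions[token] = username
        return token

    def handle_request(self, token):
        # The node can only recognise tokens that it issued itself.
        user = self.sessions.get(token)
        if user is None:
            return "401 Unauthorized"
        return f"200 OK for {user}"


node_a = Node("node-a")
node_b = Node("node-b")

token = node_a.login("alice")
print(node_a.handle_request(token))  # 200 OK for alice
print(node_b.handle_request(token))  # 401 Unauthorized
```

The second request fails only because the load balancer happened to route it to a node that has never seen the token.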
JWT or JSON Web Token
So, it may not always be a good idea to store user information on the server for the purpose of authentication. This is where JWT offers us a different path. A JSON Web Token, as the name suggests, carries a small JSON document. It travels as a compact string of Base64URL-encoded parts and helps us verify a user’s identity.
The whole idea is, every request comes with a token string in its header. This token is always in a standard format. We’ll discuss the format in detail later. So a general flow of requests is something like this:
- A user types in their credentials and requests to log in. This request contains no token, just (hopefully in a secure manner) the login credentials.
- Once the server has authenticated the user, it creates a token using some of the user’s information plus other data that acts as metadata for the token, such as its expiry time. The token is then digitally signed using an agreed-upon cryptographic algorithm with a secret key (or, for asymmetric algorithms, a private key).
- For all subsequent requests that the client makes, this token is sent as part of the request, typically in the Authorization header.
- Whenever the server receives such a request, it decodes the Base64URL-encoded parts and verifies the signature, using the same secret key or, for asymmetric algorithms, the matching public key. If verification succeeds and the token has not expired, the server uses the user information in the token as the caller’s identity and follows the normal procedure.
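Hence the multi-server problems described above largely disappear: any node that holds the verification key can check the token on its own, without a shared session store. I’m not saying JWT is a perfect solution, only that it caters to a larger set of use cases.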
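The flow above can be sketched with nothing but the Python standard library. This is a minimal illustration, not a production implementation: it assumes the HS256 algorithm (HMAC with SHA-256 and a shared secret), and the names `SECRET`, `issue_token` and `verify_token` as well as the `role` claim are made up for this example. In a real service you would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-change-me"  # hypothetical shared secret, known only to the servers


def b64url_encode(data: bytes) -> str:
    # Base64URL without padding, as used for the parts of a JWT.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(username: str, role: str, ttl_seconds: int = 3600) -> str:
    """Called after a successful login: build and sign a token."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "role": role, "iat": now, "exp": now + ttl_seconds}
    signing_input = (
        b64url_encode(json.dumps(header).encode("utf-8"))
        + "."
        + b64url_encode(json.dumps(payload).encode("utf-8"))
    )
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_token(token: str) -> dict:
    """Called on every later request: check signature and expiry, return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = hmac.new(
        SECRET, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(b64url_decode(payload_b64))
    if payload["exp"] <= time.time():
        raise ValueError("token expired")
    return payload


token = issue_token("alice", "team_lead")
claims = verify_token(token)
print(claims["sub"], claims["role"])  # alice team_lead
```

Notice that `verify_token` needs only the secret and the token itself, so any node in the cluster can run it without consulting a session store. A real verifier would also check that the `alg` value in the header is one it expects.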
Structure of JWT
A JWT is a string of Base64URL-encoded parts. This Base64 encoding isn’t about hiding anything: it is an encoding, not encryption, and anyone can decode it. Plain JSON strings can contain special characters that are difficult to handle in headers and URLs, so a simple Base64URL encoding goes a long way when it comes to ease of use.
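A quick way to convince yourself that the encoding hides nothing is to round-trip a payload without any key at all. The payload fields below are invented for the example.

```python
import base64
import json

# A payload containing characters that are awkward in URLs and headers.
payload = {"sub": "alice", "role": "team_lead", "note": "a/b+c?d=e f"}
raw = json.dumps(payload).encode("utf-8")

# Encode: the result contains only letters, digits, '-' and '_'.
encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
print(encoded)

# Decode: no key or secret is involved, only the padding has to be restored.
decoded = json.loads(base64.urlsafe_b64decode(encoded + "=" * (-len(encoded) % 4)))
print(decoded == payload)  # True
```

This is exactly why the payload must never carry secrets: anyone who sees the token can read it.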
A JWT has three parts separated by dots, in this order: header, payload and signature.
- Base64URL-encoded header: This JSON object generally has two fields: the algorithm used to digitally sign the token, and the type, which has the value JWT.
- Base64URL-encoded payload: The JSON payload that is used to identify the user. This payload shouldn’t contain any confidential information like passwords and PINs. It just needs identifiers like user IDs, usernames or something of the sort. It may also contain information like the role of the user, which determines their access scope, and some metadata about the token such as its expiry time or issued-at time.
- Signature: Computed by applying the algorithm named in the header to the encoded header and the encoded payload (joined by a dot), using the secret or private key. The signature lets the server detect whether the header or payload has been altered.
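To see what the signature buys us, the sketch below builds a token by hand and then forges a copy with an upgraded role. It again assumes HS256 with a shared secret. `SECRET`, `sign` and `signature_is_valid` are illustrative names, not part of any library.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # hypothetical shared secret


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(header_b64: str, payload_b64: str) -> str:
    # The signature covers "<encoded header>.<encoded payload>".
    mac = hmac.new(SECRET, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256)
    return b64url(mac.digest())


def signature_is_valid(token: str) -> bool:
    header_b64, payload_b64, signature_b64 = token.split(".")
    return hmac.compare_digest(sign(header_b64, payload_b64), signature_b64)


header_b64 = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
payload_b64 = b64url(json.dumps({"sub": "alice", "role": "employee"}).encode("utf-8"))
token = f"{header_b64}.{payload_b64}.{sign(header_b64, payload_b64)}"

# An attacker can rewrite the payload, but cannot recompute the signature
# without SECRET, so they have to reuse the old one.
forged_payload_b64 = b64url(json.dumps({"sub": "alice", "role": "admin"}).encode("utf-8"))
forged = f"{header_b64}.{forged_payload_b64}.{token.split('.')[2]}"

print(signature_is_valid(token))   # True
print(signature_is_valid(forged))  # False
```

The forged token still decodes to perfectly valid JSON; it is only the signature check that exposes the change.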
That covers the fundamentals of JWT and why we need it. But I know this won’t make much sense without some real application code. In part 2 of this series, we’re going to design a JWT-producing application using Spring Security. So keep learning and stay tuned!