Tale of a Container’s file system

Namespaces, cgroups, and a union file system are the basic building blocks of a container. Let’s focus on the file system. Why does a container need yet another file system? Are conventional Linux file systems like ext2, ext3, ext4, and XFS not good enough for the purpose? In this blog post, I will try to answer these questions. We will delve into the union file system and a few of its essential properties.


Layered architecture

A container is composed of multiple branches. In Docker’s terminology, branches are also known as layers. The sandbox of a container is composed of one or more image layers and a container layer. The container layer is writable; the image layers are read-only.

Layered File Structure of a Container


Identifying the Problem

The following are the two main challenges with conventional file systems.

Inefficient Disk Space Utilization
Let’s take a hypothetical scenario. Suppose 10 instances of a Docker container are up and running on your system, and the image size is 1 GB. If each container keeps its own copy of the image on a concrete file system like ext* or NFS, the containers consume at least 10 GB of disk space. That is a poor use of disk space.
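The arithmetic behind this scenario can be sketched in a few lines of Python. It compares one private image copy per container against storing the read-only image once and giving each container only a small writable layer. The 50 MB writable layer is an assumed, illustrative figure and not part of the scenario above.

```python
# Rough disk-usage comparison for N containers started from one image.
# All sizes are in megabytes. The writable-layer size is an assumed,
# illustrative figure; real containers vary widely.

def full_copy_usage(image_mb: int, containers: int) -> int:
    """Every container gets its own private copy of the image."""
    return image_mb * containers


def shared_layer_usage(image_mb: int, containers: int, writable_mb: int) -> int:
    """The read-only image is stored once; each container adds
    only its own writable layer."""
    return image_mb + containers * writable_mb


image_mb = 1024      # 1 GB image, as in the scenario above
containers = 10
writable_mb = 50     # assumption: each container writes about 50 MB

print(full_copy_usage(image_mb, containers))                  # 10240
print(shared_layer_usage(image_mb, containers, writable_mb))  # 1524
```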

Latency in bootstrap
A container is nothing but a process. In Linux, a new process is created by forking an existing process. The fork operation gives the child its own address space, which starts out as a copy of the parent’s memory segments. In the same way, with a conventional file system, creating a new container would mean copying all the files of the image layers into the container’s own file system. A container is expected to start in a few milliseconds. If a large payload has to be copied when the container starts, its bootstrap time increases.

So we need a mechanism to share the same on-disk data efficiently among containers. Union-capable file systems came into existence to address the challenges listed above.


Union File System

A union file system works on top of other file systems. It gives a single, coherent, unified view of the files and directories of separate file systems. In other words, it mounts multiple directories on a single root. It is more of a mounting mechanism than a file system.
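The idea of a single merged view over several directories can be modelled in user space. The sketch below is only an illustration of the concept, not how a kernel union file system is implemented: it handles flat directories only, and the directory and file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def merged_view(layers):
    """Return {file name: path} for a union of flat directories.

    `layers` is ordered from highest priority (upper) to lowest.
    When a name exists in several layers, the first one wins.
    """
    view = {}
    for layer in layers:
        for entry in sorted(os.listdir(layer)):
            view.setdefault(entry, Path(layer) / entry)
    return view


# Build two throwaway directories (hypothetical names) to merge.
root = Path(tempfile.mkdtemp())
upper, lower = root / "upper", root / "lower"
upper.mkdir()
lower.mkdir()
(lower / "a.txt").write_text("a from lower")
(lower / "b.txt").write_text("b from lower")
(upper / "b.txt").write_text("b from upper")

view = merged_view([upper, lower])
print(sorted(view))                # ['a.txt', 'b.txt']
print(view["b.txt"].read_text())   # b from upper
```

Both directories contribute to one listing, and where a name exists in both, the copy in the higher-priority directory is the one that is seen.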

Union Capable File System

In the above figure, you can see that multiple directories on different file systems are mounted on a common root. UnionFS, AUFS, and OverlayFS are a few popular examples of union file systems.


Properties of a Union File System

We need a file-system service with the following properties.

  1. Logical merge of multiple layers.
  2. Read-only lower layers, writable upper layer.
  3. Start reading from the upper layer, then fall back to the lower layers.
  4. Copy-on-write (CoW).
  5. Simulate removal from a lower directory through a whiteout file.

To simplify the above properties, substitute the term layer with directory. I will explain all of these properties using a use case.

Union File System (OverlayFS) : The Use Case

Here I am going to simulate a container’s file-system layers using three directories named Frontend, Backend, and Fullstack. You can think of the Frontend and Backend directories as the image (lower) layers, and of Fullstack as the container (upper) layer. The overlay, or merge layer, sits on top of all the directories and gives the application a logical, coherent, unified view of multiple physical directories. Let’s explore all the properties using this use case.

Create a sample directory structure and virtual partitions.


Experiment 1 : Mount multiple directories on a common mount point using ext* file-systems.

Conclusion: Multiple volumes can’t be combined on a single mount point; a later mount hides the earlier one instead of merging with it. Concrete file systems won’t help here.


Experiment 2: Mount multiple directories on a common mount point using Union File System (OverlayFS).

Conclusion: Yes, a union file system can mount multiple directories from different file systems on a single mount point.
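For reference, OverlayFS takes its lower, upper, and work directories as mount options. The sketch below only builds and prints such a command instead of running it, because mounting needs root privileges. The `/tmp` prefix, the `workdir` directory, and the `union` mount point are assumed names for illustration and may differ from the layout used in the experiment.

```python
import shlex


def overlay_mount_command(lowers, upper, work, target):
    """Build (but do not run) an OverlayFS mount command.

    `lowers` is ordered top-most first; OverlayFS joins them with ':'.
    `work` must be an empty directory on the same file system as `upper`.
    """
    options = f"lowerdir={':'.join(lowers)},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


# Hypothetical paths, chosen to mirror the use case.
cmd = overlay_mount_command(
    lowers=["/tmp/frontend", "/tmp/backend"],
    upper="/tmp/fullstack/upper",
    work="/tmp/fullstack/workdir",
    target="/tmp/union",
)
print(shlex.join(cmd))
```

The lower directories are listed with the top-most one first, so in this sketch a file present in both frontend and backend would be served from frontend.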


Experiment 3: Demonstrate Copy-on-write (CoW)

Copy-on-write is a strategy of sharing and copying in which processes that need access to the same data share a single instance of that data rather than each having its own copy. Only when one process wants to modify or write to the data does the operating system make a copy for that process to use. Only the writing process has access to the copy; all other processes continue to use the original data.

File access through OverlayFS retrieves data from the “upper” directory first and then falls back to the “lower” directory. Here the union mount retrieves the file “git” from the frontend directory, since the file doesn’t exist in the fullstack directory.

Modifications to files in the “upper” directory take place as usual. Modifying a file that comes from the “lower” layer first creates a copy in the upper layer, and that copy is the one modified. This leaves the base files untouched and available through direct access to the “lower” folder.
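The read fallback and the copy-up on write can be imitated with ordinary directories. This is a minimal user-space sketch of the behaviour described above, not the OverlayFS implementation. The `UnionDir` class and its methods are hypothetical helpers, and the directories are created in a temporary location.

```python
import shutil
import tempfile
from pathlib import Path


class UnionDir:
    """Toy union of one writable upper directory over read-only lowers."""

    def __init__(self, upper, lowers):
        self.upper = Path(upper)
        self.lowers = [Path(p) for p in lowers]

    def resolve(self, name):
        """Look in the upper directory first, then fall back to the lowers."""
        for layer in [self.upper, *self.lowers]:
            candidate = layer / name
            if candidate.exists():
                return candidate
        raise FileNotFoundError(name)

    def read(self, name):
        return self.resolve(name).read_text()

    def append(self, name, text):
        """Modify a file, copying it up first if it only exists in a lower."""
        target = self.upper / name
        if not target.exists():
            for lower in self.lowers:
                if (lower / name).exists():
                    shutil.copy2(lower / name, target)  # copy-up
                    break
        with target.open("a") as handle:
            handle.write(text)


root = Path(tempfile.mkdtemp())
frontend, backend, upper = root / "frontend", root / "backend", root / "fullstack"
for directory in (frontend, backend, upper):
    directory.mkdir()
(frontend / "git").write_text("v1\n")

union = UnionDir(upper, [frontend, backend])
print(repr(union.read("git")))               # 'v1\n', served from frontend
union.append("git", "v2\n")                  # triggers the copy-up
print(repr((upper / "git").read_text()))     # 'v1\nv2\n'
print(repr((frontend / "git").read_text()))  # 'v1\n', lower is untouched
```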


Experiment 4: Deleting files from the “lower” layer.

Removing a file through the union mount deletes it from the “upper” directory if it exists there, and simulates its removal from the “lower” directory by creating what is called a “whiteout” file. With OverlayFS, the whiteout is stored in the “upper” directory (fullstack/upper) as a character device with device number 0/0, and it hides the file of the same name in the “lower” directories (frontend and backend) from the merged view. The lower directories themselves are never modified, so the file remains available through direct access to the “lower” directory, and the whiteout stays in the upper directory after the union is unmounted.
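The effect of a whiteout can be illustrated with a small in-memory model. Each layer is a plain dictionary, and, as OverlayFS does on disk, the sketch records the whiteout marker among the upper layer’s entries. The `lookup` and `remove` helpers and the `scala` file are hypothetical names used only for this illustration.

```python
WHITEOUT = object()  # marker standing in for a whiteout entry


def lookup(name, upper, lowers):
    """Resolve `name` through the layers, honouring whiteouts in upper."""
    if name in upper:
        if upper[name] is WHITEOUT:
            raise FileNotFoundError(name)
        return upper[name]
    for layer in lowers:
        if name in layer:
            return layer[name]
    raise FileNotFoundError(name)


def remove(name, upper, lowers):
    """Delete `name` from the merged view without touching the lowers."""
    lookup(name, upper, lowers)          # raises if the file is not visible
    in_lower = any(name in layer for layer in lowers)
    upper.pop(name, None)                # drop the upper copy, if any
    if in_lower:
        upper[name] = WHITEOUT           # hide the lower copy


frontend = {"git": "git data"}
backend = {"scala": "scala data"}
fullstack = {}

remove("git", fullstack, [frontend, backend])

try:
    lookup("git", fullstack, [frontend, backend])
except FileNotFoundError:
    print("git is hidden in the merged view")
print("git" in frontend)   # True: the lower layer is untouched
```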


Conclusion

The number of containers can grow very large, so it is a good idea to leverage a union-capable file system. It offers a sensible way of sharing data among containers while preserving the integrity of the underlying file system.


Written by 

Mayank is a polyglot programmer who believes in selecting the right tool for the job. He has more than 8 years of experience on the Java platform. He has been a Scala enthusiast ever since he came to know this beautiful language in 2010. He has been developing enterprise applications on the reactive stack. He's a big fan of agile development, scalable software and elegant code. Mayank has extensive knowledge across a huge spectrum of software areas, and the ability to dive deeply into a new technology and reach expert level in no time. He finds it fun to architect complex systems in the simplest way and is quite handy with design patterns, micro-services & DevOps technologies. On the personal front, he is a marathon runner, spiritual learner & yoga practitioner.