ZFS originally stood for Zettabyte File System. Development began at Sun Microsystems in 2001, led by Matthew Ahrens and Jeff Bonwick. Sun released ZFS as open source as part of OpenSolaris in 2005, but after Oracle Corporation acquired Sun Microsystems in 2010, Oracle continued its own ZFS development under a closed license, while open development carried on in the OpenZFS project. ZFS is best known for its storage pools and the features built on them.
ZFS is also the basis of enterprise storage products such as the Oracle ZFS Storage Appliance, a multiprotocol storage system designed to improve application performance, simplify management, and increase storage capacity. Such systems combine standard enterprise-grade hardware with a storage-optimized operating system.
Features of ZFS
ZFS has many features such as:
- Pooled Storage
- Data Integrity
In this post, we discuss pooled storage in ZFS.
ZFS Storage Pool
The layout of a ZFS storage pool has a large impact on the performance of the system. ZFS integrates the features of a file system and a volume manager, so the pool layout determines both how data is protected and how fast it can be read and written. Choosing the right configuration for a given workload is the administrator's responsibility, so it is important to understand the mechanics of pool performance when designing a storage system.
There are several metrics that quantify the performance of the pool:
• Read I/O operations per second (IOPS)
• Write IOPS
• Streaming read speed
• Streaming write speed
• Storage space efficiency
• Fault tolerance
A ZFS storage pool consists of one or more virtual devices, generally called vdevs. A vdev is either a single disk, two or more disks that mirror each other, or a group of disks organized together as a RAID-Z group. The RAID layout is set on each vdev, not on the storage pool as a whole. Data in the storage pool is striped across all vdevs, which means that the loss of a single vdev results in the failure of the whole pool.
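The rule that a pool is lost as soon as any one vdev is lost can be sketched in a few lines of shell. This is only a toy model, not a ZFS tool: the helper name `pool_survives` and the `failed:tolerated` argument format are made up for illustration, and real pools report richer states than the two printed here.

```shell
# Hypothetical helper (not part of ZFS). Each argument describes one vdev
# as "failed:tolerated", e.g. "1:1" means one failed disk in a vdev that
# can tolerate one failure (such as a two-disk mirror).
pool_survives() {
    for vdev in "$@"; do
        failed=${vdev%%:*}       # text before the colon
        tolerated=${vdev##*:}    # text after the colon
        if [ "$failed" -gt "$tolerated" ]; then
            # One vdev beyond its tolerance is enough to lose the pool.
            echo "LOST"
            return 0
        fi
    done
    echo "SURVIVES"
}

pool_survives 1:1 0:1   # two mirrors, one disk lost in the first: SURVIVES
pool_survives 2:1 0:1   # both disks of the first mirror lost: LOST
```

The second call shows why redundancy has to be planned per vdev: the healthy second mirror cannot compensate for the first one failing completely.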
The data in a storage pool is spread over several devices. When data is written to a pool of striped vdevs, it is broken into small chunks called “blocks”, which are distributed roughly evenly across the disks in the pool. Striping gives good performance and makes all of the raw storage space usable. However, a purely striped pool has no fault tolerance: if any disk fails, all the data in the pool is lost.
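As a rough picture of how blocks end up on different disks, the sketch below assigns block numbers to vdevs in simple round-robin order. This is a deliberate simplification: the function name `stripe_target` is hypothetical, and real ZFS allocation is dynamic rather than strictly round-robin.

```shell
# Simplified model: block i is placed on vdev (i mod number_of_vdevs).
# stripe_target is an illustrative name, not a ZFS command.
stripe_target() {
    block=$1
    vdevs=$2
    echo $(( block % vdevs ))
}

# Place six blocks on a pool of three single-disk vdevs.
i=0
while [ "$i" -lt 6 ]; do
    echo "block $i -> vdev $(stripe_target "$i" 3)"
    i=$(( i + 1 ))
done
```

Every vdev ends up holding part of the data, which is why losing any one of them makes the data as a whole unreadable.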
Each drive in a mirrored vdev stores an exact copy of all the data in that vdev. Traditional RAID-1 mirrors typically used two drives; ZFS allows more drives in a mirror for additional protection against data loss. Mirroring increases fault tolerance and removes the main weakness of striped vdevs: the vdev keeps working as long as at least one of its drives survives. If all the drives in a mirrored vdev are the same size, the usable storage space equals the size of a single drive. If the drives differ in size, the usable storage space equals the size of the smallest drive in the mirror.
Mirrored vdevs also handle simultaneous reads well, because any drive in the mirror can serve a read. Every write, however, must go to all the drives in the mirror to keep the copies identical, so writes do not gain the same parallelism.
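The capacity rule for a mirror can be written as a small shell function. This is an illustrative sketch: `mirror_capacity` is a made-up name, and the drive sizes are assumed to be whole numbers in a common unit (GB here).

```shell
# Usable capacity of one mirrored vdev = size of its smallest drive.
# Arguments: drive sizes as whole numbers in the same unit (assumed GB).
mirror_capacity() {
    smallest=$1
    shift
    for size in "$@"; do
        if [ "$size" -lt "$smallest" ]; then
            smallest=$size
        fi
    done
    echo "$smallest"
}

mirror_capacity 4000 4000        # equal drives: prints 4000
mirror_capacity 4000 2000 8000   # mixed drives: prints 2000
```

In the second call, the extra space on the 4000 GB and 8000 GB drives is simply unused, which is why mirrors are usually built from drives of the same size.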
Creating a pool with one vdev:
zpool create file1 mirror c0t0d0 c0t1d0
Creating a new pool with two vdevs, each a RAID-Z group of three disks (two disks' worth of data and one disk's worth of parity):
zpool create file2 raidz c0t0d0 c0t1d0 c0t2d0 raidz c0t3d0 c0t4d0 c0t5d0
Adding more vdevs:
zpool add file1 mirror c0t2d0 c0t3d0
zpool add file2 raidz c0t6d0 c0t7d0 c0t8d0
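To get a feel for how much space such a layout provides, the sketch below estimates usable capacity as (disks - parity) x disk size per RAID-Z vdev. The function name `raidz_capacity` and the 1000 GB disk size are assumptions for illustration, and the estimate ignores metadata and padding overhead, so a real pool will report somewhat less.

```shell
# raidz_capacity DISKS PARITY SIZE
# Approximate usable size of one RAID-Z vdev made of equal-sized disks.
raidz_capacity() {
    disks=$1
    parity=$2
    size=$3
    echo $(( (disks - parity) * size ))
}

# file2 after the zpool add above: three RAID-Z vdevs of three disks each,
# with single parity. The 1000 GB disk size is an assumed figure.
per_vdev=$(raidz_capacity 3 1 1000)
echo "per vdev: ${per_vdev} GB"
echo "pool:     $(( per_vdev * 3 )) GB"
```

With these assumed numbers, each vdev contributes about 2000 GB and the pool about 6000 GB out of 9000 GB of raw disk.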
The number after RAID-Z gives the parity level: RAID-Z1, RAID-Z2, and RAID-Z3 use single, double, and triple parity, so a vdev can survive the loss of one, two, or three drives respectively. RAID-Z writes parity information along with the data whenever a write operation is performed. This protects against data loss and helps ensure the integrity of the data. If a drive fails, the missing data can be reconstructed from the remaining data and the parity information. RAID-Z also avoids the write hole that affects traditional parity RAID such as RAID-5, where a crash between the data update and the parity update can leave a stripe inconsistent.
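The idea of rebuilding lost data from parity can be shown with XOR, the operation behind single parity. This is a toy illustration on two small numbers, not ZFS's on-disk layout; the helper name `xor_parity` and the values are made up, and double or triple parity needs more involved arithmetic than plain XOR.

```shell
# XOR of two values; used both to compute parity and to rebuild data.
xor_parity() {
    echo $(( $1 ^ $2 ))
}

d1=173   # data chunk on disk 1 (arbitrary example value)
d2=58    # data chunk on disk 2 (arbitrary example value)
parity=$(xor_parity "$d1" "$d2")

# Suppose disk 1 fails: XOR the surviving chunk with the parity
# to recover the lost chunk.
rebuilt=$(xor_parity "$d2" "$parity")
echo "parity=$parity rebuilt=$rebuilt"   # prints parity=151 rebuilt=173
```

The rebuilt value matches the original `d1`, which is the same principle a single-parity vdev relies on when one drive is missing.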
Putting vdevs together
A ZFS pool is built by combining one or more vdevs. Because data is striped across all the vdevs in the pool, adding vdevs increases the pool's capacity and performance, while redundancy is provided inside each individual vdev.