Persistent Volumes and Persistent Volume Claims in Kubernetes - I

Kubernetes has a PersistentVolume subsystem that abstracts the details of how storage is provisioned from how it is consumed. As the developer of a pod, you wouldn’t want to worry about the actual network storage infrastructure available in the cluster, would you? Anything pertaining to the infrastructure should be the sole responsibility of the cluster administrator. If a developer requires 50Gi of space, they should get that space. Period.

To facilitate this abstraction, Kubernetes provides two resources: PersistentVolume and PersistentVolumeClaim.

TL;DR

PersistentVolume (PV): The cluster administrator sets up the underlying storage and then registers it in Kubernetes using the PersistentVolume resource. It is a resource in the cluster just like a Pod or a Deployment is. While a regular volume’s lifetime may depend on the Pod or the host, PVs have a lifecycle independent of any individual Pod or host. Also, PersistentVolumes don’t belong to any namespace. Their existence spans the whole cluster.

PersistentVolumeClaim (PVC): When a developer wants to use persistent storage in one of their pods, they request it in a PersistentVolumeClaim (PVC) manifest and submit it to the API server. Kubernetes will find an appropriate PersistentVolume and bind it to the claim, which the developer’s pod can then use.
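To make the claim side a little more concrete, here is a minimal sketch of what such a PVC manifest could look like. The claim name is a placeholder invented for this illustration, and the 50Gi request simply mirrors the figure mentioned in the introduction.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical name, used only for this sketch
  name: test-persistent-volume-claim
spec:
  # The claim asks for a volume that a single node can mount read-write
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # Minimum capacity the bound PV must offer
      storage: 50Gi
```

Note that the claim says nothing about where the storage lives. That detail stays with the PV and the administrator.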

Lifecycle of a volume

Like I said, a PV’s existence spans the whole cluster. Configuring the storage is known as provisioning, and there are two ways to provision a PV: statically and dynamically.

Static Provisioning of a PV

Static provisioning is when a cluster administrator manually creates one or more PVs. It involves writing a manifest file by hand and submitting it with the `kubectl create -f` command.

Dynamic Provisioning of a Persistent Volume

Automation is good. Almost breathtaking! PVs can be dynamically provisioned, meaning the cluster may try to provision a volume specifically for a claim. This is done through another Kubernetes object called a StorageClass. The administrator just needs to configure a storage class and Kubernetes will take care of the rest. For every claim that requests that storage class, Kubernetes will try to provision a volume according to its configuration.
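As a rough sketch of how the two pieces fit together, the manifest below defines a StorageClass and a claim that asks for it. It assumes a Minikube cluster with the k8s.io/minikube-hostpath provisioner; the object names and the 1Gi size are placeholders invented for this illustration.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # Hypothetical class name
  name: example-hostpath
# Assumes Minikube's hostPath provisioner is available in the cluster
provisioner: k8s.io/minikube-hostpath
# What happens to a provisioned PV once its claim is deleted
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical claim name
  name: example-dynamic-claim
spec:
  # Refers to the StorageClass above by name
  storageClassName: example-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

With a setup like this, nobody creates a PV by hand. The provisioner named in the class is expected to create one that matches the claim.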

Persistent Volume

Now that you have the gist of what a PV is, let’s put our administrator hat on and configure a PersistentVolume. Like any other Kubernetes resource, a PersistentVolume manifest has four primary fields: apiVersion, kind, metadata, and spec.

Example 1: A hostPath PersistentVolume (if you’re trying it on Minikube)

apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-persistent-volume
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Delete
  hostPath:
    path: /home/shubham/data

Let’s start from the beginning of the manifest.

apiVersion: PersistentVolumes have been around for quite some time. They are defined in version 1 (v1) of the API.
kind: The kind of the resource goes here. PersistentVolume, in this case.
metadata: Object metadata, such as the name of the PV, goes here.
spec: Defines the specification of the volume: its capacity, access modes, reclaim policy, and underlying storage.

PersistentVolume in action!!

Now, to create the PV object, use the CLI command `kubectl create -f persistent-volume.yaml`. After validating the manifest file, Kubernetes will create a new PV object. List the PV objects using the CLI command `kubectl get pv`.

After a PV object is created, it gets a status field which specifies the state that Volume is in. It can be one of the following phases:

Available: The volume is not yet bound to any claim.
Bound: The volume is bound to a claim.
Released: The claim has been deleted, but the resource is not yet reclaimed by the cluster. The cluster administrator may decide to manually remove the data and the object.
Failed: The volume has failed its automatic reclamation.
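The phase lives under status.phase on the PV object. As a small sketch, a freshly created PV that no claim has bound yet would carry a stanza like the one below when viewed as YAML (for example with `kubectl get pv <name> -o yaml`). Every other field is left out here, and what you actually see depends on your cluster.

```yaml
# Fragment of a PV object; all other fields are omitted in this sketch
status:
  # One of Available, Bound, Released, or Failed
  phase: Available
```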

And if you’d like more information about the PV object that was just created, use the `kubectl describe pv test-persistent-volume` command. It lists all the information about it.

Describing the PersistentVolume object

If you decide that you no longer want to use this volume, delete it with the `kubectl delete pv test-persistent-volume` command.

The PersistentVolume Spec

When creating a PV, the administrator needs to tell Kubernetes what its capacity and access modes are. They also need to specify its reclaim policy (what to do when the PV is released) and the configuration of the underlying storage.

Reclaim Policy

A PersistentVolume’s reclaim policy specifies the action to take when the PV is released from its claim. When a user is done using the volume, they can delete the corresponding PVC object. Currently, PersistentVolumes have the following three reclaim policies:

Retain: When the claim for a PV is deleted, the PV object continues to exist, but its status goes from Bound to Released. It is still not available for use by another claim. The administrator has to reclaim the volume manually by deleting the PV object and scrubbing the data left behind by the Pod that used it.
Recycle: This policy performs a basic scrub of the volume’s contents and makes the volume available to be claimed again. *This policy is now deprecated.*
Delete: This policy removes both the PV object and the underlying storage, including the data left behind by the Pod. For volumes that were dynamically provisioned using the k8s.io/minikube-hostpath provisioner, this is the default reclaim policy.
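The policy is set through the persistentVolumeReclaimPolicy field in the PV spec. As a sketch, here is a variation of Example 1 that keeps the volume and its data around after the claim is deleted. The PV name and the host path are placeholders invented for this illustration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  # Hypothetical name
  name: retained-persistent-volume
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  # Keep the PV object and its data after the claim is deleted;
  # the PV then moves to Released until an administrator cleans it up
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    # Hypothetical directory on the node
    path: /mnt/data
```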

Access Mode

Each PV gets its own set of access modes. They define the mounting behaviour. These access modes are:

ReadWriteOnce(RWO): Only a single node can mount the volume for reading and writing.
ReadOnlyMany(ROX): The volume can be mounted by multiple nodes for reading.
ReadWriteMany(RWX): Multiple nodes can mount the volume for reading and writing.

A volume can be mounted using only one access mode at a time, even if it supports several.
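Access modes are listed under spec.accessModes, and a PV may advertise more than one. The sketch below assumes an NFS export, since NFS is a backend that can support all three modes. The PV name, server address, export path, and size are all placeholders invented for this illustration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  # Hypothetical name
  name: example-nfs-volume
spec:
  capacity:
    storage: 10Gi
  # Modes the volume supports; each mount still uses only one of them
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    # Hypothetical NFS server and export path
    server: nfs.example.internal
    path: /exports/data
```

The list describes what the volume is capable of. It is the claim that decides which mode it asks for.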

Conclusion

PersistentVolumes make it easy to obtain persistent storage without the developer having to deal with the actual storage technology used underneath. This indirect way of creating storage and making it ready for use is much simpler than having the developer configure the storage themselves. But PersistentVolumes alone don’t amount to much. They are meant to be claimed by another Kubernetes object before they can actually be used. This object is the PersistentVolumeClaim. In the next blog, I’ll write about how existing PersistentVolumes can be used by creating a PersistentVolumeClaim object and having the Pod developer use that claim. Thank you for sticking around till the end.

Until next time!🍺