In the previous article we spent some time learning about the basics of Volumes. We saw how to create Volumes backed by a directory on our host machine (in my case the host machine is the Minikube VM), as well as from a ConfigMap in our cluster.
In this article we will look at PersistentVolumes and PersistentVolumeClaims. We saw how a Volume is part of the Pod specification, and that it is not a separate Kubernetes object. Although the data in our Volume survived when we deleted our Pod, the Volume itself was deleted along with the Pod. A PersistentVolume is a separate Kubernetes object, and its lifecycle is completely separate from the Pod that is using it. This is the kind of storage resource you want to use for your databases or for any other data that should persist well beyond the Pod resources that are using the data. The PersistentVolumeClaim is a separate Kubernetes resource that is used to request storage with certain properties, and it is the resource you connect to your Pods.
The following image illustrates the relationship between a Pod, a PersistentVolumeClaim, and a PersistentVolume.
## PersistentVolume
A PersistentVolume is a cluster resource, similar to how a node (a bare-metal or virtual machine) is a cluster resource. Typically your cluster administrator will create PersistentVolumes for you, and you as a humble Kubernetes application developer would consume the PersistentVolumes in your application (through a PersistentVolumeClaim, more on that in the next section). A cluster resource does not exist in a specific Namespace.
Like a regular Volume, a PersistentVolume will be backed by a storage media of some sort, since the data has to be stored somewhere. This would typically be the storage media provided by a cloud provider, e.g. Elastic Block Store (EBS) on AWS.
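To make the idea of different backing media concrete, here is a sketch of what a PersistentVolume backed by an NFS share could look like (the server name and export path are made up for illustration; we will not use this manifest in this article):

```yaml
# nfs-pv.yaml -- hypothetical PersistentVolume backed by an NFS share
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany # NFS supports simultaneous access from many nodes
  nfs:
    server: nfs.example.com # made-up server name
    path: /exports/data     # made-up export path
```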
I said that your cluster administrator might create PersistentVolumes for you to use; this is called static provisioning. There is also dynamic provisioning, where PersistentVolumes are created on the fly as they are requested by an application. In this article we will only consider the static approach.
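For completeness: dynamic provisioning is driven by a StorageClass resource. A claim names a class, and the class's provisioner creates a matching PersistentVolume on demand. A minimal sketch (the provisioner shown is the hostPath provisioner that Minikube ships with; the class name is made up, and we will not use this class in this article):

```yaml
# storageclass.yaml -- sketch of a StorageClass for dynamic provisioning
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dynamic-local
provisioner: k8s.io/minikube-hostpath # Minikube's built-in provisioner
reclaimPolicy: Delete                 # dynamically created PVs are deleted with their claim
volumeBindingMode: Immediate
```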
To create a PersistentVolume with a declarative approach we can use the following manifest[^1]:
```yaml
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: database-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/data"
```
The manifest in `pv.yaml` contains the usual fields of `.apiVersion`, `.kind`, `.metadata.name`, and `.spec`. In the specification I say the following:

- I want to create a PersistentVolume of size `1Gi` (1 gibibyte)
- I want the access mode to be `ReadWriteOnce`, which means that only a single node in my cluster can read and write to this PersistentVolume
- I provide a `storageClassName` and set it to a custom name of `local-storage`
- I say that when a PersistentVolumeClaim associated with this PersistentVolume is deleted I want to retain the data (`persistentVolumeReclaimPolicy: Retain`)
- Similar to what I did in the previous article on Volumes, I use a `hostPath` type of Volume because that is the easiest to use from Minikube (the current cluster I am using). See the documentation for explanations of all settings[^1].
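A quick aside on the `1Gi` quantity used above: Kubernetes distinguishes the binary suffix `Gi` (powers of 1024) from the decimal suffix `G` (powers of 1000). A small shell sanity check of the difference:

```shell
# "Gi" is binary (powers of 1024), "G" is decimal (powers of 1000)
echo "1Gi = $(( 1024 * 1024 * 1024 )) bytes"   # prints: 1Gi = 1073741824 bytes
echo "1G  = $(( 1000 * 1000 * 1000 )) bytes"   # prints: 1G  = 1000000000 bytes
```

So requesting `1G` instead of `1Gi` would give you roughly 7% less space.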
We can create our PersistentVolume using `kubectl apply`:

```shell
$ kubectl apply -f pv.yaml
persistentvolume/database-pv created
```
We can run `kubectl get` to see details about our PersistentVolume:

```shell
$ kubectl get persistentvolume database-pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
database-pv   1Gi        RWO            Retain           Available           local-storage            10s
```
In this output `RWO` is short for `ReadWriteOnce`. The column named `CLAIM` is empty, which means this PersistentVolume has not been associated with any PersistentVolumeClaim (as I said, more on this in the next section). We will re-run this command later in this article.
We can run `kubectl describe` to get additional details about our PersistentVolume:

```shell
$ kubectl describe persistentvolume database-pv
Name:            database-pv
Labels:          <none>
Annotations:     <none>
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    local-storage
Status:          Available
Claim:
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        1Gi
Node Affinity:   <none>
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /data
    HostPathType:
Events:            <none>
```
If we want to shorten the previous two commands we can replace `persistentvolume` with its short form `pv`, leading to `kubectl get pv ...` and `kubectl describe pv ...`. Now the short-hand version really pays off!
Before we can use our PersistentVolume we will need to create a PersistentVolumeClaim; more on this in the next section.
## PersistentVolumeClaims
Now we have to imagine that we are Kubernetes application developers and our Kubernetes cluster administrator has created a number of PersistentVolumes for us to use. Luckily we created a PersistentVolume ourselves in the previous section, so it is not hard to imagine we have one available. What we need to do now is to claim some storage from the pool of available PersistentVolumes. We do this through the resource known as a PersistentVolumeClaim. In a PersistentVolumeClaim we specify the properties of the storage we need, and the corresponding claim will be reserved from the pool of available PersistentVolumes.
Let’s create a PersistentVolumeClaim manifest:
```yaml
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: "1Gi"
```
Here I say that I request storage space of `1Gi` with access mode `ReadWriteOnce`, and that the storage class should be `local-storage`. This fits with the PersistentVolume we created in the previous section! The next step is to write a manifest for a Pod that uses our PersistentVolumeClaim:
```yaml
# pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: database-app
spec:
  # define the volumes available in this pod
  volumes:
    - name: database-storage
      persistentVolumeClaim:
        claimName: database-pvc
  containers:
    - name: db
      image: alpine
      command: ["sleep", "3600"] # hack to make the container stay alive
      # mount the volume defined above
      volumeMounts:
        - name: database-storage
          mountPath: "/mnt/data"
```
The important part of this manifest is `.spec.volumes[*].persistentVolumeClaim`, where I specify the name of a PersistentVolumeClaim resource. The rest of the manifest is similar to how we did it for regular Volumes in the previous article.
I place both my manifests into a directory named `pvc` and I run `kubectl apply` on the whole directory:

```shell
$ kubectl apply -f ./pvc
pod/database-app created
persistentvolumeclaim/database-pvc created
```
If I run `kubectl describe pod` on my new Pod I can see that it has an attached Volume from a PersistentVolumeClaim:

```shell
$ kubectl describe pod database-app
Name:         database-app
Namespace:    default
...
Containers:
  db:
    ...
    Mounts:
      /mnt/data from database-storage (rw)
Volumes:
  database-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  database-pvc
    ReadOnly:   false
...
```
Similarly if I check the details of my PersistentVolumeClaim:

```shell
$ kubectl describe persistentvolumeclaim database-pvc
Name:          database-pvc
Namespace:     default
StorageClass:  local-storage
Status:        Bound
Volume:        database-pv
Capacity:      1Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       database-app
```
I see that the PersistentVolumeClaim is used by the `database-app` Pod. The previous command was very long because `persistentvolumeclaim` is one of the longest resource type names in Kubernetes; luckily it can be shortened to `pvc`, so the previous command could be changed to `kubectl describe pvc database-pvc`. Did I say that we really saved time by writing `pv` instead of `persistentvolume`? Now we have taken this time-saving to an even higher level.
In the previous section I said that we would check on our PersistentVolume again once we had done something with it, so let’s run `kubectl get` to see details about our PersistentVolume again:

```shell
$ kubectl get persistentvolume database-pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS    REASON   AGE
database-pv   1Gi        RWO            Retain           Bound    default/database-pvc   local-storage            7m15s
```
We can now see that the `CLAIM` column has a value of `default/database-pvc`, the name of our PersistentVolumeClaim. We also see that the `STATUS` is now `Bound`.
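As an aside: if you do not want to rely on matching by storage class, size, and access mode, a claim can also pin a specific PersistentVolume by name through `spec.volumeName`. A hypothetical variant of our claim (not applied in this article):

```yaml
# pvc-pinned.yaml -- hypothetical claim that binds to one specific PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc-pinned
spec:
  volumeName: database-pv # bind to exactly this PersistentVolume
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: "1Gi"
```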
Now we have a Pod with a connected Volume that in turn uses the PersistentVolumeClaim we created together with the Pod, which in turn uses the PersistentVolume that we (or our Kubernetes cluster administrator) set up. We can use the Volume in our Pod just like we did in the previous article.
## Summary
In this article we encountered PersistentVolumes and PersistentVolumeClaims. The main point of these resources is to separate the lifecycle of the data stored in our volumes from the Pods that use it. This is a good practice!
Next article is the last in this mini-series of articles on Volumes. We will learn about Container Storage Interface (CSI) drivers.
[^1]: I go through the properties that I set in my manifest, but if you wish to see all available values and properties you can read the official Kubernetes documentation on PersistentVolumes at https://kubernetes.io/docs/concepts/storage/persistent-volumes/ or see the API specification at https://kubernetes.io/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-v1/