The Container Storage Interface (CSI) is a standard for exposing arbitrary file and block storage systems to containerized workloads in Kubernetes (or similar container orchestrators).
There are a large number of CSI-drivers available for various storage systems[1]. To make this article actionable I will concentrate on a single CSI-driver and go through a full example of using it. How to install a given CSI-driver into your cluster, and how to use it, differs between clusters and drivers, so it is important to read the documentation for the CSI-driver you want to use.
In the rest of this article I will go beyond the local Minikube cluster I have been using so far in this series. Instead I will set up an Elastic Kubernetes Service (EKS) cluster on Amazon Web Services (AWS)[2]. A real cluster! We are finally getting somewhere!
Working with a CSI-driver
The general workflow when working with a CSI-driver is:
- Prepare your cluster
- Install the CSI-driver into your cluster
- Create PersistentVolumes
- Create PersistentVolumeClaims
What exactly the first two steps in this list include depends on the CSI-driver you select. In the example I will show, the first two steps are very easy and we will hardly notice them.
Create an AWS EKS Kubernetes cluster
Warning: if you follow the steps outlined below you will be charged a small amount of money! How much depends on how long you keep the cluster running. You will pay both for the EKS-cluster with its associated EC2-instances, and for the EBS-volume that will be the backing media for the PersistentVolume we create.
This article is not a tutorial on how to work with AWS. To follow along with the steps in this article you must have an AWS account. You will also need to install and configure the AWS CLI. The required steps are documented in the official AWS documentation: installation and configuration.
Install prerequisites
AWS, together with Weaveworks, have created a tool called eksctl that simplifies the creation of Kubernetes clusters (EKS) on AWS. This tool can be installed (on macOS) with Homebrew:
$ brew tap weaveworks/tap
$ brew install weaveworks/tap/eksctl
The first command (brew tap weaveworks/tap) installs the Weaveworks Brew repository. The second command (brew install weaveworks/tap/eksctl) installs the eksctl CLI tool from the Weaveworks repository. In the following sections I will use eksctl for various tasks; the full documentation for this tool can be found at eksctl.io.
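If you want to verify that the installation succeeded you could, for example, print the version of eksctl:
$ eksctl version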
Creating the cluster
eksctl allows me to declaratively define what type of Kubernetes cluster I want to create, using a custom YAML schema that looks similar to a regular Kubernetes manifest. In eksctl terms this file is called a config file. This is the config file for my Kubernetes cluster:
# cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cluster
  region: eu-north-1
iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: ebs-csi-controller-sa
        namespace: kube-system
      wellKnownPolicies:
        ebsCSIController: true
nodeGroups:
  - name: nodegroup-1
    instanceType: m5.large
    desiredCapacity: 3
    iam:
      withAddonPolicies:
        ebs: true
In cluster.yaml I define settings for my cluster named cluster (not very original). I say that I want my cluster to be created in the Swedish region (eu-north-1). I add a single node group to my cluster; a node group is a collection of nodes (virtual machines) with a given specification. I configure my cluster to have the necessary Service Accounts[3] and IAM-policies needed to work with the EBS CSI-driver. The details of this are beyond the scope of this article.
I create the cluster from my cluster config file with the following command:
$ eksctl create cluster -f cluster.yaml
Creating an EKS-cluster is time-consuming, so you will need to wait 15-20 minutes for this operation to complete.
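While you wait you can, for example, list your clusters with eksctl; the STATUS column should report the cluster as CREATING until it is ready, and ACTIVE afterwards:
$ eksctl get cluster --region eu-north-1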
Authenticate to the cluster
Now I have a Kubernetes cluster in the form of EKS.
To be able to communicate with my EKS-cluster I need to configure a kubeconfig file. We have not discussed kubeconfig files so far in this Kubernetes-101 series, and we will not really do so in this article either. For now, just know that to be able to communicate with a given cluster we need some form of credentials. These credentials are stored in files referred to as kubeconfig files; we will go through them in more detail in a future article.
As it turns out, eksctl automatically sets up a kubeconfig for us in /Users/<username>/.kube/config (on macOS) and activates it for us, so we can immediately start working with our cluster.
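A quick way to verify that the kubeconfig works is to list the nodes of the new cluster; with the config file above you should see the three m5.large nodes of nodegroup-1:
$ kubectl get nodes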
Installing the CSI-driver
After creating my cluster with eksctl I have completed the first step of my four-step guide to CSI-drivers. Next I need to install my CSI-driver into my cluster. There are instructions available at the EBS CSI-driver GitHub page. The essential step in the instructions is that I should run kubectl apply with the manifests that they provide:
$ kubectl apply \
-k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.14"
To make sure that the CSI-driver has been installed I list the running Pods in the kube-system Namespace:
$ kubectl get pods -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
aws-node-675g8                       1/1     Running   0          20m
aws-node-f5dg8                       1/1     Running   0          20m
aws-node-k46g7                       1/1     Running   0          20m
coredns-d5b9bfc4-8dfzb               1/1     Running   0          32m
coredns-d5b9bfc4-bjqcj               1/1     Running   0          32m
ebs-csi-controller-988ff97c9-lwlhf   6/6     Running   0          1m
ebs-csi-controller-988ff97c9-n72mp   6/6     Running   0          1m
ebs-csi-node-bsvl9                   3/3     Running   0          1m
ebs-csi-node-sqlvc                   3/3     Running   0          1m
ebs-csi-node-ssqtf                   3/3     Running   0          1m
kube-proxy-2n4rq                     1/1     Running   0          20m
kube-proxy-jmxmd                     1/1     Running   0          20m
kube-proxy-klhvf                     1/1     Running   0          20m
I can see two ebs-csi-controller-... Pods and three ebs-csi-node-... Pods. That completes step two of my four-step guide.
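As an additional sanity check you can list the CSIDriver objects registered in the cluster; the EBS CSI-driver registers itself under the name ebs.csi.aws.com:
$ kubectl get csidrivers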
Using the CSI-driver
Now I want to use the CSI-driver.
In the previous article on PersistentVolumes and PersistentVolumeClaims we learned that, as Kubernetes administrators, our first step is to create a PersistentVolume that our Kubernetes application developers can later claim through a PersistentVolumeClaim. In this section I will complete steps three and four in the four-step guide to CSI-drivers!
For our Kubernetes administrators to be able to use the CSI-driver for EBS-volumes, I must first create an EBS-volume. For this I use the AWS CLI aws ec2 create-volume command:
$ aws ec2 create-volume \
--availability-zone eu-north-1a \
--volume-type gp2 \
--size 100 \
--query VolumeId \
--output text
vol-097ab0b089cd15bdf
I have created a general purpose (gp2) EBS-volume of size 100 Gibibytes in the eu-north-1a availability zone of the Stockholm (eu-north-1) region. Next I can use this EBS-volume to create a PersistentVolume in my EKS-cluster. Note that the capacity I declare in the PersistentVolume below (5Gi) is smaller than the EBS-volume itself; for a statically provisioned PersistentVolume like this one, Kubernetes does not verify the declared capacity against the backing media. The manifest looks like this:
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ebs-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  csi:
    driver: ebs.csi.aws.com
    fsType: ext4
    # provide the VolumeId for the EBS-volume
    volumeHandle: vol-097ab0b089cd15bdf
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - eu-north-1a
An EBS-volume is located in a single availability zone[4], and any workload that wishes to use it must be placed on a node located in the same availability zone. In a production environment you would create a lot more than a single EBS-volume, and you would place them in different availability zones[5]. In the manifest above I specify in .spec.nodeAffinity that this PersistentVolume may only be used from nodes in the eu-north-1a availability zone, because this is where my EBS-volume is located.
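If you are curious which availability zone each of your nodes ended up in, you could for example print the value of the zone label for all nodes:
$ kubectl get nodes -L topology.kubernetes.io/zone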
I create my PersistentVolume using kubectl apply:
$ kubectl apply -f pv.yaml
persistentvolume/ebs-pv created
To make sure my PersistentVolume was created I run kubectl get pv:
$ kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
ebs-pv   5Gi        RWO            Retain           Available                                   30s
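To inspect the details of the PersistentVolume, for example to confirm the CSI source and the node affinity, you could describe it:
$ kubectl describe pv ebs-pv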
I will use this PersistentVolume from a simple application. I create a composite manifest for a Pod and a PersistentVolumeClaim. The PersistentVolumeClaim will ask for 5Gi of storage. The manifest for this application looks like this:
# application.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  storageClassName: ""
  volumeName: ebs-pv # match the name of the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args:
        ["-c", "while true; do echo $(date -u) >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: ebs-claim
In .spec.containers[*].args I have created an infinite loop that appends the current date and time to a file at the path /data/out.txt and then sleeps for five seconds. The EBS-volume is mounted at the /data path.
I create my application using kubectl apply:
$ kubectl apply -f application.yaml
persistentvolumeclaim/ebs-claim created
pod/app created
I verify that my Pod starts up using kubectl get pods:
$ kubectl get pods
NAME   READY   STATUS    RESTARTS   AGE
app    1/1     Running   0          1m12s
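Since the PersistentVolume restricts placement to nodes in eu-north-1a, you could also check which node the Pod was scheduled on using the wide output format:
$ kubectl get pod app -o wide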
I can once again check what my PersistentVolume looks like, and this time I can see that it has been claimed:
$ kubectl get persistentvolumes
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
ebs-pv   5Gi        RWO            Retain           Bound    default/ebs-claim                           3m48s
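You can see the same binding from the claim's perspective by listing the PersistentVolumeClaim; its STATUS should likewise be Bound:
$ kubectl get pvc ebs-claim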
I can also verify that I have data in my Volume by watching how new data is appended to my output file:
$ kubectl exec -it app -- tail -f /data/out.txt
Thu Dec 29 06:33:09 UTC 2022
Thu Dec 29 06:33:14 UTC 2022
Thu Dec 29 06:33:19 UTC 2022
Thu Dec 29 06:33:24 UTC 2022
Thu Dec 29 06:33:29 UTC 2022
Thu Dec 29 06:33:34 UTC 2022
Thu Dec 29 06:33:39 UTC 2022
Thu Dec 29 06:33:44 UTC 2022
Thu Dec 29 06:33:49 UTC 2022
Thu Dec 29 06:33:54 UTC 2022
...
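If you want to convince yourself that /data really is a separate filesystem backed by the EBS-volume, and not just a directory on the node's root disk, you could for example check the mount inside the Pod:
$ kubectl exec app -- df -h /data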
That concludes this exercise! It was a lot of work to get to this point, but once we are here it is easy to use the EBS CSI-driver for all our Volume needs.
Deleting my cluster
Now that I am done with my test I can remove my cluster, and all the workloads that are currently running on it, using eksctl delete cluster:
$ eksctl delete cluster -f cluster.yaml
This process takes a few minutes.
I should also remove the EBS-volume I created; for this I use the AWS CLI:
$ aws ec2 delete-volume --volume-id vol-097ab0b089cd15bdf
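If you want to double-check that the volume really is gone you could try to describe it; once the deletion has completed this command should fail with an InvalidVolume.NotFound error:
$ aws ec2 describe-volumes --volume-ids vol-097ab0b089cd15bdf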
Summary
What a ride! In this article we looked at CSI-drivers. CSI-drivers are a way for Kubernetes to work with a large number of storage backends through a common interface. I specifically showed you a CSI-driver for EBS-volumes in AWS. To this end I set up an EKS-cluster in AWS with a tool called eksctl.
In the next article in this series we will take a look at Jobs and CronJobs. These are workload resources that create Pods to perform a task once, or to repeat a task at a fixed interval.
[1] See the list available at https://kubernetes-csi.github.io/docs/drivers.html. Note that this is not an exhaustive list!
[2] I won't be able to cover how EKS works in this article, but if you are interested in learning more you could visit www.eksworkshop.com.
[3] We have not covered Service Accounts yet, but we will do so in a future article.
[4] An AWS availability zone is a data-center located in an AWS region. A region is usually made up of three or more availability zones that are located close to each other, but not so close that a natural disaster would take out more than one data-center (availability zone).
[5] You could also utilize dynamic provisioning of PersistentVolumes and have the backing media automatically created for you, but we will not do that in this article.