I’m going to veer from my usual topics today to talk about Kubernetes (K8s) and vSphere, given the new features in vSphere 7 (e.g. Project Pacific, Tanzu). K8s and Docker are everywhere these days – from many of VMware’s newer appliances like SRM to our own VSI plugin. Initially, containers were generally associated with being stateless, meaning nothing was saved while running them. Stateless containers can start, stop, restart, move to different nodes, etc. with ease since you don’t have to worry about storage. More and more, however, we see stateful containers which need persistent storage. And if you are using vSphere, as opposed to bare metal, where better to keep that persistent storage than on a datastore? The vSphere paradigm fits nicely because you can run all your K8s nodes in a vCenter, where they all have access to the same storage and can move freely among the cluster without the need to zone and provision to individual bare-metal hosts.

So I’m going to write a bit on how VMware came to integrate with this model and the latest developments that allow that persistent storage to be put on Virtual Volumes (vVols). I caution at the outset that I am no expert at this stuff. I can research well and follow directions (and then pull out my hair), and eventually I get there, which is how this work came about. It also means that you can do it too (hair optional).
vSphere Storage for Kubernetes
VMware first introduced the idea of running K8s on vSphere back in 2017 when vSphere Cloud Provider was released (vSphere Storage for Kubernetes – not to be confused with the new vSphere for Kubernetes or Project Pacific). The idea behind it was to provide a way for K8s to access vSphere storage. I did a bunch of testing with this at the time looking at persistent storage (with the VCP plugin) on our arrays with a VMware initiative you might remember as Project Hatchway. Over the ensuing years VMware decided to productize this into VMware Cloud Native Storage (CNS). CNS is directly integrated into vCenter as of 6.7 U3, so fairly recent. Using CNS you can run stateful K8s workloads by creating persistent storage on arrays like the PowerMax. Basically, CNS allows vSphere to understand K8s volumes.
CNS has two components, a vSphere volume driver for K8s, and the integration into vCenter shown here:
The volume driver is actually composed of two parts, a syncer and the vSphere Container Storage Interface (CSI) driver. The latest CSI driver, 2.0, is the one that supports provisioning to Virtual Volumes. You can find the support feature matrix for the 2.0 CSI driver as well as the different implementations here. For reference, when we get to my example I will be using Vanilla K8s (left-hand column in the matrix), which is what it sounds like: a plain K8s installation with the integration of the cloud provider and CSI (aka CNS).
CNS controls all aspects of volume provisioning with the storage control plane. It handles the entire lifecycle – creation, deletion, etc., including snapshots, clones and general health. One concept that is essential to this model is that these volumes are independent of the lifecycle of VMs. We are accustomed to understanding vmdks as part of the VM. Delete the VM, the vmdk goes with it. But not with this model. As we are working with block storage on the PowerMax (I’m not covering file on vSAN which is also supported), there is a second concept called First Class Disks (FCD) that comes into play.
FCDs are a concept VMware came up with to allow volumes to exist without having to be part of a VM – meaning you can control the lifecycle outside of the VM. This is why CNS supports block volumes backed by these FCDs. If you want more detail, Cormac Hogan from VMware has a good intro here.
When provisioning with the CSI driver we are going to use Storage Based Policy Management (SPBM). SPBM is a great model for FCD because for vVols it allows us to assign any array capabilities we want the volume to have, independent of the VM it may be associated with. For instance, on the PowerMax we can set the Service Level (QoS) for the volume. In the future it also might include replication or a backup policy. Note that you can also use SPBM with tags, so that you might designate a single VMFS as the target.
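To make the tag-based option concrete, a tag-based policy plugs into the CSI driver the same way a vVol capability policy does – you just name the policy in the Storage Class. Here is a sketch; the Storage Class name and the policy name "K8s_VMFS_Tag" are my own inventions and would be whatever you created in vCenter:

```yaml
# Hypothetical StorageClass referencing a tag-based SPBM policy,
# e.g. one whose rule matches a vSphere tag assigned to a single VMFS datastore.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powermax-vmfs-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "K8s_VMFS_Tag"   # name of your tag-based policy in vCenter
```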
OK, enough writeup I think, let’s look at how to provision an FCD with the CSI driver into a PowerMax Virtual Volume datastore. As I mentioned, I have Vanilla K8s. I’m not going to cover installation as there are lots of different instructions out there to install K8s and the CPI/CSI components. I used these 2 sources primarily, though not exclusively, if you want a place to start:
My setup is simple enough, comprising only 3 nodes. Within vSphere you can see my VMs in the k8s folder. My master is dsib0211, and I have 2 nodes, dsib0212 and dsib0213.
From kubectl you can see both the nodes and all the components. I’ve highlighted the containers in the screenshot that include both the vSphere Cloud Provider (cloud controller) and the CSI driver (csi controller on master, and csi on each node including the master).
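If you want to run the same checks yourself, the equivalent commands behind that screenshot would be something like the following (pod names will vary by deployment; in mine the relevant ones are the cloud controller and the CSI controller/node pods):

```console
# List the cluster nodes (master plus workers)
kubectl get nodes

# List the system pods, including the vSphere Cloud Provider
# (cloud controller) and the CSI controller and node pods
kubectl get pods --namespace kube-system -o wide
```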
Now a quick look at my vVols setup on the vCenter. I’ve registered a VASA Provider and created a new vVol datastore CNS_vVol_Container which is currently empty.
My container has two Service Levels, Diamond and Bronze.
So we’re set on the PowerMax side. Now we need a Storage Policy that we will call in the manifest (yaml file) from our K8s master. I’m going to set the Service Level to Diamond in the policy for this example:
So now we are good to create the two manifests we need, the Storage Class manifest and the PVC manifest.
I am going to use Dynamic Volume Provisioning, which basically means I am just going to ask K8s to create an FCD. There are other provisioning options (e.g. delaying creation until a pod requests the volume, known as WaitForFirstConsumer) but this one serves my demo purpose. So my first file creates the StorageClass. The name variable is arbitrary (though it must be lower case and has character restrictions), but you will use it in the next file. The “provisioner” is the CSI driver, and finally I need to pass the Storage Policy name. Run this with kubectl create and hopefully get the same result I did.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powermax-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "K8s_Diamond"
```

```
storageclass.storage.k8s.io/powermax-sc created
```
Once the Storage Class is created, you can execute the manifest for the creation of the volume. The most important field in this file is the “storageClassName”. Be sure it matches your manifest for the Storage Class as mine does. I also added a “labels” field which will add a description we can see in vCenter. You can also see I am going to create a 1 GB volume.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: powermax-pvc
  labels:
    app: vVols_FCD
spec:
  storageClassName: powermax-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1G
```

```
persistentvolumeclaim/powermax-pvc created
```
I can run a describe on the PVC to watch the creation. Currently it is pending:
```
root@dsib0211:/etc/kubernetes# kubectl describe pvc powermax-pvc
Name:          powermax-pvc
Namespace:     default
StorageClass:  powermax-sc
Status:        Pending
Volume:
Labels:        app=vVol_FCD
Annotations:   volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:
Events:
  Type    Reason                Age               From                          Message
  ----    ------                ----              ----                          -------
  Normal  Provisioning          47s               csi.vsphere.vmware.com_vsphere-csi-controller-577f5c7468-hl5zg_903d2c55-464f-4ec6-92c4-873ca88ea1f8  External provisioner is provisioning volume for claim "default/powermax-pvc"
  Normal  ExternalProvisioning  8s (x4 over 47s)  persistentvolume-controller   waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator
```
And when successful:
```
root@dsib0211:/etc/kubernetes# kubectl describe pvc powermax-pvc
Name:          powermax-pvc
Namespace:     default
StorageClass:  powermax-sc
Status:        Bound
Volume:        pvc-9b8b3bec-3ab3-4d0d-b131-53f0edc74c22
Labels:        app=vVol_FCD
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: csi.vsphere.vmware.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      954Mi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:
Events:
  Type    Reason                 Age                     From                          Message
  ----    ------                 ----                    ----                          -------
  Normal  Provisioning           5m59s                   csi.vsphere.vmware.com_vsphere-csi-controller-577f5c7468-hl5zg_903d2c55-464f-4ec6-92c4-873ca88ea1f8  External provisioner is provisioning volume for claim "default/powermax-pvc"
  Normal  ExternalProvisioning   4m35s (x7 over 5m59s)   persistentvolume-controller   waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator
  Normal  ProvisioningSucceeded  4m26s                   csi.vsphere.vmware.com_vsphere-csi-controller-577f5c7468-hl5zg_903d2c55-464f-4ec6-92c4-873ca88ea1f8  Successfully provisioned volume pvc-9b8b3bec-3ab3-4d0d-b131-53f0edc74c22
```
You can also see it in the vCenter tasks if you prefer:
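Though I stop at volume creation in this post, the point of a PVC is for a pod to consume it. A minimal sketch of such a pod follows – the pod name, image, and mount path are my own choices for illustration, not from my environment:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: powermax-pod           # hypothetical name
spec:
  containers:
    - name: app
      image: busybox           # any image that can write to the mount
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data     # the FCD appears here as a formatted filesystem
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: powermax-pvc   # must match the PVC name above
```

Once a pod like this is running, a describe of the PVC would list it under “Mounted By”.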
So what objects are now in our previously empty vVol datastore? Well, a couple of things. First, there is a new directory, catalog. The catalog contains the metadata location information of the FCDs. Makes sense. There is only one catalog folder for each vVol datastore, so the next time we create an FCD, the catalog is simply updated. The second folder is fcd. This is where, unsurprisingly, the FCD vVol is stored.
If we look into the fcd folder we can see that our persistent volume request created two files, the first is the FCD itself (vmdk) and the other is its metadata or in K8s parlance, the sidecar (vmfd).
On the array we have 4 vVol devices created for this initial FCD. The first one is for the catalog folder (00561), the second for the fcd folder (00562) and the other two are the FCD and sidecar respectively (00563 and 00564).
Finally, let’s take a look at the CNS integration in vCenter. If I navigate to the Monitor tab of the datastore, a different location from the screenshot above, my new FCD is listed, along with the label I provided for reference.
If you select the checkbox next to the FCD and then the detail box, you can get some more information about the volume.
Note that every subsequent persistent volume I create will generate only 2 new vVols, the FCD and the sidecar. The vVol that represents the catalog information will simply be updated. For example, adding a second 2 GB FCD will result in 00560 and 00565 being created:
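For reference, the PVC manifest behind that second 2 GB FCD would differ from the first only in name and size, along these lines (the claim name here is my own choice):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: powermax-pvc-2          # hypothetical name for the second claim
  labels:
    app: vVols_FCD
spec:
  storageClassName: powermax-sc   # reuse the same Storage Class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2G               # the 2 GB FCD
```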
Well I hope that gives you a start if you want to look at vSphere 7 with vVols, CNS and the Dell EMC PowerMax. In the future I may try to do some more work with this and perhaps go through the lifecycle of the FCD.