In this post I’ll cover both the vSphere CSI and the PowerFlex CSI. Since both are utilized in a VMware environment, let’s start there.
vSphere and Kubernetes
VMware has made Kubernetes, or K8s, an integral part of its vSphere solution. Whether as part of vSphere in the form of Tanzu (with or without VCF), or simply as a vanilla implementation on any vSphere cluster, K8s seems to be everywhere. How prevalent it is in production environments is a matter for the analysts, but VMware's roadmap is rife with it, so it is best to understand how it can be used with your storage.
As I am sure you are aware as a vSphere user, VMware offers its own software-defined storage solution known as vSAN; however, most customers have other storage solutions serving file and block to their environments. VMware had to find a way to allow customers to consume both kinds of storage while providing automation and lifecycle management. From various projects was born CNS, or Cloud Native Storage. VMware offers this helpful infographic which shows the CNS components. Essentially, VMware uses a CSI, or Container Storage Interface, to provision storage from various sources. A CSI driver is an implementation of the CSI specification, which dictates how to connect storage to container orchestration tools.
The vSphere CSI utilizes Storage Policy Based Management (SPBM), or storage policies, via storage classes to create persistent storage for K8s. As you can see in the diagram, that storage could be anything from vSAN to vVols, VMFS to NFS. Despite the options, VMware's stated direction for using external storage in VMware solutions like K8s is vVols. Now I can guess what many are thinking: that VMFS is, well, 95 or more percent of the current market. True enough. But from VMware's perspective, vVols completely abstracts the vendor while providing a one-to-one relationship of vmdk to storage device. Since all storage vendors that support vVols adhere to the same VASA specification, VMware can be sure the commands issued against a registered provider will be carried out in the same manner regardless of the underlying storage system. That uniformity is ideal in solutions like Tanzu. Once vVols is configured, you won't need to map new volumes and create new datastores (though certainly you could add more storage containers if you wish). Whether customers will agree with the direction, I don't think anyone knows yet, but it is where VMware is going.
First Class Disks
Volumes created with vSphere CSI are known as First Class Disks, or FCDs. What makes FCDs special is that unlike most disks in vSphere, these disks do not have to be associated with a VM. When you think of a vmdk in vCenter, you do so based on the VM to which it is attached. Through editing the VM you can expand the disk, delete it, or even temporarily remove it and add it to another VM; however its life is associated with another object. This is not the case with FCDs. A First Class Disk has its own lifecycle which can be controlled outside of the VM. You can expand it, snapshot it, or delete it without it ever being tied to a VM.
Cloud Native Storage
Although VMware presents CNS as all those components in the image, it really has two parts, a vSphere volume driver for K8s (vSphere CSI), and the integration into vCenter shown here:
The volume driver consists of two pieces: a syncer and the vSphere Container Storage Interface (CSI) driver. The latest CSI driver, 2.x, supports provisioning vVols. The interplay between K8s, vSphere, the vSphere CSI, and, if you have it, a vendor CSI like PowerFlex, can be complicated. You must follow a compatibility matrix to ensure you run the correct vSphere CSI and PowerFlex CSI versions. In my environment I use Kubernetes 1.21 and the latest vSphere 7 to accommodate my PowerFlex CSI. This tied me to the vSphere CSI 2.5.x version, even though 2.7 is the latest release. Each vSphere CSI release has a minimum and maximum supported K8s version. You can find the compatibility matrix for the 2.x CSI driver here.
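To make the min/max idea concrete, here is a tiny shell sketch of that compatibility check. The only pairing taken from this post is vSphere CSI 2.5.x with K8s 1.21; the upper bound of 1.23 is an assumption for illustration, so always consult VMware's published matrix for real deployments.

```shell
# Illustrative compatibility check. The 1.21-1.23 range for CSI 2.5 is an
# assumed example (only 2.5.x + K8s 1.21 is confirmed above); check the
# official vSphere CSI compatibility matrix before relying on it.
csi_supports_k8s() {
  local csi="$1" k8s="$2" min max lowest highest
  case "$csi" in
    2.5) min="1.21"; max="1.23" ;;   # assumed supported range
    *)   echo "unknown CSI version"; return 1 ;;
  esac
  # sort -V orders version strings numerically, so k8s is in range when
  # min is the lowest of (min, k8s) and max is the highest of (max, k8s)
  lowest=$(printf '%s\n%s\n' "$min" "$k8s" | sort -V | head -n1)
  highest=$(printf '%s\n%s\n' "$max" "$k8s" | sort -V | tail -n1)
  if [ "$lowest" = "$min" ] && [ "$highest" = "$max" ]; then
    echo "supported"
  else
    echo "unsupported"
  fi
}

csi_supports_k8s 2.5 1.21   # prints "supported"
csi_supports_k8s 2.5 1.24   # prints "unsupported"
```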
CNS controls all aspects of volume provisioning with the storage control plane. It handles the entire lifecycle – creation, deletion, etc., including snapshots, clones and general health. One concept that is essential to this model is that these volumes are independent of the lifecycle of VMs. This is only possible with the aforementioned FCDs.
I want to go through two examples in the following sections. The first is provisioning with the vSphere CSI using Virtual Volumes on the PowerFlex. The second is provisioning with the PowerFlex CSI which completely removes VMware from the equation.
vSphere CSI and CNS
Let’s look at how to provision a FCD with the CSI driver into a PowerFlex Virtual Volume datastore. The way we are going to do this is:
- Create a storage policy in VMware that is associated with our vVol datastore.
- Create a storage class that references the storage policy.
- Create a pvc that calls the storage class.
I have a vanilla K8s environment. It's a very simple setup: a single cluster with three nodes (master 0211 and two workers, 0212 and 0213).
Below I’ve listed all the running pods and highlighted the vSphere CSI containers in black and the PowerFlex CSI in red, since I’ll cover that next. The pending state for the vSphere CSI controllers on the two worker nodes is expected.
Now a quick look at my vVols setup on the vCenter. I’ve registered the PowerFlex VASA Provider (I have a single node configuration) and created a new vVol datastore vVol_PowerFlex which is currently empty save for the HA file (as I’m running vSphere HA).
When creating storage policies, PowerFlex offers the following profiles you can utilize:
You’ll see them when choosing the PowerFlex provider (com.emc.scaleio.vasa) in VMware:
My environment can only satisfy the Gold Tier so my policy reflects that.
So now we are good to create the two manifests we need: the StorageClass manifest (YAML) and the PVC manifest.
I am going to use Dynamic Volume Provisioning, which basically means I am just going to create a FCD. There are other provisioning options (e.g., wait until requested), but this one serves my demo purpose. So my first file creates the StorageClass. The name is arbitrary (though it must be lower case and has character restrictions), but you will use it in the next file. The provisioner is the CSI driver, and finally I need to pass the storage policy name. Be sure the policy name is correct. Be aware the formatting below may not translate well with copy/paste.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerflex-vvol-sc
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "Gold-PowerFlex"

root@dsib0211:~/csi-scripts# kubectl create -f vvol-pflex-sc.yaml
storageclass.storage.k8s.io/powerflex-vvol-sc created
Once the StorageClass is created, you can execute the manifest for the creation of the volume. The most important field in this file is storageClassName. Be sure it matches the name in your StorageClass manifest, as mine does. I also added a labels field which will add a description we can see in vCenter. The pvc will be a 5 GB volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: powerflex-vvol-pvc
  labels:
    app: vVols-FCD
spec:
  storageClassName: powerflex-vvol-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5G
I can run a describe on the pvc and see the creation happen almost immediately:
So what objects are now in our previously empty vVol datastore? Well, a couple of things. First, there is a new directory, catalog. The catalog contains the metadata location information of the FCDs. Makes sense. There is only one catalog folder per vVol datastore, so the next time we create a FCD it only gets updated. The second folder is fcd. This is where, unsurprisingly, the FCD vVol is stored.
If we look into the fcd folder, we can see that our persistent volume request created two files: the first is the FCD itself (vmdk) and the other is its metadata, or in K8s parlance, the sidecar (vmfd).
On the PowerFlex we have four vVol devices created for this initial FCD. The two 4 GB volumes are for the catalog folder and the fcd folder. The other two are the FCD (~5 GB) and the sidecar (~1 MB).
Finally, let’s take a look at the CNS integration in vCenter. If I navigate to the Monitor tab of the datastore my new FCD is listed, along with the label I provided for reference.
If you click on the detail box next to the FCD you can get some more information about the volume.
Note that every subsequent persistent volume I create will generate only two new vVols: the FCD and the sidecar. The vVol that represents the catalog information will simply be updated. For example, I’ve highlighted a second 3 GB vVol and the two volumes on the PowerFlex.
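The accounting above can be sketched as a tiny formula, based on my reading of the example: the catalog and fcd folders each consume one config vVol once, and then every FCD adds two vVols (the data vVol plus its sidecar).

```shell
# Sketch of the vVol count model inferred from the example above:
# 2 one-time config vVols (catalog + fcd folders) plus 2 per FCD.
vvols_for_fcds() { echo $(( 2 + 2 * $1 )); }

vvols_for_fcds 1   # first FCD  -> 4 vVols on the array, matching the screenshot
vvols_for_fcds 2   # second FCD -> 6
```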
So with the vSphere CSI covered, I want to quickly cover the vendor CSIs and demonstrate creating a pvc with the PowerFlex CSI.
Storage Vendor CSIs
While the vSphere CSI doesn’t know or care about the underlying storage, vendor CSIs are designed to provision only from their own storage. By design, these CSIs are not part of VMware’s CNS; however, they can co-exist in the same K8s environment. With the vSphere CSI it is not possible to provision a device directly from the PowerFlex array to your cloud native application. Any vSphere CSI-provisioned device from PowerFlex is either going to be a virtual volume, as shown, or a vmdk in a VMFS or NFS datastore. If you want to map a device directly to the OS and make it available to Kubernetes, you need to use the PowerFlex CSI. So let me show you what that looks like.
We are going to create and run the same objects we did for the vSphere CSI: a storage class and a pvc. For the storage class with the PowerFlex CSI, you’ll note the provisioner is different than the vSphere CSI, and instead of a storage policy I set a storagepool and systemID, which inform K8s where the pvc will be created.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerflex-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi-vxflexos.dellemc.com
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  storagepool: sp1
  systemID: 2ededaf70b9d0e0f
volumeBindingMode: Immediate

root@dsib0211:~/csi-scripts# kubectl create -f pf-default-sc.yaml
storageclass.storage.k8s.io/powerflex-sc created
Now for the pvc manifest. Note that unlike vVols, when provisioning directly from the PowerFlex, best practice is to create volumes in 8 GB increments. If you choose a different size, it will automatically be rounded up to the nearest increment. You will also get a warning from K8s saying it failed to provision the volume with an invalid argument, even after recording that the provisioning was successful. Basically the warning means it could not use your requested size, but it still provisioned a similarly sized device:
Normal   Provisioning           31s (x2 over 33s)  csi-vxflexos.dellemc.com_vxflexos-controller-768b9687c9-c6qx2_1f4e83ca-4097-42d0-a27b-fner is provisioning volume for claim "powerflex/pflex-pvc"
Normal   ProvisioningSucceeded  31s                csi-vxflexos.dellemc.com_vxflexos-controller-768b9687c9-c6qx2_1f4e83ca-4097-42d0-a27b-fisioned volume k8s-a6a4b326d9
Warning  ProvisioningFailed     31s                csi-vxflexos.dellemc.com_vxflexos-controller-768b9687c9-c6qx2_1f4e83ca-4097-42d0-a27b-fon volume with StorageClass "powerflex-sc": rpc error: code = InvalidArgument desc = Requested System 2ededaf70b9d0e0f is not accessible basty data, sent by provisioner
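The rounding behavior described above is simple enough to sketch in shell: round the requested capacity (in GiB) up to the next multiple of 8. This is just an illustration of the behavior, not PowerFlex code.

```shell
# PowerFlex allocates in 8 GiB increments, so a request is rounded up to the
# next multiple of 8 (sizes in GiB). Integer ceiling: (n + 7) / 8, times 8.
round_to_8gib() { echo $(( (($1 + 7) / 8) * 8 )); }

round_to_8gib 5    # -> 8  (a 5G request becomes an 8 GiB device)
round_to_8gib 8    # -> 8  (exact multiples are untouched, hence the best practice)
round_to_8gib 12   # -> 16
```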
But back to the manifest where we will use an 8 GB device to avoid this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pflex-pvc
  namespace: powerflex
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: powerflex-sc
It creates successfully with no warnings.
And now the volume on the PowerFlex:
Note the volume is not mapped to any SDCs since I am not yet using it with an application. But if I then create an app, say an nginx pod, I can tell it to use the available pvc and it will be mapped. I execute the following manifest, telling it to consume the pflex-pvc we created.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: powerflex
  labels:
    name: nginx
spec:
  containers:
    - image: launcher.gcr.io/google/nginx1
      name: nginx
      ports:
        - containerPort: 80
          protocol: TCP
      volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: pflex-pvc
  volumes:
    - name: pflex-pvc
      persistentVolumeClaim:
        claimName: pflex-pvc
Here we see the nginx pod was created on my worker node 0213. The pvc we previously created has been mapped to the node and consumed by the pod.
By the way, if you are unfamiliar with vendor CSIs in VMware environments, know that when a volume is provisioned and then mapped to a node, VMware has no knowledge of that volume at all. It doesn’t show up as a vmdk, nor as an RDM on the virtual machine. It is directly attached to the OS. With PowerFlex, the SDC facilitates this; other Dell arrays like PowerMax use iSCSI with VMs and their CSI.
With any luck this all made sense to you. The use case will generally dictate which CSI is more appropriate in your particular environment, but as I’ve demonstrated there is no harm in configuring both so they are always available.