I’m not sure this topic requires a full post, but if you don’t work with NVMeoF technology much, you may assume it is just a different way to connect to your storage and that everything else functions pretty much the same as it does with FC/iSCSI connectivity. Spoiler: it doesn’t. So here is a short dive into NVMeoF and SRDF.
Host vs PowerMax Connectivity
The majority of the confusion I see revolves around the difference between how the hosts talk to the PowerMax and how the PowerMax talks to another PowerMax. It’s probably best to start with a simple diagram I put together showing a typical production/disaster recovery solution using SRDF. The two colors to concentrate on here are the orange and green network lines.
At the top we see the NVMe/TCP connectivity between our VMware ESXi environment and the PowerMax. I’ve covered how that connectivity is set up and how devices are accessed here. As you’ll recall, NVMeoF is a set of specifications that detail how NVMe devices are to be accessed in a manner equivalent to traditional SCSI (I’ll use the term SCSI to represent FC or iSCSI connectivity). On the PowerMax 2000/8000 you have FC-NVMe instead of FC, and on the PowerMax 2500/8500 you have NVMe/TCP instead of iSCSI. (There are other IP-based NVMeoF flavors like RoCE, but we’ll stick with what we offer.) Not all SCSI commands are ported to the new specs, so there is some translation, but the idea is equivalency.
Now we move to the green connectivity between the PowerMaxes. This connectivity has not changed: we offer GigE (IP) or FC. There is no NVMeoF for SRDF. Therefore how you access the devices through the host, whether traditional SCSI or NVMeoF, has no bearing on the replication. And, technically, one PowerMax does not dictate host connectivity to the other. This means I could use NVMeoF on my hosts in production for the R1 and SCSI on my hosts in the disaster recovery site for the R2…well, not for VMware, but hold that thought.
SRDF
When you go to set up SRDF on your PowerMax, you will continue to do it exactly as it has always been done. Replication is independent of host access. Even putting aside NVMeoF, think of eNAS or File where we use SRDF: those devices are accessed via NFS, not FC or iSCSI. vVols are another use case. When you protect a storage group that is presented via NVMeoF, you get the same wizard as with SCSI because RDF does not check, or care, how the devices are presented.
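The same holds if you script it. Below is a minimal sketch using PyU4V, Dell’s open-source Python client for the Unisphere for PowerMax REST API. The connection details, array IDs, and storage group name are placeholders, and the exact parameter names may differ in your PyU4V release, so treat it as illustrative rather than definitive. The point is simply that nothing in the call asks how the storage group is presented to hosts.

```python
# Minimal sketch: protect a storage group with SRDF via PyU4V.
# All values below are placeholders; verify parameter names against your PyU4V version.
import PyU4V

conn = PyU4V.U4VConn(
    server_ip='unisphere.example.com', port=8443,
    username='smc', password='smc',
    verify=False, array_id='000197900111')    # R1 array (placeholder)

# Nothing here cares whether 'NVMeTCP_SG' is presented over FC, iSCSI,
# FC-NVMe, or NVMe/TCP -- RDF pairs the devices either way.
conn.replication.create_storage_group_srdf_pairings(
    storage_group_id='NVMeTCP_SG',
    remote_sid='000197900222',                # R2 array (placeholder)
    srdf_mode='Asynchronous',
    establish=True)

conn.close_session()
```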
Metro
However, there is one SRDF mode which you cannot use with NVMeoF: Metro. Returning to the specs, the consortium (those who develop the specs, including Dell) has come up with something called “Dispersed Namespace” (recall a namespace is just a device with NVMeoF), which is what would permit active/active NVMeoF devices; but we and VMware are still in a development phase. That’s why my architecture diagram above does not include active/active. Metro is a special case because the host (ESXi in this case) must know how to treat the devices from both arrays as a single NVMeoF presentation. Since that capability doesn’t exist yet, you can’t do it. Now, does the wizard prevent this? As I’ve shown you, it doesn’t. SRDF in Unisphere works at the storage group level, so how the group is presented to a host is not relevant. Just keep this in mind. Your host vendor, e.g., VMware, will tell you this also, but save yourself the headache and don’t try it.
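Since neither the Unisphere wizard nor RDF itself will stop you, a small guardrail in your own tooling may be worth the effort. The sketch below uses PyU4V again; the method names and the 'modes' field reflect my reading of its documentation and may differ in your release, and the storage group name is a placeholder.

```python
# Hypothetical guardrail: before presenting an SRDF-protected storage group over
# NVMeoF, confirm none of its RDF groups are running in Active (Metro) mode.
# Method names and response fields are assumptions from the PyU4V docs; verify
# against your release before relying on this.
import PyU4V

conn = PyU4V.U4VConn(
    server_ip='unisphere.example.com', port=8443,
    username='smc', password='smc',
    verify=False, array_id='000197900111')

sg_id = 'NVMeTCP_SG'  # placeholder storage group name
for rdfg in conn.replication.get_storage_group_srdf_group_list(sg_id):
    details = conn.replication.get_storage_group_srdf_details(sg_id, rdfg)
    if 'Active' in details.get('modes', []):
        raise SystemExit(
            f'{sg_id} is protected with SRDF/Metro (Active) -- '
            'do not present it over NVMeoF.')

conn.close_session()
```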
Just to illuminate the reason for this, remember that Metro works by duplicating the external WWN across the two arrays. So when you present the device from either array, VMware sees the same WWN and thus considers it the same device. With NVMe/TCP, the external WWN is not exactly what VMware uses. Instead ESXi uses what is called an NGUID, or Namespace Globally Unique Identifier (which uses the EUI-64 based 16-byte designator format). This NGUID is the ID that follows the eui. identifier. For example, here is a device, 00167, that I presented through NVMe/TCP. Note the usual WWNs in the red boxes; the external one is the identifier Metro traditionally uses.
Once VMware sees the device, however, it is identified by the NGUID, and though the NGUID includes all the digits of the WWN, the string does not match what you see in Unisphere:
And therein lies the issue. That NGUID is what will need to be supported, both by VMware and by us, across arrays.
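Purely as an illustration, here is what that mismatch looks like if you compare the two identifiers in a few lines of Python. Both identifier strings below are made up, not from a real PowerMax device, and the actual relationship between the WWN and the NGUID is not a simple rearrangement you should rely on; the point is only that a direct string comparison, which is effectively what Metro depends on today with SCSI, fails with the eui. identifier.

```python
# Illustrative only: both identifiers below are made up, not from a real device.
# The point is that the string ESXi shows (eui.<NGUID>) is not the same string
# Unisphere shows for the external WWN, even though the same digits are in play.
unisphere_wwn = 'naa.600009700bc723ec2991004a00000167'   # hypothetical external WWN
esxi_nvme_id  = 'eui.2991004a00000167600009700bc723ec'   # hypothetical eui./NGUID

def hex_digits(identifier: str) -> str:
    """Drop the 'naa.' or 'eui.' prefix and normalize to lowercase hex."""
    return identifier.split('.', 1)[1].lower()

wwn, nguid = hex_digits(unisphere_wwn), hex_digits(esxi_nvme_id)

print(wwn == nguid)                  # False -- the strings do not match
print(sorted(wwn) == sorted(nguid))  # True in this made-up example -- same digits, different order
```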
VMware and NVMeoF Datastores
I want to come back to the ability to use NVMeoF on one member of the SRDF pair and SCSI on the other when presenting to hosts. I mentioned that technically it’s fine: presenting an R1 with NVMeoF in production and then the R2 with SCSI in DR will not change the contents of the device. For VMware, though, that isn’t the whole story. Because VMware uses that device ID when creating the datastore (remember RDMs are not supported with NVMeoF), and because some metadata may differ between a VMFS created on SCSI and one created on NVMeoF, VMware says you can’t do this. Forget SRDF for a minute: if you’ve read my posts on migration with NVMeoF and VMware, you’ll know I say explicitly that VMware only supports Storage vMotion to go from SCSI to NVMeoF. You can’t simply re-present a previous SCSI datastore with NVMeoF. Will it look like it works? Yes. Will you have issues? VMware says yes. SRDF follows the same principle. VMware may change their stance in the future (say, if a resignature is deemed enough), but until then you have to be consistent across environments.
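For completeness, here is roughly what that supported path, a Storage vMotion onto an NVMeoF-backed datastore, looks like when scripted with pyVmomi. The vCenter, VM, and datastore names are placeholders, and error handling and certificate validation are omitted, so treat it as a sketch rather than a production script.

```python
# Rough sketch: Storage vMotion a VM onto an NVMe/TCP-backed VMFS datastore with pyVmomi.
# All names are placeholders; older pyVmomi releases need an ssl context instead of
# the disableSslCertValidation flag.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='password',
                  disableSslCertValidation=True)
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Return the first inventory object of the given type with a matching name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.DestroyView()

vm = find_by_name(vim.VirtualMachine, 'app-vm-01')             # placeholder VM
nvme_ds = find_by_name(vim.Datastore, 'NVMeTCP_Datastore')     # placeholder datastore

# Relocate only the storage: move the VM's disks onto the NVMeoF-backed datastore.
spec = vim.vm.RelocateSpec(datastore=nvme_ds)
task = vm.RelocateVM_Task(spec=spec)

Disconnect(si)
```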
SRM, SRDF, and SRA
Nope. VMware has not enabled SRM with NVMeoF yet, so we can’t support it. VMware said it’s coming. So if you want to use NVMeoF with VMware in a DR environment, failover is manual, I’m afraid.
Future
I think we are likely to see a bunch of changes to NVMeoF in VMware’s next big release. I don’t think I’m surprising anyone by saying it should come by the end of this year if all goes to plan.