Reducing snapshot VM stun time: VVol VASA Provider upgrade

One of the features of Virtual Volumes (VVols) is that a VM snapshot uses the array’s snapshot technology instead of VMware’s host-based solution. If you use VMware with VMFS or NFS, you are probably quite familiar with the sight of VMware’s snapshot files: a .vmsd file, a couple of vmdks (the base file and the delta disk), and perhaps a .vmsn, which holds the active state of memory (VMware copies the memory by default). Whether you use VMFS/NFS or VVols, though, the same snapshot dialog box appears, with one checkbox for including the VM’s memory and another for quiescing the guest file system.


By default, VMware takes a copy of the memory. By doing so, VMware can return your VM to its exact active state (up and running) if you restore the snapshot. It does this by “stunning” the VM for a short time while the copy is made, to ensure consistency. A stunned VM holds I/O until VMware releases the stun, so any application accessing the VM will essentially pause during this time as well (I/O is held, after all, not rejected). (The other option, quiescing the guest OS, ensures that the guest file system is also consistent when the snapshot is taken.) Alternatively, if you uncheck the memory box, you get a crash-consistent copy of the VM. All this is true regardless of the type of VM storage you are using. That’s one of the great things about VVols: the way you interact with the VM is the same. Depending on your business needs, you will likely use one or the other type of snapshot for a particular VM. A production VM you want to back up may require the memory, while a development VM may need only the crash-consistent version.
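
If you script your snapshots rather than use the dialog box, the same two choices show up as flags. Here is a minimal Python sketch, assuming a pyVmomi `vim.VirtualMachine` object is passed in as `vm`; `SnapshotOptions`, `options_for`, and `take_snapshot` are hypothetical names of my own for illustration, while `CreateSnapshot_Task` with its `memory` and `quiesce` parameters is the vSphere API call behind the dialog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotOptions:
    memory: bool   # True: include the active memory state (the VM is stunned during the copy)
    quiesce: bool  # True: ask for the guest file system to be quiesced


def options_for(need_running_state: bool, quiesce_guest: bool = False) -> SnapshotOptions:
    """Map a business need to snapshot flags (hypothetical helper)."""
    return SnapshotOptions(memory=need_running_state, quiesce=quiesce_guest)


def take_snapshot(vm, name: str, description: str, opts: SnapshotOptions):
    """Start the snapshot; `vm` is assumed to be a pyVmomi vim.VirtualMachine."""
    return vm.CreateSnapshot_Task(
        name=name,
        description=description,
        memory=opts.memory,
        quiesce=opts.quiesce,
    )
```

For the development VM case above you would pass `options_for(False)` to get a crash-consistent snapshot, and `options_for(True)` for a production VM that must come back in its running state.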

As I mentioned, VVols use array technology to take the snapshot; in the case of the VMAX/PowerMax, that is SnapVX. Because of this, when we take a crash-consistent snapshot we don’t create any additional files for the vmdks, unlike VMFS. We can use a targetless snapshot, which is why crash-consistent snapshots are very quick. When a snapshot with memory is chosen, however, we need to create a snapshot with a target, because we need the active state of the memory. Unlike, say, a VMFS snapshot, we aren’t just creating a file on a file system for that memory dump. We have to create a new VVol on the array and then copy the memory to it. Furthermore, we need to do all this while the VM is stunned, again to ensure consistency. These extra steps can cause the VM stun to last longer than expected, and longer than some customers can tolerate: stuns of close to 30 seconds are possible.
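
If you want to see how long a VM was actually stunned, one place to look is the VM’s vmware.log. The sketch below assumes the log reports each stun with a line of the form `Checkpoint_Unstun: vm stopped for <N> us` (microseconds); treat that format as an assumption and check it against your own logs. The function names are hypothetical.

```python
import re

# Assumed vmware.log format: "... Checkpoint_Unstun: vm stopped for 1234567 us"
STUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_seconds(log_lines):
    """Return each reported stun duration in seconds, in log order."""
    durations = []
    for line in log_lines:
        match = STUN_RE.search(line)
        if match:
            durations.append(int(match.group(1)) / 1_000_000)
    return durations


def longest_stun(log_lines):
    """Return the longest stun in seconds, or None if none were logged."""
    durations = stun_seconds(log_lines)
    return max(durations) if durations else None
```

Feeding it the lines of a log file, for example `longest_stun(open(path))`, gives a single number you can compare across snapshots taken with and without memory.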

We recognize that this is an impediment to business, so development reworked the snapshot code to take better advantage of the array technology and has now delivered a new VASA Provider release. This release greatly reduces the VM stun time, though of course it cannot eliminate it. The new version, VASA Provider 9.0.0_502, is available on our support site.

Note that the upgrade is optional, but recommended if you are taking snapshots with memory. The changes will be incorporated in the next official VASA Provider release if you wish to wait. I’ve also written an official Dell EMC KB on this:

535871 : VVol snapshots with memory cause IO to the VM to pause for an excessive time on VMAX3, VMAX AFA and PowerMax https://support.emc.com/kb/535871
