PP/VE 6.1, SRDF/Metro and Autostandby

PowerPath Virtual Edition (PP/VE) 6.1 is now available for download from support.emc.com. This release adds Autostandby (ASB), a feature previously available on VPLEX Metro configurations. When enabled (the default), ASB determines which device paths have higher latency to storage and automatically assigns them standby status. This capability is geared specifically toward customers who want to run an SRDF/Metro configuration with cross-connect (x-connect) so that every host sees both arrays.

Before I get into how the feature works, here’s my PSA on SRDF/Metro with x-connect. While x-connect sounds great (all my hosts see both arrays), using it does have implications. X-connect adds complexity to the SAN design (bridging between sites) and can lead to zoning issues. Losing a data center will also generate SAN fabric events that impact the surviving hosts, and perhaps cause VMware HA to kick in. Odds are your array is not the component that is going to fail; rather, it will be the servers. A VMAX is incredibly resilient, and in the event of RAID failures in SRDF configurations it can even read from the remote array. Let’s face it, its resiliency is why you bought it, right? Keeping your two environments isolated will reduce complexity and minimize any chance of one impacting the other. Finally, and most importantly, you will be adding latency to your application unless your arrays are right next to each other, whether you are using active/active access or running after a failover. Of course you may be aware of all these things and still want to use it, so on to the feature.

I’m going to use my environment to demonstrate this, but first let me explain the PP/VE defaults for VMAX arrays. After installation, run powermt display options on the ESXi host to see them.

[Screenshot: powermt display options output]

Autostandby has two modes: proximity and IOs per failure. The first, proximity-based autostandby, is the default and is labeled asb:prox on a path. This is the mode most pertinent to SRDF/Metro, as it can determine which paths are remote and which are local. Proximity-based autostandby also has a secondary option you can supply called threshold. By default the threshold is zero (0), but it can be set to a value from 1 to 5000 microseconds. As long as the latency of at least one path is above the threshold, PP/VE will set asb:prox on the paths with the higher latency. If all paths are below the threshold, they will all be set to active and none to asb:prox. The idea is that if you can tolerate a certain latency in your x-connect environment, you might want all the paths to be active, and should therefore set the threshold higher.

NOTE: At the default value of zero, the threshold guarantees that if you are running cross-connect, the paths to one of your arrays will be set to asb:prox. If you want to be sure all paths are active, either disable autostandby (see below) or set the threshold to a value high enough that latency will not go beyond it. In a future version we are going to disable autostandby by default so that only customers who wish to use the feature will see paths automatically set to asb:prox.
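To make the threshold behavior concrete, here is a small Python sketch of the decision as described above. It is a toy model and not PowerPath’s actual algorithm: the function name assign_prox_modes, the simplification of one latency value per array, the latency numbers, and the choice to treat a latency exactly equal to the threshold as "not above" are all assumptions made for illustration.

```python
from typing import Dict


def assign_prox_modes(array_latency_us: Dict[str, float],
                      threshold_us: int = 0) -> Dict[str, str]:
    """Toy model of proximity autostandby (hypothetical, not PP/VE code).

    array_latency_us maps an array ID to an illustrative path latency in
    microseconds. Returns the mode the paths to each array would get.
    """
    if not 0 <= threshold_us <= 5000:
        raise ValueError("threshold must be 0 (default) or 1-5000 microseconds")

    # If no latency is above the threshold, every path stays active.
    if all(lat <= threshold_us for lat in array_latency_us.values()):
        return {array: "active" for array in array_latency_us}

    # Otherwise the lowest-latency paths stay active and the
    # higher-latency ones are set to asb:prox. Identical latencies
    # therefore leave everything active.
    lowest = min(array_latency_us.values())
    return {
        array: "active" if lat == lowest else "asb:prox"
        for array, lat in array_latency_us.items()
    }


# Illustrative latencies only: a local array and a more distant one.
latencies = {"535": 200.0, "536": 900.0}
print(assign_prox_modes(latencies))                     # default threshold of zero
print(assign_prox_modes(latencies, threshold_us=5000))  # tolerant threshold
```

With the default threshold the more distant array’s paths come back as asb:prox, and with a threshold above both latencies everything comes back active, which is the behavior the text describes.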

The second mode is IOs per failure, labeled asb:iopf. IOs per failure is most useful in situations where there is the potential for “flaky” paths. When a path accumulates ‘x’ failures, it goes into standby for the aging period (which can also be adjusted). This mode is also on by default.
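The IOs per failure idea can be sketched the same way. The Python below is only a toy model built from the one-sentence description above: the class name IopfPath, the failure_limit and aging_period_s fields, the reset of the failure count when the path returns to active, and the numbers used are all hypothetical. PowerPath’s real accounting may well differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IopfPath:
    """Toy model of one path under asb:iopf (hypothetical, not PP/VE code)."""
    failure_limit: int                     # the 'x' failures from the text (assumed value)
    aging_period_s: float                  # how long the path stays in standby (assumed unit)
    failures: int = 0
    standby_since: Optional[float] = None  # time the path entered standby, if any

    def record_failure(self, now: float) -> None:
        """Count a failure; enter standby once the limit is reached."""
        self.failures += 1
        if self.failures >= self.failure_limit and self.standby_since is None:
            self.standby_since = now

    def mode(self, now: float) -> str:
        """Return the path mode at time 'now' (seconds)."""
        if self.standby_since is not None:
            if now - self.standby_since < self.aging_period_s:
                return "asb:iopf"
            # Aging period has elapsed: return to active and start counting afresh.
            self.standby_since = None
            self.failures = 0
        return "active"


path = IopfPath(failure_limit=3, aging_period_s=60.0)
for t in (0.0, 1.0, 2.0):       # three failures in quick succession
    path.record_failure(now=t)
print(path.mode(now=2.0))       # inside the aging period
print(path.mode(now=62.0))      # aging period has elapsed
```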

To demonstrate how autostandby works I’ll use my VMware Metro Storage Cluster setup which has two hosts, each attached to its own array. The table below has all the information necessary. Note that I include the FA ports so that you know which array is local and which is remote.

Because array 535 is the R1, it supplies the external identity to device 44 on array 536, so both devices share WWN 60000970000196700535533030303341. I completed the steps required for x-connect (cabling, zoning, and masking; my environment is fairly simple) and then installed PP/VE 6.1 on my hosts. Once my devices are presented and claimed by PP/VE, it automatically determines which paths are local to the host and which are remote. Here is device emcpower46 on dsib1139. Note the FA ports and compare them to the environment above:


Notice that two of the device paths have been set to asb:prox (autostandby proximity) while the other two are active. This means that PP/VE tested the paths (with the default zero threshold), determined that the paths going to device 44 on array 536 had a higher latency than those going to device 3A on array 535, and set them accordingly for dsib1139. On dsib1140, emcpower19 has the opposite settings:

[Screenshot: emcpower19 paths on dsib1140]

During normal operation, only the two active, local paths will be used, but if there is a failure on array 535, IO will be redirected to the asb:prox paths. As I mentioned, if I felt that I could tolerate a higher latency on the remote paths, I could do the following. First, I would need to disable autostandby, as you cannot adjust the threshold on the fly:

powermt set autostandby=off trigger=prox class=symm

Then I would turn it back on, specifying the new threshold. When specifying the threshold, it is necessary to supply the class, which in our case means VMAX devices (symm):

powermt set autostandby=on trigger=prox class=symm threshold=5000

Then check on our previous emcpower46 device to see the new path settings:

[Screenshot: emcpower46 paths after setting threshold=5000]

Now all my paths show as active since the latency to both arrays was lower than the threshold. If at some point during normal business operation I believe that one of the arrays is causing latency issues, I can issue a reinitialize which will cause PP/VE to re-test the paths and set any to asb:prox that exceed the threshold:

powermt set autostandby=reinitialize trigger=prox class=symm
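Since the threshold cannot be changed on the fly, the off/on sequence is easy to get wrong. Here is a minimal Python sketch that only builds the command strings shown above and checks the 1 to 5000 microsecond range. The function names are hypothetical and nothing is executed; you would still run the printed commands yourself.

```python
from typing import List


def threshold_change_commands(threshold_us: int,
                              dev_class: str = "symm") -> List[str]:
    """Return the two commands from the post that change the proximity threshold.

    Hypothetical helper: it only formats strings and does not call powermt.
    """
    if not 1 <= threshold_us <= 5000:
        raise ValueError("threshold must be between 1 and 5000 microseconds")
    return [
        # Autostandby must be turned off before the threshold can change.
        f"powermt set autostandby=off trigger=prox class={dev_class}",
        # Turn it back on with the new threshold; the class is required.
        f"powermt set autostandby=on trigger=prox class={dev_class} "
        f"threshold={threshold_us}",
    ]


def reinitialize_command(dev_class: str = "symm") -> str:
    """Return the command that makes PP/VE re-test the paths."""
    return f"powermt set autostandby=reinitialize trigger=prox class={dev_class}"


for command in threshold_change_commands(5000):
    print(command)
print(reinitialize_command())
```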

For customers who have a true stretched cluster where the arrays are co-located, and who already know they do not wish to use autostandby, simply turn autostandby off as shown above or set the threshold to a high value. Note that although we recommend PP/VE, this type of environment would be perfectly fine with NMP (be sure that IOPS is set to the default of 1000 with RR for x-connect configurations). If you want to do standby with NMP, you would need to use the Fixed path policy. I wouldn’t recommend that configuration, and not simply because I work for Dell EMC. If you want to use x-connect with arrays at distance, use PP/VE for best results.

One final configuration I want to cover is a customer who has a stretched cluster (or co-located arrays) but nonetheless wishes to use the secondary array only as a standby. This environment is a bit tricky for autostandby because, as I have shown, PP/VE will set paths to asb:prox in the default configuration, and it may not pick the paths you intend: it may not choose the R2 device on the R1 side or the R1 device on the R2 side, or, if the latency is exactly the same, it will set all paths to active. The best way to handle this is to not use autostandby but rather set standby manually for the paths (turning off autostandby as shown above). The command to do that is:

powermt set mode=standby hba=<hba#> dev=all class=all

Here are a couple of examples:

rpowermt set port_mode=standby dev=vmhba2:C0:T0:L7 
 
rpowermt set mode=standby hba=1 class=symm

If you want to keep autostandby on, however, you can manually change the paths to active or standby but you must “force” it. In this screenshot I change the two autostandby paths to active.


Now I set the paths that were originally active to standby. Note that I did not have to force the change, as they are not in an autostandby state.

[Screenshot: changing the originally active paths to standby]

Be aware that the command I ran will change all paths that meet my criteria to standby. For the most part, that will be desirable. One final caveat: if you use autostandby and it assigns the paths improperly (R1 to autostandby), the paths to any non-Metro devices presented from the R1 array will also be set to autostandby. Since all paths have the same value, PP/VE will still use them despite their being asb:prox, but you would most certainly want to change them to active. Here is an example where my local, non-Metro device paths are now on autostandby. To fix this, simply change the paths to active, remembering to use the force option.

[Screenshot: local, non-Metro device paths set to autostandby]

Unfortunately, all the changes you just made manually do not persist through a reboot. There’s no arguing that’s a big flaw and a real pain point. The PP/VE team is currently working on making it possible to save the settings, but for now, if your host goes belly up, you’ll need to redo everything when it comes back.

As far as non-PP/VE documentation goes, I’ve been bogged down with the HYPERMAX Q3 release and ESA and VSI, so I have not had a chance to update either the TechBook or my Metro paper with PP/VE 6.1 information. As I’ve noted before, that is a big reason I do these blog posts: to get the information out ahead of the doc updates. I haven’t decided how I’ll incorporate this yet because I also have the new Metro 3-site configuration to document in the Metro paper (though that is documented in the SRA TechBook). In any case, the information is here for now if you need it.
