NMP latency best practices with VMAX and PowerMax

…continued from my previous post.

So at this point you’re probably asking: should I use this new feature, and if so, when? Well, there is a perfect use case for type=latency, and that is uniform vSphere Metro Storage Clusters (vMSC) using SRDF/Metro. Recall that uniform configurations, also known as cross-connect, zone and map hosts to both storage arrays in the environment. While our best practice is non-uniform configurations (other posts of mine explain why), some customers insist on uniform. These customers usually fall into one of two camps:

  1. They want failover to be seamless without the need of redirection (network re-routing), but do not necessarily require all paths to be active.
  2. They want to harness the power of both arrays, period.

For those customers in the first camp, we recommend PowerPath/VE using Autostandby, as it will always use local paths unless there is a failure. In essence you get the seamless failover of a uniform configuration while effectively running non-uniform. For customers in camp 2 we also recommend PowerPath/VE because of its advanced algorithms, which are more likely to select the local paths over the remote ones. Of course, the reality is that not every customer uses PowerPath/VE, and as NMP is the only other option, it’s important to provide recommendations for that pathing software.

NMP has always been a challenge in uniform configurations because there was no intelligence behind selecting a path. The default (and recommended) path selection policy (PSP) is Round Robin based on iops, which moves from path to path after a set number of IOs on each path, usually a single IO (the VMware and Dell EMC best practice). To help alleviate issues with NMP in a uniform vMSC, however, Dell EMC best practice is to set iops=1000 (the VMware default) instead of 1. This reduces contention between the arrays and improves performance, though not to the degree of PowerPath/VE, particularly with Autostandby.
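For reference, you can check which PSP and iops value a device is currently using with esxcli (a sketch; the naa identifier below is a placeholder for one of your own device IDs):

```shell
# Show the NMP configuration for a device, including the active PSP
# and its Round Robin settings (the naa ID is a placeholder).
esxcli storage nmp device list --device=naa.60000970000197900xxxxxxxxxxxxxxx

# Or query just the Round Robin policy configuration for that device.
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.60000970000197900xxxxxxxxxxxxxxx
```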

The contention issues are why the introduction of the new type=latency is particularly useful for customers running uniform vMSC NMP configurations. VMware will now test the paths, and if the arrays are any distance apart (setting aside co-located arrays), VMware will send most of the IO down the paths to the local array. In practice, it turns a uniform configuration into a mostly non-uniform configuration while still maintaining the remote paths. This sounds promising, right? But the proof of the pudding is in the eating, as usual. So let’s talk about results.
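To make the intuition concrete, here is a toy sketch (my own illustration, not VMware’s actual implementation) of why latency-weighted selection naturally favors local paths: if each path receives IO with probability inversely proportional to its sampled latency, local paths at a fraction of the remote round-trip time absorb most of the load.

```python
import random

random.seed(0)

def choose_path(paths):
    """Pick a path with probability inversely proportional to its
    sampled average latency. A simplified illustration only, not
    VMware's actual NMP latency algorithm."""
    weights = [(p, 1.0 / lat) for p, lat in paths.items()]
    total = sum(w for _, w in weights)
    r = random.uniform(0, total)
    for p, w in weights:
        r -= w
        if r <= 0:
            return p
    return weights[-1][0]

# Hypothetical sampled latencies: local paths ~0.5 ms, remote paths
# over distance ~1.5 ms (values made up for illustration).
paths = {"local_A": 0.5, "local_B": 0.5, "remote_A": 1.5, "remote_B": 1.5}
counts = {p: 0 for p in paths}
for _ in range(10000):
    counts[choose_path(paths)] += 1

local = counts["local_A"] + counts["local_B"]
remote = counts["remote_A"] + counts["remote_B"]
print(f"local share: {local / 100:.0f}%")  # roughly 75% of IO stays local
```

The exact split depends on the latency ratio, but the point is the same one the testing below bears out: the remote paths stay available, yet most IO lands locally.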

Co-located arrays

As a baseline, I wanted to start with a uniform vMSC with co-located arrays (same datacenter), with no latency between them. Even in this configuration, NMP Round Robin with iops=1 can cause contention. I ran lots of different tests comparing the iops and latency types. In addition to seeing how latency performed, it gave me a good opportunity to re-test the iops value. While we use VMware’s default value of 1000, there was some discussion internally that other values between 1 and 1000 might prove more beneficial. In fact, values beyond 10 iops generally produced similar results, all better than iops=1; however, in the end, as iops=1000 produced similar or better results than the others, the recommendation will remain 1000 so that customers do not have to make any changes (at least those running vSphere 6.5 or earlier).

Here is a graph of the results of testing the following 3 uniform vMSC setups with co-located arrays:

  • type=iops=1
  • type=iops=1000
  • type=latency
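For reference, here is roughly how each of these three setups can be configured per device with esxcli (a sketch; the device ID is a placeholder, and note that on vSphere 6.7 U1 the latency policy may first need to be enabled via the /Misc/EnablePSPLatencyPolicy advanced option):

```shell
DEV=naa.60000970000197900xxxxxxxxxxxxxxx   # placeholder device ID

# type=iops, iops=1
esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$DEV

# type=iops, iops=1000 (the VMware default)
esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1000 --device=$DEV

# type=latency (vSphere 6.7 U1 and later, default sampling parameters)
esxcli storage nmp psp roundrobin deviceconfig set --type=latency --device=$DEV
```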

I’ve included the total IOPS in the legend for completeness.

The results clearly show that, as expected, iops=1 performs the most poorly, both in terms of total IOPS and response time, while iops=1000 and latency are very close in performance on both counts. It appears that iops=1000 performs a fraction better than the latency setting, which I think is expected: there is no latency between the arrays, so even the minor overhead of the latency algorithm can add something over the simple path counting of the Round Robin iops setting.

Distance simulator

I was then fortunate enough to get my hands on an environment with a distance simulator mimicking a 50 km distance between the arrays (we support about 100 km on average). This type of uniform vMSC setup is one that can cause real contention and loss of performance no matter what the iops setting is, because invariably some IO requests are going over distance. As I noted, it is best if hosts use their local array, as in non-uniform configurations. I ran more detailed testing in this environment than with the co-located arrays, as I was far more interested in how type=latency would respond to the distance. Also note that because the environments themselves were different (e.g. arrays, switches), I do not intend a comparison of these results with the co-located ones. I purposely used different Iometer tests, too.

I began with another baseline test. For this one, I compared running our best practice of type=iops with iops=1 and only local paths (non-uniform) against type=latency with both local and remote paths (uniform). In essence, what I wanted to see was whether VMware would choose the local paths over the remote paths based on the latency algorithm. The results were very encouraging. You can see in the graph below that NMP with type=latency in a uniform vMSC generates the same IOPS as NMP with type=iops in a non-uniform vMSC.

These baselines were enough to convince me of the benefit, but I also wanted to run a true uniform vs uniform to see how much type=latency would help in comparison to type=iops. For this test I used multiple VMs to generate more load.

First I started with type=iops and tested different values between 1 and 1000 to see if 1000 was still the preferred value, which incidentally it was (yay for consistency). I then moved on to testing type=latency against iops=1000. All results were very encouraging for VMware’s new type. As the graph here illustrates, type=latency had better response times and a significant IOPS benefit (shown in the legend) over iops=1000, to the tune of about 14% more IOPS.

I’m enough of a true believer at this point to change our best practice for NMP with uniform vMSC from type=iops with iops=1000 to type=latency (using the defaults) if you are running vSphere 6.7 U1. I will update my documentation over the next couple of months to reflect this change. I’m also in the middle of updating the VMware KB article for vMSC with SRDF/Metro, because in general it is horribly out of date, and will add this.
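If you want devices to pick this up automatically rather than setting each one by hand, a SATP claim rule along these lines should work (a sketch to verify in your own environment; it only affects devices claimed after the rule is added):

```shell
# Claim rule so Symmetrix (VMAX/PowerMax) devices default to Round Robin
# with the latency policy. Existing devices must be reclaimed (or the
# host rebooted) before the rule takes effect for them.
esxcli storage nmp satp rule add --satp=VMW_SATP_DEFAULT_AA --vendor=EMC \
  --model=SYMMETRIX --psp=VMW_PSP_RR --psp-option="policy=latency" \
  --description="VMAX/PowerMax RR latency"
```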

Non-uniform, non-vMSC

OK, so how about non-uniform vMSC environments, or just regular VMware environments that use NMP instead of PowerPath/VE? Should you be using type=latency there? Right now we are not recommending latency as the default type for all NMP paths. I have done some testing and generated some numbers in different environments, but everything is preliminary, and it would be premature to advise a blanket default change other than in uniform vMSC (which was my first concern). As I explained, when using type=iops there is no algorithm involved, no extra code to navigate, and for now iops=1 is still going to be the best practice. I will add, though, that type=latency can benefit an environment with frequent storage network congestion or events like flaky paths. I have no problem advising that it is perfectly fine to use type=latency in those circumstances, because even if there is a tiny overhead, it would be outweighed by the gain in IOPS. I would make an exception for target (e.g. FA queue) congestion. If that is your issue, definitely get Dell EMC involved to help resolve it rather than “pathing” around it. There are better resolutions on the array that will increase throughput.

As I continue to test, if I become convinced that type=latency proves as good as iops=1 in most environments, I’ll update my findings and probably update our default selection in the VMware ESXi code.

