There is already a 500-page TechBook on SRDF with SRM, so it might seem a bit redundant to cover this topic again, and perhaps it is. But having worked with customers and support over many months (and years), I see a few issues recur. That’s usually the point where I figure a post might help, just to emphasize what is already in the documentation, because you can’t be faulted for not reading it cover to cover (honestly, I try to avoid it, too).
Our SRDF SRA version 5.8 and higher supports most 3-site SRDF topologies. This includes both Star (3 sites with interwoven relationships) and non-Star. Starting with SRA 6.2 we also support SRDF/Metro with 3 sites in non-Star. Putting aside Star, which you really want the TechBook for, I want to focus on a typical 3-site SRDF SRM environment.
A non-Star 3-site environment consists of 3 VMAX arrays: let’s call them VMAX A, VMAX B, and VMAX C. SRDF is configured first between VMAX A and VMAX B, and pairs (R1, R2) are set up to replicate synchronously (SYNC) or asynchronously (ASYNC). (The SRA does not support adaptive copy.) Once this first relationship exists, a second replication is configured on the same pairs. There are two types of supported replication: concurrent and cascaded. A concurrent setup replicates the R1 to the VMAX C, while a cascaded setup replicates the R2 to the VMAX C. Here’s a mock-up of the 2 different types.
There are some important differences I need to point out between a regular 3-site environment and one that will use SRM:
- Note that the only supported replication to the 3rd site is asynchronous. You cannot use adaptive copy or synchronous. The SRA will not work with either.
- In concurrent setups, the R1 becomes an R11; in cascaded setups, the R2 becomes an R21.
- Due to the way the SRA was coded, in cascaded setups the VMAX A must see the VMAX C (this is not required outside of SRM). In other words the arrays must be zoned and an SRDF group (empty) created between them. If this is not done, VMAX A and VMAX C will not be available as an array pair to enable in SRM. For many customers this is not an option and unfortunately until this issue is addressed in a future SRA, a concurrent setup is the only choice.
- SRDF/Metro is supported with 3-site but still has this cascaded requirement.
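The caveats above can be sketched as a small pre-flight check. This is only an illustration of the rules in this post: the `ThreeSiteConfig` class, its field names, and both functions are hypothetical and are not part of the SRA or Solutions Enabler.

```python
from dataclasses import dataclass

# Hypothetical model for illustration only; nothing here is an SRA API.
@dataclass
class ThreeSiteConfig:
    topology: str                      # "concurrent" or "cascaded"
    first_leg_mode: str                # VMAX A -> VMAX B: "sync", "async", or "metro"
    third_site_mode: str               # replication mode of the leg to VMAX C
    a_to_c_group_exists: bool = False  # (empty) SRDF group between VMAX A and VMAX C


def device_roles(topology):
    """Return the SRDF device role seen on each array for a topology."""
    if topology == "concurrent":
        # The R1 on VMAX A replicates to both B and C, so it becomes an R11.
        return {"A": "R11", "B": "R2", "C": "R2"}
    if topology == "cascaded":
        # The R2 on VMAX B also replicates onward to C, so it becomes an R21.
        return {"A": "R1", "B": "R21", "C": "R2"}
    raise ValueError(f"unknown topology: {topology}")


def check_for_srm(cfg):
    """Return a list of problems that would keep this setup from working with SRM."""
    problems = []
    if cfg.topology not in ("concurrent", "cascaded"):
        problems.append("topology must be concurrent or cascaded")
    if cfg.first_leg_mode not in ("sync", "async", "metro"):
        problems.append("first leg must be sync, async, or metro (no adaptive copy)")
    if cfg.third_site_mode != "async":
        problems.append("replication to the 3rd site must be asynchronous")
    if cfg.topology == "cascaded" and not cfg.a_to_c_group_exists:
        problems.append("cascaded needs VMAX A zoned to VMAX C with an SRDF group")
    return problems


if __name__ == "__main__":
    cfg = ThreeSiteConfig("cascaded", "sync", "async", a_to_c_group_exists=False)
    for problem in check_for_srm(cfg):
        print("PROBLEM:", problem)
```

An empty list from `check_for_srm` means only that the configuration passes the caveats listed here; it says nothing about the rest of the SRM setup.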
These caveats noted, what usually trips up customer configurations is how to configure SRM differently depending on whether the VMAX B or the VMAX C is the recovery site. So here is a quick rundown of the differences.
First, the SRM protection site must be the R1 or R11. Now some customers will have some R1s on both the VMAX A and VMAX B and use different protection groups, and that’s fine, but the majority will use each array for only one purpose. My point is more that you can’t use an R2 as the starting point of the protection. So in our picture above, I can’t set up SRM to fail over from VMAX B to VMAX C (a common mistake). I think where it can get most confusing is with SRDF/Metro, where you have an active-active configuration, so it might appear that it makes no difference which device you use at the protection site, but it does. You must use the R1. At this point it might be useful to mention that this does not impact the paths you present to the hosts. In particular, if you run a cross-connect SRDF/Metro you will have both R1 and R2 presented to the same host(s). That’s perfectly fine, but it does not change the requirement that you use the R1 in configuring SRM.
Now that we’ve established the R1 is the protection site, or the VMAX A in our picture, what are my options with SRM in 3-site? The answer is you have 2 options. The first is to set up a typical R1 to R2 failover. In this configuration you are designating the VMAX B as the DR site and the replication is either asynchronous or synchronous. Basically you are configuring the environment as if the VMAX C was not there. If you ever wanted to fail over to the VMAX C you would do so manually. The other option is to set up SRM to fail over to the VMAX C. This second option is where problems usually arise, so I’ll expand.
The most important step to take when you want to use the VMAX C as the DR site is to modify the global options file (EmcSrdfSraGlobalOptions.xml) that the SRA uses to determine its behavior. There is a global setting created just for this scenario called FailoverToAsyncSite. By default this is set to “No”, but if you want to fail over to the VMAX C you must set it to “Yes”; furthermore, you must set it on both the protection and recovery sites. This should be done before you even start configuring SRM. If you don’t do this you’ll never get it to work.
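Since the setting has to match on both sites, a quick check of each site's copy of the file can save some head-scratching. The sketch below assumes the option is stored as an XML element named FailoverToAsyncSite whose text is Yes or No; that layout is my assumption for illustration, and the function names are hypothetical, so compare against your actual EmcSrdfSraGlobalOptions.xml before relying on it.

```python
import xml.etree.ElementTree as ET

OPTION = "FailoverToAsyncSite"  # option name from the SRA global options file


def failover_to_async_site(xml_text):
    """Return True for Yes, False for any other value, None if the option is absent.

    Assumes (not confirmed here) that the option is an element whose tag is
    the option name and whose text is the value.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter(OPTION):
        return (elem.text or "").strip().lower() == "yes"
    return None


def both_sites_ready(protection_xml, recovery_xml):
    """The option must be Yes on BOTH the protection and recovery sites."""
    return (failover_to_async_site(protection_xml) is True
            and failover_to_async_site(recovery_xml) is True)


if __name__ == "__main__":
    # Made-up file contents; the root element name is a placeholder.
    protection = "<Options><FailoverToAsyncSite>Yes</FailoverToAsyncSite></Options>"
    recovery = "<Options><FailoverToAsyncSite>No</FailoverToAsyncSite></Options>"
    print("Ready for failover to VMAX C:", both_sites_ready(protection, recovery))
```

In practice you would read the file contents from each SRM server and pass them in; the sample strings stand in for those two files.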
The second step concerns the array manager setup. Issues with this are usually the result of a misunderstanding of how and where devices are presented to hosts. In a 3-site configuration you will have either 2 or 3 vCenters. In a 2 vCenter setup, customers may choose to have no vCenter (or compute resources) at the VMAX B site (sometimes referred to as the “bunker” site). Since the VMAX B is not being used as a failover location, you really don’t need a vCenter there; but you can of course have one, which would be a 3 vCenter setup. In SRDF/Metro environments you may have only 1 vCenter for both VMAX A and B, and then the second at VMAX C. When you set up the array managers, then, you want to use a Solutions Enabler environment that sees only the VMAX A as local, and a Solutions Enabler environment that sees only the VMAX C as local. You do not have a Solutions Enabler environment for the VMAX B (or not one for SRM anyway). Your recovery site is the vCenter where the VMAX C devices (concurrent or cascaded) are presented. If this step is done properly you will be presented with the VMAX A to VMAX C array pair to enable. If you don’t see the array pair you expect, you have either hit the cascaded issue above, or you are not using the correct Solutions Enabler setup.
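To make the two failure cases concrete, here is a toy model of which array pairs would be offered. It is not how the SRA actually performs discovery; the function, its parameters, and the rule it applies (a pair is offered when an SRDF group connects an array local to the protection-side Solutions Enabler with an array local to the recovery-side one) are my simplification for illustration.

```python
def offered_array_pairs(srdf_groups, protection_local, recovery_local):
    """Toy model: list (protection_array, recovery_array) pairs to enable.

    srdf_groups      -- iterable of (array, array) tuples, one per SRDF group
    protection_local -- arrays the protection-side Solutions Enabler sees as local
    recovery_local   -- arrays the recovery-side Solutions Enabler sees as local
    """
    pairs = set()
    for x, y in srdf_groups:
        # An SRDF group connects two arrays, so try it in both directions.
        for src, dst in ((x, y), (y, x)):
            if src in protection_local and dst in recovery_local:
                pairs.add((src, dst))
    return sorted(pairs)


if __name__ == "__main__":
    concurrent = [("A", "B"), ("A", "C")]
    cascaded = [("A", "B"), ("B", "C")]          # no group between A and C
    cascaded_fixed = cascaded + [("A", "C")]     # empty A-to-C group added

    for name, groups in [("concurrent", concurrent),
                         ("cascaded", cascaded),
                         ("cascaded + A-C group", cascaded_fixed)]:
        print(name, "->", offered_array_pairs(groups, {"A"}, {"C"}))
```

In this model the cascaded case without an A-to-C group offers nothing, which mirrors the cascaded issue described earlier, and a recovery-side Solutions Enabler that also saw the VMAX B as local would offer an extra A-to-B pair you do not want.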
Here is a screenshot of my VMAX A (103), VMAX B (104), and VMAX C (062). Note in particular that the only pair I can enable is my VMAX A to VMAX C. Since I have no Solutions Enabler for VMAX B (104), I cannot enable that pair. This is as it should be.
If anything needs clarification feel free to drop me a comment. I hope this helps next time you need to configure 3-site SRDF with SRM.