Today we have some good news for those of our customers who use VMware SRM, the SRDF SRA, and the vCenter Virtual Appliance. Let’s start with some background information for context, though I am going to assume the reader is familiar with these technologies already. The EMC SRDF SRA is the Storage Replication Adapter that integrates with VMware SRM to automate testing and failover of your EMC SRDF environment. It is available for all flavors of the VMAX array, from our older VMAX2 line to the VMAX3 and VMAX All Flash. We support almost all modes of SRDF, including Metro and Star configurations. The SRDF SRA (going to use SRA from here on out) integrates seamlessly with SRM for failover operations. For testing operations, however, the SRA requires the use of XML files. These XML files tell the SRA that when a test is run on the recovery site, it should use certain devices as copies of the production environment. In its simplest form, a 2-site replication, I have an R1 volume locally (production site) and an R2 volume remotely (recovery site). The XML file would contain all the R2 devices, each paired with a device that can be used with our TimeFinder software (be it snapshot, clone, etc.). Now these XML files can be created manually or they can be generated and modified with the SRDF Adapter Utilities (SRDF-AU), a plug-in to the vCenter. Unfortunately that plug-in only works if your vCenter is on Windows. Since VMware has allowed the use of the VCSA with SRM for some time now, and is moving to it permanently in the future, it means if you use VCSA, you are modifying these files by hand. I’m not going into great detail here on the SRDF-AU because I’ve talked about it quite frequently on this blog, but modifying the files by hand can be tedious and is certainly more error prone than generating one with a utility. BTW all this is documented in the TechBook.
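To make the idea of the pairing file concrete, here is a minimal sketch of writing R2/target pairs out as XML. It is not the script discussed in this post. Apart from `<Version>`, SNAPVX, and NOCOPY, which come from this post, the element names, the function name `build_test_failover_xml`, the array ID, and the device IDs are assumptions for illustration only, so check the TechBook for the layout the SRA actually expects. The sketch uses Python 3's standard library rather than lxml so it runs without extra packages.

```python
import xml.etree.ElementTree as ET


def build_test_failover_xml(array_id, pairs, copy_type="SNAPVX",
                            copy_mode="NOCOPY", version="6.2"):
    """Return XML text pairing each R2 device with a TimeFinder target.

    Every element name except <Version> is an assumption made for this
    illustration, not a confirmed part of the SRA file format.
    """
    root = ET.Element("TestFailoverInfo")
    ET.SubElement(root, "Version").text = version
    copy_info = ET.SubElement(root, "CopyInfo")
    ET.SubElement(copy_info, "ArrayId").text = array_id
    ET.SubElement(copy_info, "CopyType").text = copy_type
    ET.SubElement(copy_info, "CopyMode").text = copy_mode
    device_list = ET.SubElement(copy_info, "DeviceList")
    # One entry per R2 device on the recovery array and its copy target.
    for r2_device, target_device in pairs:
        pair = ET.SubElement(device_list, "DevicePair")
        ET.SubElement(pair, "Source").text = r2_device
        ET.SubElement(pair, "Target").text = target_device
    return ET.tostring(root, encoding="unicode")


# Hypothetical array ID and device IDs, purely for demonstration.
print(build_test_failover_xml("000197900123", [("0A1", "0B1"), ("0A2", "0B2")]))
```

Doing this by hand for two pairs is no big deal. Doing it for hundreds of devices is where the typos creep in.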
So now let’s take a short commercial break to cover some important ass-pects (like mine). If you have never taken time to read my disclaimer at the top of the blog (and really, who does?), let me familiarize you with the boilerplate that many blogs focused on a particular company or technology use. What these disclaimers invariably point out is that while we (I) write about certain technologies, we (I) don’t in any way represent the company. In other words, Dell EMC does not see, read, or approve my blog posts. Yes, I work at that company, but anything I write here is my own, and any information I provide on this blog, be it steps, scripts, or personal documentation (unless said documentation is hosted by Dell EMC), is my own. That means if you open a support ticket with Dell EMC and say “well I was following steps or a script I found on Drew Tonnesen’s blog and things broke,” support is likely to say “only follow supported documentation.” I’m not saying of course that support won’t help you or that I give you bad information (because I don’t), but you do always want to use the supported documentation, and I frequently say that in my blog posts. So this is a reminder to you that what we are now going to talk about is not supported. You cannot open an SR about it, nor complain to Dell EMC. It is provided as is, and if you want to use it, go for it, but you do so at your own risk. OK glad we got that straight…
Back to the program. As I was saying, customers using VCSA and the SRA are in a bit of a bind. It’s not that there is any functional tie between the SRDF-AU and the SRA (the SRA works fine without it), but no one wants to modify files by hand. The first thing you will ask is why isn’t the SRDF-AU being ported over to the VCSA? The answer is it cannot be. It was written many, many years ago with old code and is so tied to Windows components that we would have to create a brand new plug-in, and while we may do that in the future, right now it isn’t in the cards for a variety of reasons I cannot go into. So what to do? Well, I’ve had enough customers *gently* emphasize that this is not acceptable for me to start thinking about a solution. And though the SRDF-AU has a fancy interface, in the end it is just some calls that generate an XML file. I was pretty sure there was an SDK on the VMware side and some code (SMI-S) we could leverage on the Dell EMC side to get the information we need. So I got to work…and called a friend in development and said you got someone who can code in Python? And my new best friend Sam Gass said yeah I can do that. Yeah, big surprise, I’m not a developer. PowerShell, PowerCLI, yeah maybe a little, but not Python. Anyway, Sam was the man and was able to wade through the SRDF-AU code to pull out some nuggets (and from what he told me it was like trying to get bubble gum out of hair). He leveraged that, and then I pulled in our SRA team, who helped with some important logic, and lo and behold Sam had a script that would generate the EmcSrdfSraTestFailoverConfig.xml file. I took that, did a bunch of testing, he tweaked it for me here and there so it did what I needed it to do, and now the script is solid. How did we end up with Python? Well, it can be run on most platforms, and since all we need as input is the vCenter IP and an SMI-S IP, it doesn’t matter at all if you have a VCSA or Windows vCenter.
Now a little more about the script. The design of the script is such that we expect it to be a starting point for customers to write their own code. It is not meant as a replacement for the SRDF-AU plug-in because it can only achieve one function of that plug-in: the generation of the test failover XML file. The SRDF-AU can configure all the options of the SRA as well as create consistency groups for your RDF devices if you need them. This means that if you need to use any of the other functions available in the SRA, such as Gold Copies, setting the parameters in the Global Option file, or using the masking capability, you must use the SRDF-AU or modify the files manually.
The script has some pre-requisites as well as important notes you should be aware of. Let’s start with the notes.
- Know Python. I learned enough in the past month to use it so you probably can too, but there are no tutorials with this script. If you want to use it, you’ll need to know Python or, like me, find someone who does and can explain it!
- It is only designed to be used in a 2-site configuration. It has not been tested with SRDF/Metro or any 3-site solutions.
- The SMI-S environment used must only see the recovery array as local, otherwise the script will fail.
- If the SMI-S environment has not been updated within the last 15 minutes, the script will fail. SMI-S will refresh every hour automatically, but if you need to refresh manually, execute TestSMIProvider and use the refsys command. The reason we did this is to ensure if you make last minute changes to the configuration (e.g. add pairs), SMI-S knows about it. If you have a static environment, you can easily change the time in the script to something longer so it doesn’t prevent you from running it.
- The script was tested with SRDF SRA 6.2, SRM 6.1, SMI-S 8.3, and vCenter 6.0. I suspect you can use an older environment but use the most recent SMI-S.
- We use SnapVX/NOCOPY as the replication type/mode. If you have to use emulation mode, you can adjust the script so the XML file that is produced is correct.
- Though the XML file is created automatically, you will still have to copy it to the appropriate directory on your SRM recovery site.
- The script is written to generate a file for version 6.2 of the SRA, i.e. it is hardcoded to write out: <Version>6.2</Version>. With future releases, you should adjust the version in the script, e.g. 6.3.
- The script will generate pairs for every device in your SRM environment. The nice thing about the SRDF SRA is that it doesn’t care if you have extra pairs in the Test Failover XML file. It only looks for the pairs it needs for the recovery plan. Now there is one drawback to this. If you have lots of R2 devices at the recovery site that are not used with SRM, the script is going to fail unless you have device targets presented to the vCenter for those devices. There are ways to get around this with a little thought but we leave that in your capable hands.
And now the pre-requisites:
- Python 2.7 with packages pywbem, pyVmomi, and lxml. Of course some Python installs might be missing other libraries, but generally this is what we found is required. BTW I used Ubuntu and had to install a lot more. Nice thing is Python will always tell you what you are missing when you run a script.
- Both the R2 devices and the target devices (SnapVX targets) must be presented to the recovery site. As I mentioned we can’t do automated masking. If the device is not seen by your hosts in the vCenter, the script will not find them.
- An SMI-S environment listening on a non-SSL port. By default, SMI-S 8.3 listens only on SSL 5989. If you need help modifying SMI-S to listen on non-SSL see this post.
- And finally, you will have to modify the script to put in the variables for the Recovery vCenter, SMI-S, copy type and the file name. Here is the top portion of the script that requires updating:
SMIS_IP = "http://192.168.1.1"
SMIS_PORT = "5988"
SMIS_USER = "admin"
SMIS_PASS = "#1Password"
VCENTER_IP = "192.168.1.2"
VCENTER_PORT = "443"
VCENTER_USER = "email@example.com"
VCENTER_PASS = "password"
COPY_TYPE = "SNAPVX"
COPY_MODE = "NOCOPY"
FILENAME = "EmcSrdfSraTestFailoverConfig.xml"
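Two of the items from the list above lend themselves to a quick pre-flight check: the 15-minute SMI-S freshness window and the required Python packages. The following is a rough sketch of how such checks could look, not an excerpt from the script. The names `is_stale`, `missing_packages`, and `MAX_AGE` are hypothetical, and I am assuming you already have the last refresh time in hand; how the real script gets it from SMI-S is not shown here. It is written for Python 3.

```python
import importlib.util
from datetime import datetime, timedelta

# Assumed default that mirrors the 15-minute window mentioned in the notes;
# lengthen it if your environment is static.
MAX_AGE = timedelta(minutes=15)


def is_stale(last_refresh, now, max_age=MAX_AGE):
    """Return True when the last SMI-S refresh is older than max_age."""
    return now - last_refresh > max_age


def missing_packages(names):
    """Return the subset of names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Fixed timestamps so the example is repeatable.
now = datetime(2017, 4, 1, 12, 0, 0)
print(is_stale(datetime(2017, 4, 1, 11, 50, 0), now))  # False (10 minutes old)
print(is_stale(datetime(2017, 4, 1, 11, 30, 0), now))  # True (30 minutes old)

# Import names of the three packages listed in the pre-requisites.
print(missing_packages(["pywbem", "pyVmomi", "lxml"]))
```

Passing `now` in as an argument instead of calling `datetime.now()` inside the function keeps the check easy to test, and changing the window is a one-line edit.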
I made a gif of the script in action. Click to enlarge in a new screen.
Without further ado, here is the download on GitHub: https://github.com/adahn6/sra-failover-script/blob/master/failover.py
That’s it. Though this script is not supported, Sam and I aren’t heartless 🙂 If you find something wrong with the script (say it doesn’t work right in your environment despite following all the pre-requisites), file an issue on GitHub, and when I can I’ll test your issue and in his spare time Sam will update the script, since he is gracious enough to host it there. Please refrain from reaching out to Sam directly; we promise we’ll see the issues. Alternatively, if you are a coding guru and modify the script to do more and wish to share it, I’d be happy to put a link to your GitHub in the post for others. As I mentioned in the beginning, hopefully a new plug-in is coming in the future, but I know there is a cool new change to the SRA functionality which I’ll actually demo at Dell EMC World if you are going.