PowerFlex Replication

I thought I’d take a couple of posts to cover PowerFlex replication, as it is such a critical component of most customers’ VMware environments. Having worked so long with PowerMax replication, I find the PowerFlex implementation quite nascent in comparison, as it only supports an asynchronous mode. In my experience, however, this is the normal progression of replication development. VMware’s virtual volumes (vVols), for example, also only support asynchronous, with plans to add other modes in the future. The reasons for starting with asynchronous are pretty obvious: there is no need to wait for an acknowledgement from the remote system, which would add latency; you can have an extended RPO (though some vendors reduce that to sub-second); and there is the chance to do write-folding, or bulking changes together before sending the packets over, thus reducing network traffic.

Having worked with a number of the software solutions for replication, I find PowerFlex to be similar to RecoverPoint (RP). They both use the target device directly for testing, utilizing snapshots, though PowerFlex cannot rewind like RP. Through the PowerFlex GUI, you can execute both test failover and failover operations, and it is these same operations that are utilized by the SRA in VMware SRM. But let’s leave SRM for the second post and just look at PowerFlex replication first. I’ll also only be discussing version 3.x.

PowerFlex Replication Architecture

Replication with PowerFlex is controlled by a package/process, the Storage Data Replicator or SDR. You are no doubt familiar with the SDS (Storage Data Server), the server process controlling disk access, and the SDC (Storage Data Client), the client process allowing you to map volumes to hosts. The SDR is another process, one that is responsible for replication. It is installed on the same hosts as the SDS, though it doesn’t need to be put on each SDS host (there’s a minimum number). The SDC communicates with the SDR, which splits/duplicates the IO for a replicated device so that the local SDS gets one copy and the other goes to the journal. The journal write is sent to the remote journal (with other writes), at which point the local journal copy of the write can be removed. The remote writes are sent to the remote SDR, which forwards them to the remote SDS for the disk write. The graphic below demonstrates this process.

Configure Replication

Before I talk about configuring replication, I do want to mention that the environment needs to be sized for the capability. Replication uses a journal, as I noted above, and there is a calculation that you should make for each application that will participate in replication to be sure enough space is reserved for the journal. There is also the network to consider, particularly if you are not going to separate out the replication traffic. My environment has a single network, which is bad practice for production but OK for a lab.
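As a rough illustration of that sizing exercise, here is a back-of-the-envelope sketch. It assumes the journal must absorb an application’s peak write rate for the longest replication outage you want to ride through. That rule of thumb, the helper name journal_gb_needed, and the sample numbers are assumptions for illustration only, not the official formula, so check the PowerFlex sizing guidance for your version.

```shell
#!/bin/sh
# Rough journal sizing sketch. Assumption: the journal must hold all writes
# generated at the peak rate for the longest outage you want to tolerate.
# journal_gb_needed is a hypothetical helper, not part of any PowerFlex tool.
journal_gb_needed() {
    peak_write_mb_s=$1   # peak application write rate, in MB/s
    outage_seconds=$2    # longest outage the journal should absorb, in seconds
    # MB/s * s = MB; divide by 1024 to get GB, rounding up.
    echo $(( (peak_write_mb_s * outage_seconds + 1023) / 1024 ))
}

# Example: 200 MB/s of peak writes and a one-hour outage (illustrative values).
journal_gb_needed 200 3600
```

Run the function once per application and add up the results to get a first estimate of the journal space to reserve.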

Setting up replication in PowerFlex is pretty easy. My environment is just two systems, but we do support fan-out and fan-in, though only ever two-site. There are two prerequisites: the first is to exchange root certificates between the systems, and the second is to pair the systems together, very much like you do with VMware SRM.

To exchange certificates, extract the root certificate on each system and add it to the other. On my first system, system_1, I log in and then extract the certificate:

scli --login --username admin --password
scli --extract_root_ca --certificate_file /tmp/system_1

Next, I copy the certificate to system_2 and then add it:

scli --login --username admin --password
scli --add_trusted_ca --certificate_file /tmp/system_1 --comment Site-2

Do the reverse for the other system: extract the root certificate on system_2, copy it to system_1, and add it there as a trusted CA.

The second prerequisite is to pair the systems, or peer systems as they are called. To do this, you need the ID of each system. This ID is displayed when you log in to the CLI:

[root@dsib1059 ~]# scli --login --username admin --password  
Logged in. User role is SuperUser. System ID is 5a1517e21f14200f
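If you would rather capture that ID in a script than copy it by hand, a small sketch like the following pulls it out of the login message. It only parses the text format shown above; the helper name extract_system_id is hypothetical, and the sample line is hard-coded so the snippet runs without a PowerFlex system.

```shell
#!/bin/sh
# Extract the system ID from the scli login message.
# extract_system_id is a hypothetical helper; it assumes the message format
# shown in this post ("... System ID is <hex id>").
extract_system_id() {
    printf '%s\n' "$1" | sed -n 's/.*System ID is \([0-9a-fA-F]*\).*/\1/p'
}

# Sample login message copied from the output above.
login_output='Logged in. User role is SuperUser. System ID is 5a1517e21f14200f'
extract_system_id "$login_output"
```

The function prints nothing if the message does not contain a system ID, so a script can test for an empty result before going on to pair the systems.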

Then log in to the PowerFlex GUI (aka Presentation server), and navigate to Peer Systems under the Protection category. Select Add and enter a name, the peer System ID, and the IP addresses to use for network traffic (note these IPs were set when you installed the SDR). I have the screenshots below:

And the result (note this screenshot is after I’ve added 2 consistency groups):

And really that’s the base setup. The next part is to add device pairs in a remote consistency group.

Consistency Groups

Consistency groups (or replication consistency groups – RCGs) are a logical grouping of devices that replicate together. All devices in the group work in concert for testing, failover, etc. Customers typically set them up based on application or perhaps purpose, e.g., development. Device pairs are added to an RCG where an initial synchronization is done before going into service. The RCG has certain attributes, like the RPO, but when setting up the device pairs there is a lot of flexibility such as using different storage pools in the same RCG. The one prerequisite is that the volumes must exist on both the local and remote systems. PowerFlex cannot automatically create them for you. So let’s create an RCG.

In the last image above you’ll see the ADD button, so start by clicking it. Then you can follow through the wizard I’ve included below. I name the RCG, set the source and target information, take the default RPO, and then add the replication pair. As I noted, the volumes need to exist already.

Once you ADD AND ACTIVATE, the pair will initially sync and will be ready when Consistency is set to Consistent as below:

With replication running, what are some things we can do with it?

Mapping

Let’s start with mapping. I did not present the volumes to any hosts prior to setting up the replication, but doing it in that order is not a requirement. First, I’ll map the local volume to my protection vCenter and create a datastore on it.

Now I’ll present the remote volume. When we run any tasks on the remote volume, such as test failover, the volume must already be available on the hosts or the operation will fail. There is no automatic presentation. Unlike the local volume, mapping of the remote volume must be done on the RCG screen.

Since the remote volume is write disabled, there are two options for Access Mode, Read Only or No Access. For our VMware environment this must be Read Only.

You will get a warning about this which is expected.

We can consider the replication setup and configuration complete. Let’s see what the options are now.

RCG Menu

There are many options at the RCG level. Here is the menu:

Most are self-explanatory, though the Freeze can be confusing. Freeze is similar to Pause. Pause prevents writes from being shipped over to the target, while Freeze ships the writes over but does not apply them to the volume. But we’re going to explore the Test Failover.

Test Failover

Test failover is just what it sounds like: the ability to take a copy of the data and test against it at the remote site. The way PowerFlex handles test failover is by using snapshots. No additional volumes are required because PowerFlex will simply change the pointer(s) of the read-only volumes to the snapshot, then make the volumes read/write. Note that while test failover is active, replication is paused. I thought I’d do this one as a video. There is no audio or callouts as it is so simple; these are the steps you will see:

  • Execute test failover on our test RCG
  • Run the datastore wizard in vSphere
  • Find the snapshot volume and mount it, assigning a new signature
  • <Testing would be done here>
  • Delete datastore (this one is important so that when the device moves from read/write to read only in the next step, you don’t get an inaccessible datastore)
  • End test failover

Test failovers are great for quick testing, maybe of a patch or a new application. They aren’t a good option for long testing, however, because the writes accumulate in the source journal (recall replication is paused), which means it continues to grow. Not only that, but once testing is complete, the target has to catch up. Therefore, for longer tests it is better to take a manual snapshot of the RCG and then map it to the hosts. I’ve included the steps below:

Using this method will allow replication to continue and prevents any impact on production volumes.
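To get a feel for why a long test failover is costly, the following sketch estimates the journal backlog built up while replication is paused and how long the target might take to catch up afterwards. It assumes a constant write rate, a fixed replication link rate, and a backlog that drains at the difference between the two. The helper names and all numbers are hypothetical, and real behavior (write-folding, bursty IO) will differ.

```shell
#!/bin/sh
# backlog_mb: data accumulated in the source journal while replication is paused.
# Hypothetical helper; assumes a constant application write rate.
backlog_mb() {
    write_mb_s=$1      # steady application write rate, in MB/s
    pause_seconds=$2   # how long the test failover keeps replication paused
    echo $(( write_mb_s * pause_seconds ))
}

# catchup_seconds: time to drain the backlog once replication resumes.
# New writes keep arriving, so only the spare link capacity drains the backlog.
catchup_seconds() {
    backlog=$1
    write_mb_s=$2
    link_mb_s=$3       # replication link throughput, in MB/s
    spare=$(( link_mb_s - write_mb_s ))
    if [ "$spare" -le 0 ]; then
        echo "never"   # the link cannot outpace incoming writes
        return 0
    fi
    # Round up to whole seconds.
    echo $(( (backlog + spare - 1) / spare ))
}

# Example: 50 MB/s of writes, a two-hour test, and a 125 MB/s link.
b=$(backlog_mb 50 7200)
echo "backlog: ${b} MB"
echo "catch-up: $(catchup_seconds "$b" 50 125) s"
```

With these illustrative numbers, a two-hour pause leaves well over an hour of catch-up, which is the kind of lag the manual snapshot approach avoids.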

Next Up

In the next post I’ll show how the underlying PowerFlex replication is used with VMware SRM.
