SRM SRA logs missing/not created

We’ve had a number of strange support cases where the SRDF SRA log files are not being generated at all. From what support has discovered, these fall into two categories. One is version-dependent, while the other can occur in any version (or with any SRA, for that matter).

  1. SRDF SRA 9.2.0.1: a hotfix version of the SRA fails to produce log files. My colleague Nick in support has seen, I think, 3 cases of this. Despite significant debugging, the solution remains elusive, and the customers had to upgrade to a different release to resume logging.
  2. SRA logs are not produced due to a full mount. Unlike #1, the SRM services themselves may also stop, depending on which mount fills.

Since #1 is ongoing, let’s look at how to resolve #2, which is far more common.

Out of space

Running out of space on an OS is a fairly common occurrence, though I would argue that VMware backed themselves into a corner by designing the SRM appliance and the log mount so small. The current deployment size for a standard (non-light) implementation is 20 GB across 2 disks (one 16 GB, one 4 GB). Compare that to a vCenter with 700+ GB of disks. More to the point, however, the log mount is only 380 MB. To me it seems foolish not to make the appliance and the log mount larger, especially since they could use thin vmdks and not even take up space in the datastore. VMware’s counterargument is that the SRA vendors are supposed to design their SRAs to cycle logs so that space is not a concern. But most vendors do not, so a full file system is possible if you do not manage your own logs.

Once the mount is full, what are the options?

  • Self-clean
  • VMware automation
  • Expand mount

Let’s look at each one.

Self-clean

Let’s start with the directory in question. The mount is /dev/loop0. VMware has allocated only 380 MB for the logs. This may seem sufficient, but VMware does not limit the size of the logs that SRAs produce, so filling the mount is very plausible.

So if you aren’t getting logs, check the space available on the mount. Likely you will find it full.
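A quick way to run that check from the appliance shell is sketched below. df accepts a mounted device node, so /dev/loop0 can be passed directly. The helper names usage_pct and check_mount, and the 90% default threshold, are assumptions for illustration; they are not part of anything VMware ships.

```shell
#!/bin/sh
# Print the used-space percentage (digits only) of the filesystem holding $1.
# $1 may be a directory or a mounted device node such as /dev/loop0.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage meets or exceeds a threshold.
# $2 is the threshold in percent (default 90, an arbitrary choice).
check_mount() {
    pct=$(usage_pct "$1")
    [ -n "$pct" ] || { echo "could not read usage for $1" >&2; return 2; }
    if [ "$pct" -ge "${2:-90}" ]; then
        echo "WARNING: $1 is ${pct}% full"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}

# Example (run on the SRM appliance):
#   check_mount /dev/loop0
```

A non-zero return from check_mount makes it easy to hook into whatever monitoring you already have.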

At this point, the user could choose to run a self-clean. Change into the directory of each image and simply delete the log files (rm -r *.log).
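If you would rather not change into each directory by hand, the same clean-up can be sketched with find. The function names and the example path are hypothetical; substitute the directory where /dev/loop0 is mounted on your appliance, and list before you purge.

```shell
#!/bin/sh
# List every *.log file under directory $1 with its size in KB, largest first.
list_sra_logs() {
    find "$1" -type f -name '*.log' -exec du -k {} + | sort -rn
}

# Delete every *.log file under directory $1 (the recursive equivalent of
# deleting *.log inside each SRA folder). Other files are left alone.
purge_sra_logs() {
    find "$1" -type f -name '*.log' -exec rm -f {} +
}

# Example; the path is a placeholder, not the real mount point:
#   list_sra_logs /path/to/sra/log/mount
#   purge_sra_logs /path/to/sra/log/mount
```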

Space reclaimed. The drawback to this method is that you will have to monitor the space regularly.

VMware automation script

Fortunately, VMware offers a script which will both clean up the files and use the logrotate Linux utility to keep the mount from filling. You can find this KB at https://kb.vmware.com/s/article/78289

The script, clean-sras-logs.sh, is stored in the /opt/vmware/bin directory with execute permissions set by root. Upon execution it will clean the logs and store them in a compressed file.

But better yet, you can set the script to run automatically. To execute the script each day (you can change the schedule below), take these steps:

  1. As root create an empty file in /etc/cron.d
    • touch /etc/cron.d/sras.cron
  2. Add the following line in the file:
    • 0 0 * * * root /bin/bash /opt/vmware/bin/clean-sras-logs.sh
  3. Restart crond:
    • systemctl restart crond
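The first two steps can also be scripted. This is a minimal sketch, not part of the KB: the function name install_sras_cron and its optional directory argument are assumptions, added so it can be tried against a scratch directory before touching /etc/cron.d.

```shell
#!/bin/sh
# Write the cron entry for the cleanup script into sras.cron.
# $1: cron directory (defaults to /etc/cron.d; writing there requires root).
install_sras_cron() {
    cron_dir="${1:-/etc/cron.d}"
    # Fields: minute hour day-of-month month day-of-week user command.
    # "0 0 * * *" means 00:00 every day; edit the fields to change the schedule.
    printf '%s\n' '0 0 * * * root /bin/bash /opt/vmware/bin/clean-sras-logs.sh' \
        > "$cron_dir/sras.cron"
}

# Example (as root on the appliance):
#   install_sras_cron && systemctl restart crond
```

Because the file is overwritten rather than appended to, running the function twice does not leave a duplicate entry.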

And that’s it, no concern that logs will fill the partition again since the script will keep things in check.

Expansion

If you prefer to keep your logs around, you can expand the log mount. VMware offers KB https://kb.vmware.com/s/article/92956 to resolve this. I’ll run through it here, noting one incorrect instruction in it.

VMware directs the user to first shut down the SRM appliance and take a snapshot. But then step 2 asks the user to increase the size of each hard disk. Well, that’s not going to work: once the VM has a snapshot, you can’t increase the disks. I suggest one of two options: instead of taking a snapshot, clone the appliance before increasing the size of the disks; or simply take the snapshot after increasing the disks. Increasing the disks isn’t going to change your OS mounts in any way, so you still have a good copy in case something goes amiss (I tested it here just to add a proof point).

Once you have resized the two disks by 2 GB, power on the VM. Then run the following commands in order as root (or with sudo as admin):

  1. losetup /dev/loop0
  2. dd if=/dev/zero bs=1MiB of=/opt/vmware/support/logs/srm/.SRAs.img conv=notrunc oflag=append count=2000
  3. losetup -c /dev/loop0
  4. resize2fs /dev/loop0
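For context on what these do: step 1 prints the status of the loop device, step 2 appends 2000 blocks of 1 MiB (2000 MiB, a little under 2 GiB) of zeros to the backing image, step 3 tells the loop driver to re-read the image size, and step 4 grows the file system to fill it. The dd step is the one that is easy to get wrong, so here is a sketch of it on a scratch file that needs no root. The function names and the scratch path are hypothetical, and it assumes GNU dd, as the commands above do.

```shell
#!/bin/sh
# Append $2 MiB of zeros to the image file $1 without truncating it.
# Same dd flags as step 2: conv=notrunc keeps the existing data in place,
# oflag=append makes every write land at the end of the file.
grow_image() {
    dd if=/dev/zero bs=1MiB of="$1" conv=notrunc oflag=append count="$2"
}

# Print the size of file $1 in bytes.
size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Example on a scratch file (placeholder path):
#   dd if=/dev/zero of=/tmp/scratch.img bs=1MiB count=1
#   grow_image /tmp/scratch.img 2
#   size_bytes /tmp/scratch.img
```

Dropping conv=notrunc or oflag=append would overwrite the start of the image instead of extending it, which is why it is worth rehearsing on a throwaway file first.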

My run showed that the log directory is now over 2 GB, plenty of space.

So pick whichever method works best for you. I should mention that VMware says the issue with filling the partition is fixed in SRM 8.2.0.1, though I’m not clear what is actually fixed. I’m running 9.0 and still have this problem, and the KB I directed you to is not very old.

 
