Monday, March 18, 2013

VNX FAST VP and Storage Pool free space

FAST VP auto-tiering and slice relocation are something most storage admins think about when a pool contains mixed disk types, but they are often overlooked when a pool contains only one disk type and there is no tier for data to move up or down to.  If your array has pools that contain only a single disk type (for example, only 15k SAS), FAST VP still runs its normal tiering policies against those homogeneous pools and will attempt to relocate data slices to disks in private RAID groups that are less utilized than other disks in the pool.  This is done as a "load balancing" process as data ages and runs through its normal life cycle.  More info on FAST VP load balancing can be found on page 10 of the EMC FAST VP white paper (Powerlink login required).

I recently uncovered several slice relocation errors happening daily on a VNX 5300 with the FAST VP enabler installed.  The error text contained "Could not complete operation Start Relocation..." The errors pointed to a homogeneous pool containing only 15k SAS drives and would occur around the same time every morning.  By default, FAST VP is scheduled to start at 11pm and run for 8 hours, until 7am.  The schedule can of course be modified, but I wouldn't advise it, as relocating data on a production array should occur during off hours.  Here's an example of the SP event log error that shows relocation failures:


After lots of digging, I found a Primus article (emc274840) on EMC's support forum that indicates a storage pool without enough free space will cause data relocation errors.  Aside from the recommendations in this Primus article, there are no other published figures (that I could find) stating how much free space to leave in a storage pool for FAST VP to operate efficiently.  I verified that the pool in question on my 5300 was indeed 99.9% utilized and did not have the free space needed for FAST VP to relocate data slices within the pool.  In a nutshell, EMC recommends keeping a minimum amount of free space in a FAST VP enabled pool so that data can be relocated within it.  In FLARE versions 30 and 31, a minimum of 10% of the pool should be left free, unallocated by user LUNs.  In version 32 (Inyo), EMC recommends leaving 5% free in the pool for slice relocations.
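
As a rough illustration of the thresholds above, here is a minimal Python sketch that checks whether a pool has enough free space for relocations.  The function names and the idea of passing in capacities by hand are my own assumptions for the example; the 10% and 5% figures are simply the ones quoted above, and the capacities would have to come from Unisphere or the CLI output for your pool.

```python
# Minimum free space (percent of pool capacity) per FLARE release,
# using the figures quoted in this post. The table and function names
# are hypothetical, not part of any EMC tool.
MIN_FREE_PCT = {30: 10.0, 31: 10.0, 32: 5.0}


def free_pct(total_gb, allocated_gb):
    """Return the percentage of pool capacity not allocated to user LUNs."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return (total_gb - allocated_gb) * 100.0 / total_gb


def has_relocation_headroom(total_gb, allocated_gb, flare_version):
    """True if the pool meets the minimum free space for its FLARE release."""
    return free_pct(total_gb, allocated_gb) >= MIN_FREE_PCT[flare_version]


if __name__ == "__main__":
    # A 10,000 GB pool with 9,990 GB allocated is 99.9% utilized.
    print(has_relocation_headroom(10000, 9990, 32))
```

A pool like the one described here, at 99.9% utilization, fails the check for any of the three releases, while the same pool with 9,500 GB allocated would just meet the 5% figure for version 32.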

If you find yourself below the minimum free space threshold for a pool and encountering relocation errors, you'll have to either use LUN migration to move LUNs out of the pool into a less utilized pool, or add more disks to the pool to expand its capacity.  If you happen to have LUNs in the pool that are presented to Windows 2008 or 2012 hosts, you can also use the LUN shrink option to free up space in the pool.  LUN shrink is supported on the VNX, but will only work with Windows 2008 or 2012 host operating systems.
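
To get a feel for how much space each option needs to recover, here is a small Python sketch of the arithmetic.  It is only an estimate under simple assumptions: the function names are made up for this example, capacities are usable GB as reported for the pool, and RAID overhead on any newly added disks is ignored.  Freeing space (by migrating or shrinking LUNs) leaves the pool size unchanged, while adding disks grows the pool and therefore also grows the amount of free space the percentage threshold requires.

```python
def gb_to_free(total_gb, allocated_gb, min_free_pct):
    """GB to remove from the pool (LUN migration or shrink) to reach the threshold."""
    free_gb = total_gb - allocated_gb
    required_gb = min_free_pct * total_gb / 100.0
    return max(0.0, required_gb - free_gb)


def gb_to_add(total_gb, allocated_gb, min_free_pct):
    """Usable GB to add to the pool so the free space reaches the threshold.

    Solves (free + y) / (total + y) >= min_free_pct / 100 for y.
    """
    shortfall = gb_to_free(total_gb, allocated_gb, min_free_pct)
    return shortfall / (1.0 - min_free_pct / 100.0)


if __name__ == "__main__":
    # 10,000 GB pool, 9,990 GB allocated, 5% target (FLARE 32 figure).
    print(gb_to_free(10000, 9990, 5.0))
    print(round(gb_to_add(10000, 9990, 5.0), 1))
```

For that example pool, about 490 GB would have to be moved or shrunk out of the pool, or roughly 516 GB of usable capacity added, to get back to 5% free.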