Recreating VCF Managed vSAN Disk Groups
This post covers incorrectly tagged disks on all-flash vSAN hosts that are part of a VCF domain. When you create or expand a VCF domain backed by vSAN, VCF automatically configures the vSAN disk groups on the hosts being provisioned. This article refers to these as ‘VCF vSAN Disk Groups’.
vSAN disk groups are created based on how the disks are tagged. A disk group has a cache tier and a capacity tier. If a disk is meant to be a capacity drive, it must be marked as one.
The process for achieving this can be found here.
How does this situation occur?
I have mostly seen this issue when a host is being repurposed, rebuilt or reconfigured in some way. There may be other causes, but I have spent more time resolving the problem after the fact than testing how it arises.
Effects of misconfigured drive tagging
If you stand up or add a host to a cluster and its drives are not tagged the same way as on the other hosts, the total storage of the vSAN datastore is affected. Performance suffers too, because the disk intended for the cache tier is usually faster and more robust than the capacity disks.
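To make the storage impact concrete, here is a rough sketch of the arithmetic. The disk sizes are hypothetical and not taken from the hosts in this post. It also assumes the mis-tagged host ends up with one large disk as cache and the small disk plus the remaining large disk as capacity.

```python
# Hypothetical disk sizes in GB; not taken from the hosts in this post.
SMALL_DISK_GB = 100   # the disk intended for the cache tier
LARGE_DISK_GB = 500   # each disk intended for the capacity tier


def capacity_contribution(capacity_tier_gb):
    """Raw capacity a disk group adds to the vSAN datastore.

    Only capacity-tier disks count; the cache disk adds nothing.
    """
    return sum(capacity_tier_gb)


# Correct tagging: small disk is cache, both large disks are capacity.
correct = capacity_contribution([LARGE_DISK_GB, LARGE_DISK_GB])

# Assumed mis-tagging: a large disk became cache, the small disk is capacity.
mistagged = capacity_contribution([SMALL_DISK_GB, LARGE_DISK_GB])

print(correct, mistagged, correct - mistagged)  # 1000 600 400
```

With these made-up sizes, the mis-tagged host contributes 400 GB less raw capacity than its correctly tagged neighbours.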
Identifying the issue
The quickest way to check for this issue is in vCenter. The image below shows a host with the correct configuration: the cache tier is the smaller drive, and the capacity tier consists of two larger drives.
Now compare the above image with a host that has been misconfigured.
Notice that on vcfesxi1 the cache tier is a larger disk, and the capacity tier is the disk that is supposed to be the cache disk.
The following image shows the incorrect tag on vcfesxi1 for the cache disk.
The last disk is the smaller cache disk, and “IsCapacityFlash” is set to 1 (true). Because of this, when the vSAN disk group is created, this disk is used as part of the capacity tier and not the cache tier. The image below is the same output on a correctly configured host.
Here you can see the smaller cache disk is not tagged as a capacity flash disk, whereas the two larger disks are.
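If you have many hosts to check, the same comparison can be scripted. The following is a minimal sketch, assuming you have already collected each host's per-disk output into dictionaries. The "name" and "size_gb" keys and the disk names are hypothetical; only "IsCapacityFlash" comes from the output shown above. It also assumes the layout described in this post, where the smallest disk is the intended cache device.

```python
def find_mistagged(disks):
    """Return names of disks whose capacity flag looks wrong.

    Assumes an all-flash host where the smallest disk is the intended
    cache device and every other disk is intended for capacity.
    """
    smallest = min(disks, key=lambda d: d["size_gb"])
    suspects = []
    for disk in disks:
        tagged_capacity = str(disk["IsCapacityFlash"]) == "1"
        should_be_capacity = disk is not smallest
        if tagged_capacity != should_be_capacity:
            suspects.append(disk["name"])
    return suspects


# Hypothetical records resembling the misconfigured host: the small
# cache disk is wrongly flagged as capacity flash.
host_disks = [
    {"name": "disk-a", "size_gb": 500, "IsCapacityFlash": "1"},
    {"name": "disk-b", "size_gb": 500, "IsCapacityFlash": "1"},
    {"name": "disk-c", "size_gb": 100, "IsCapacityFlash": "1"},
]

print(find_mistagged(host_disks))  # ['disk-c']
```

An empty list means the flags match the expected layout. If your cache device is not the smallest disk, the size heuristic does not apply and the check needs a different rule.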
Resolving Misconfigured VCF vSAN Disk Groups
To resolve this issue, the disk group has to be removed from the host, which means all data on it must be migrated elsewhere or it will be deleted. I have also checked what VCF (SDDC Manager) keeps track of, and vSAN disk groups and their UUIDs are not among those items, so performing this fix should not cause you any problems. As with anything on the internet, seek advice from support first if you feel it is warranted.
First navigate to vSAN Disk Management.
After clicking remove, you will be presented with the below screen.
Click on “GO TO PRE-CHECK”, and if it is successful, you should see something similar to the below.
From this screen, click on remove and it will prompt you one final time that the data will be deleted unless migrated.
This process can take a while, depending on how much data the disk group held and the performance of your infrastructure. Once it is complete, navigate back to vSAN Disk Management, and select “CLAIM UNUSED DISKS”.
Ensure the correct disk is selected as the cache tier. After you click create, the disk group is created, and as part of this the disks are correctly flagged as either capacity or cache.
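As a sanity check before clicking create, you can write down the layout you expect and compare it with what the claim dialog proposes. The sketch below only encodes the rule of thumb used in this post (smallest disk for cache, the rest for capacity). The disk names, sizes and the "size_gb" key are hypothetical, and it does not talk to vCenter at all.

```python
def plan_disk_group(unused_disks):
    """Propose (cache_disk_name, [capacity_disk_names]) for one disk group.

    Rule of thumb from this post: the smallest disk is the cache device
    and every remaining disk goes to the capacity tier.
    """
    if len(unused_disks) < 2:
        raise ValueError("need one cache disk and at least one capacity disk")
    ordered = sorted(unused_disks, key=lambda d: d["size_gb"])
    cache, capacity = ordered[0], ordered[1:]
    return cache["name"], [d["name"] for d in capacity]


# Hypothetical unused disks on the host after the old disk group was removed.
unused = [
    {"name": "disk-a", "size_gb": 500},
    {"name": "disk-c", "size_gb": 100},
    {"name": "disk-b", "size_gb": 500},
]

cache_name, capacity_names = plan_disk_group(unused)
print(cache_name, capacity_names)  # disk-c ['disk-a', 'disk-b']
```

If the dialog's proposed cache disk differs from the one this prints, correct the selection before creating the disk group.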
And that’s it, the misconfigured VCF vSAN Disk Group is now fixed!
For other VCF tips, view The Unofficial VCF Troubleshooting Guide!