VSAN 6.0 – What’s New

October 25, 2021 | November 8, 2021 | Mark Warren

Among other things, VMware released Virtual SAN 6.0 earlier this month, in conjunction with vSphere 6.0. Realistically, this is a “2.0” release, but I am guessing they are calling it “6.0” to conform with the vSphere releases. I think that is a bad idea because it can put undue pressure on the developers to keep up with the cadence of vSphere, vRealize and everything else that is taking on the “6.0” versioning. But the marketing people feel it makes compatibility questions easier to manage.

Let’s face it, whether it is called 2.0 or 6.0, it is still a “.0” release with plenty of new things to go awry. The good news is that VMware is finally saying that it is ready for Tier-1 workloads, so things look promising. That matters because VSAN is foundational for the latest in bullshit bingo buzzwords: “Hyperconverged Infrastructure Appliances” or HCIA. But HCIA is fodder for another post.

About two years ago, VMware acquired a storage hypervisor company called Virsto. Since then, VMware stripped away support for Hyper-V and baked the VirstoFS into vSphere. This is probably one of the reasons why VMware is saying they are ready for Tier-1. The new “VSAN FS” (I have also seen “VMFS-L”) provides some scalability and feature improvements over its predecessor.

Founded in 2007, Virsto was first released in 2010 and was acquired by VMware in 2013. In the beginning, Virsto was based on a few different purpose-built VMs. I hate that VMware baked Virsto’s functionality into the hypervisor. It is one more security risk, one more thing to go wrong, and more bloat in the hypervisor. I could get into an entirely separate conversation about any of those, but I like the idea of a streamlined hypervisor with purpose-built appliances for other services.

The pitch from Virsto was similar to what most vendors pitch for all-flash or hybrid storage arrays. Their special sauce was to take a pot of random stew and serialize the writes for more efficient IO. With a resilient logging mechanism, acknowledgments are sent back to the host before the blocks get re-ordered into a more efficient package and sent to the final disk destination for persistence. This delivers faster performance and puts less strain on the cells in the SSD.
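The log-then-reorder idea is easier to see in a toy model. The sketch below is only an illustration of the general technique, not Virsto's or VMware's actual code; the WriteLog class and its method names are made up for this example. Writes land in an append-only log and are acknowledged immediately, and a later flush keeps the newest copy of each block and destages in block order.

```python
class WriteLog:
    """Toy write-serializing log (illustrative only, not Virsto's code)."""

    def __init__(self):
        self.log = []    # append-only log of (block, data) in arrival order
        self.disk = {}   # stand-in for the final persistent disk

    def write(self, block, data):
        # Random writes become one sequential append; ack before destaging.
        self.log.append((block, data))
        return "ack"

    def flush(self):
        # Keep only the newest data per block, then destage in block order.
        latest = {}
        for block, data in self.log:
            latest[block] = data
        ordered = sorted(latest.items())
        for block, data in ordered:
            self.disk[block] = data
        self.log.clear()
        return [block for block, _ in ordered]


log = WriteLog()
for block, data in [(907, "a"), (12, "b"), (455, "c"), (12, "d")]:
    log.write(block, data)      # each call returns "ack" right away
destaged = log.flush()          # blocks written to disk, in sorted order
```

Note that block 12 was written twice but only hits the "disk" once. That coalescing is where the reduced wear on the SSD cells would come from in a scheme like this.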

Scalability Improvements

One thing I didn’t like about VSAN 1.0 was the sizing restrictions. The scalability improvements allow for clusters of up to 64 hosts, 200 VMs per host and 6400 VMs per cluster. The new format also allows for 62TB vDisks. Because of the new format, VMware is claiming to provide 40,000 IOPS per host in a hybrid configuration and up to 90,000 IOPS per host with all-flash configurations.
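Those limits interact: 64 hosts at 200 VMs each would be 12,800 VMs, twice the per-cluster maximum. A quick sanity check along these lines can catch that at design time. This is a minimal sketch using only the numbers quoted above; check_design is a hypothetical helper, not a VMware tool.

```python
# Limits quoted in this post for VSAN 6.0.
MAX_HOSTS_PER_CLUSTER = 64
MAX_VMS_PER_HOST = 200
MAX_VMS_PER_CLUSTER = 6400
MAX_VDISK_TB = 62


def check_design(hosts, vms_per_host, largest_vdisk_tb):
    """Return a list of limit violations for a proposed cluster."""
    problems = []
    if hosts > MAX_HOSTS_PER_CLUSTER:
        problems.append(f"{hosts} hosts exceeds {MAX_HOSTS_PER_CLUSTER} per cluster")
    if vms_per_host > MAX_VMS_PER_HOST:
        problems.append(f"{vms_per_host} VMs per host exceeds {MAX_VMS_PER_HOST}")
    total_vms = hosts * vms_per_host
    if total_vms > MAX_VMS_PER_CLUSTER:
        problems.append(f"{total_vms} VMs exceeds {MAX_VMS_PER_CLUSTER} per cluster")
    if largest_vdisk_tb > MAX_VDISK_TB:
        problems.append(f"{largest_vdisk_tb}TB vDisk exceeds {MAX_VDISK_TB}TB")
    return problems


full_cluster = check_design(hosts=64, vms_per_host=200, largest_vdisk_tb=10)
half_cluster = check_design(hosts=32, vms_per_host=200, largest_vdisk_tb=62)
```

Here full_cluster flags the per-cluster VM limit, while half_cluster comes back clean. In other words, a fully populated 64-host cluster can only average 100 VMs per host.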

Incidentally, with the all-flash configuration, it is recommended to use a higher endurance SSD for caching (SLC or eMLC) and a lower endurance drive for persistence (MLC). I am guessing that the logs are kept in the cache disk.

Caveats

According to the Configuration Maximums Guide, you can have a maximum of 1 Failure to Tolerate with a vDisk greater than 16TB and up to 3 Failures to Tolerate for a vDisk that is equal to or less than 16TB. This is a design restriction and could cause heartache if you listen to just marketing!
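The rule is simple enough to encode as a guard in whatever sizing script you use. A minimal sketch, assuming only the figures above (16TB threshold, 62TB maximum vDisk size); the function name is hypothetical:

```python
def max_failures_to_tolerate(vdisk_tb):
    """Highest Failures to Tolerate setting allowed for a vDisk of this size."""
    if vdisk_tb <= 0 or vdisk_tb > 62:
        raise ValueError("vDisk size must be between 0 and 62TB")
    # Up to and including 16TB allows 3 failures; anything larger allows 1.
    return 3 if vdisk_tb <= 16 else 1


small = max_failures_to_tolerate(16)    # at the threshold
large = max_failures_to_tolerate(16.5)  # just over the threshold
```

So a 62TB vDisk is possible, but only with a single failure tolerated.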

What this means to you

Bigger, faster VMs in a more efficient storage environment. But using all-flash will cost more in licensing. Just watch the marketing speak when designing your system.

Availability Improvements

Several enhancements have been made as a result of the new format. These include the ability to create “Fault Domains.” This has also been called “rack awareness,” although VSAN really doesn’t know where it is living. But, now you can spread your cluster across racks or rows in a data center and create a fault domain based on the physical layout of your cluster. There is also a more granular control for servicing hardware. You can now evacuate data from a single disk or a disk group if replacement is needed. For safety, you can make the blinky lights turn red from the Web Client. Blinky lights may be considered an operational enhancement, but it is really about maintaining availability.
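To picture what a fault domain buys you, here is a toy placement routine. It is a sketch of the general idea only: the rack and host names are invented, place_replicas is a hypothetical function, and it ignores witness components and everything else VSAN's real placement logic considers. The single rule it enforces is that no two copies of an object share a fault domain.

```python
def place_replicas(hosts_by_domain, copies):
    """Pick one host from each of `copies` distinct fault domains."""
    # Only domains that actually contain hosts can hold a copy.
    usable = sorted(d for d, hosts in hosts_by_domain.items() if hosts)
    if copies > len(usable):
        raise ValueError("not enough fault domains for the requested copies")
    # Take the first host in each chosen domain; one copy per domain.
    return [(domain, hosts_by_domain[domain][0]) for domain in usable[:copies]]


racks = {
    "rack-a": ["esx01", "esx02"],
    "rack-b": ["esx03", "esx04"],
    "rack-c": ["esx05"],
}
placement = place_replicas(racks, copies=2)
```

With two copies, the data ends up in rack-a and rack-b, so losing either rack still leaves one copy intact.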

The new format allows snapshots and VM Clones to occur faster than before and you can have up to 32 snapshots per VM.

What this means to you

If you sustain a full rack or row failure, your data is protected and your VMs will restart via HA on a host in a different rack or row. With control of the blinky lights, your hardware admin is less likely to make things worse by pulling the wrong disk. Faster snapshots and clones mean faster VM deployment and backup readiness.

Other Improvements

VSAN 6.0 now includes support for hardware encryption and checksum. If you have hardware that can encrypt data and it is on the HCL, you can use it in VSAN. You can now set a default Storage Policy Based Management (SPBM) policy for VSAN. Certain external disk enclosures are now supported. You can deploy VSAN across L3 networks. Data rebalancing and Health services dashboards are also included.

What this means to you

Encrypted drives mean encrypted data (at rest, anyway). Administrators no longer need to select a specific policy for a VM to be placed on VSAN. You can use VSAN in blade environments if you wish. There is no need for a dedicated L2 network. You get better control of data load.

Conclusion

Although I don’t agree with the idea of baking additional services into the hypervisor, VSAN may meet your needs because it is fairly easy to set up and deploy. Be aware that you need to do some CLI magic to get it to work in an all-flash configuration. VMware is trying to make it even easier with the EVO:RAIL program. More on EVO:RAIL later.