I’ve been running hard lately on the treadmill of progress lab work, but as the year is winding down I have a bit of time to blog. The singular thing that has consumed me for the past couple of months is VVols. Andy Banta has done a very good job of laying out the ‘why’ of SolidFire’s VVol implementation here. Fair warning, it requires registration.
I’m not going to rehash what VVols is or how pieces of it work, but I did want to cover some of the unique or cool bits about SolidFire’s implementation. Everyone has the same framework to work against, so SolidFire’s VVols implementation looks very similar to most other implementations on the surface. However, there are four main areas where there is some special sauce in the mix.
The VASA Provider – I Have People Skills
In the VVols world the VASA provider is much like Tom from Office Space. He takes the requirements from the customer (vCenter) and gives them to the engineers (SolidFire). Unlike Tom, the VASA provider is a critical path piece of code. If it goes away, nothing changes; the problem is that nothing can be changed. If the VASA provider is dead, you lose the ability to change power state for VMs, vMotion, sVMotion, snap, etc.
To protect the VASA provider SolidFire elected to implement it as a service running internally on the SolidFire cluster instead of relying on an external VM. Technically, the VASA provider is running on all nodes in a SolidFire cluster, but only the instance on the cluster master responds to VASA requests. This makes failover of the VASA provider extremely fast and resilient.
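To make the "runs on every node, answers from one" idea concrete, here is a toy Python model. It is not SolidFire code: the `Node` and `Cluster` classes, the `handle_vasa_request` method, and the lowest-ID election rule are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    alive: bool = True

    def handle_vasa_request(self, request: str) -> str:
        # Every node runs the provider service, so any node can answer
        # as soon as it becomes the cluster master.
        return f"node {self.node_id} handled {request}"


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)

    def master(self) -> Node:
        # Hypothetical election rule: the lowest-numbered live node wins.
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no live nodes")
        return min(live, key=lambda n: n.node_id)

    def vasa(self, request: str) -> str:
        # Only the current master's instance responds to VASA requests.
        return self.master().handle_vasa_request(request)


cluster = Cluster([Node(1), Node(2), Node(3), Node(4)])
before = cluster.vasa("snapshot request")
cluster.nodes[0].alive = False  # the master fails
after = cluster.vasa("snapshot request")
```

Because the service is already running on the surviving nodes, failover in this sketch is just a change in who answers. Nothing has to be booted or restored first.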
Protocol Endpoints – The SAN All Access Pass
PEs… all VVol implementations have them. However, most implementations have one or two for the entire array. SolidFire chose to map a PE for every node in the cluster. This is all in keeping with the scale out nature of SolidFire – if you add nodes you should be adding performance and capacity seamlessly. It’s hard to do that if you don’t scale out the access points along with the cluster.
At the smallest scale of four nodes that means you also have four PEs exposed to your ESXi hosts to push I/O through, so SolidFire is not likely to get log-jammed on PE queues. In vCenter these PEs show up as 512 byte devices and the number (and your performance) will automatically grow/shrink with the SF cluster.
Under normal circumstances, there will always be a one to one mapping of SolidFire nodes to PEs, but in some failure scenarios we can actually move a PE from one node to another. We also support I/O routing in cases where SolidFire has moved the VVol but has not had a rebind event to tell ESXi that the VVol is now on another PE (SF node).
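The PE-per-node layout and the routing behavior before a rebind can be sketched with a small Python model. This is only a mental model under assumed names (`PEMap`, `bind`, `move_vvol`, `route_io`, `rebind`, and the `PE-<n>` labels are hypothetical), not how the array is actually implemented.

```python
class PEMap:
    """Toy model: one protocol endpoint (PE) per node."""

    def __init__(self, node_ids):
        # Normal case: one PE per node, hosted on that node.
        self.pe_host = {f"PE-{n}": n for n in node_ids}
        self.vvol_node = {}      # node that currently holds the VVol
        self.vvol_bound_pe = {}  # PE that ESXi currently uses for the VVol

    def add_node(self, node_id):
        # Growing the cluster adds another access point.
        self.pe_host[f"PE-{node_id}"] = node_id

    def bind(self, vvol, node_id):
        self.vvol_node[vvol] = node_id
        self.vvol_bound_pe[vvol] = f"PE-{node_id}"

    def move_vvol(self, vvol, new_node):
        # The array relocates the VVol; ESXi has not rebound yet.
        self.vvol_node[vvol] = new_node

    def route_io(self, vvol):
        # I/O enters through the PE ESXi knows about and is forwarded
        # to whichever node currently owns the VVol.
        entry_node = self.pe_host[self.vvol_bound_pe[vvol]]
        owner_node = self.vvol_node[vvol]
        return entry_node, owner_node

    def rebind(self, vvol):
        # After a rebind event, ESXi talks to the owner's PE directly.
        self.vvol_bound_pe[vvol] = f"PE-{self.vvol_node[vvol]}"


pes = PEMap([1, 2, 3, 4])
pes.bind("vvol-a", 1)
pes.move_vvol("vvol-a", 3)
before_rebind = pes.route_io("vvol-a")  # enters on node 1, served by node 3
pes.rebind("vvol-a")
after_rebind = pes.route_io("vvol-a")   # enters on and is served by node 3
pes.add_node(5)
pe_count = len(pes.pe_host)
```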
Storage Containers – Contain Yourself
One of the cooler features (IMHO) about storage containers on SolidFire is that they are completely logical constructs. There is no physical ‘thing’ that is a storage container for SolidFire. This has some pretty cool ramifications.
- The storage container can dynamically resize (also enabled by the fact that there is no file system with VVols –> no VMFS-like extents to worry about)
- There is nothing to manage for a storage container –> you give it a name and you are done. No need to worry about partitioning a portion of the cluster for VVols.
- The storage container has access to all cluster resources –> when you add nodes to your cluster, your storage container gets bigger/faster automatically
- vCenter automatically sees increased capacity in the storage container when you add nodes to the cluster. You can see a demo of this below.
QoS – We Offer Only the Highest Quality Service
One of the hallmark features of SolidFire has always been the native QoS abilities of the platform. SolidFire QoS is set per volume, so in VMFS deployments this turned into a shared QoS budget for an entire VMFS datastore. In the VVols world, since every VVol (well, most, but that is another story) is backed by a SolidFire volume, we can now set QoS metrics on individual disks.
Managing QoS per disk sounds like quite the nightmare – and it would be if you had to do it by hand. However, VMware has given us this nifty thing called storage policy based management (SPBM). SPBM allows us to programmatically set QoS via an SPBM policy.
When this policy is applied to a VM, the VASA provider will set the QoS min/max/burst values automatically. If a single policy is applied to the VM, then all VM data disks get the same QoS value. However, you can also apply multiple policies per VM in the cases where there are multiple disks involved.
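As a rough illustration of one policy per VM versus a separate policy per disk, here is a small Python sketch. The `QoSPolicy` class, the `apply_policies` helper, the policy names, and every IOPS number are made up for the example. In a real deployment the policy lives in vCenter and the VASA provider does this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSPolicy:
    name: str
    min_iops: int
    max_iops: int
    burst_iops: int


def apply_policies(disks, default, overrides=None):
    """Return the policy each disk ends up with.

    `default` stands in for a single VM-wide policy; `overrides` maps
    individual disk names to their own policy.
    """
    overrides = overrides or {}
    return {disk: overrides.get(disk, default) for disk in disks}


# Hypothetical policies and values, purely for illustration.
gold = QoSPolicy("gold", min_iops=1000, max_iops=5000, burst_iops=8000)
bronze = QoSPolicy("bronze", min_iops=100, max_iops=500, burst_iops=1000)

disks = ["data-1", "data-2", "logs"]
uniform = apply_policies(disks, gold)                  # one policy for the VM
mixed = apply_policies(disks, gold, {"logs": bronze})  # per-disk override
```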
This allows setting a per-disk performance guarantee for those business-critical VMs that need a predictable level of performance day in and day out, no matter what is going on with the cluster. All of it is managed by policy and handled automatically as part of the provisioning of the VM itself. If you need to change the performance, just edit the policy or apply a new one and QoS is changed instantly without having to vMotion the VM to another datastore.
So, if you have a SolidFire cluster and are running Element OS 9.0+ and vSphere 6.0+, you should definitely kick the tires on VVols. Feel free to dip your toes in; you can run VVols and traditional VMFS side by side.