This week’s VMware Communities Roundtable featured one of my favorite people at VMware, Paul Manning, who spoke on thin provisioning. This topic is the inspiration for today’s VCE post. Reducing storage costs is top of mind, and it’s going to take some time to cover thin provisioning, so let’s begin!
Virtualizing a Datacenter
I am not exaggerating when I share with you that every customer I meet with has elaborate plans to virtualize most, if not all, of the Intel-based servers within their data center, yet the deployment of this new architecture struggles to move from a portion to a majority of the footprint due to storage costs (both CapEx and OpEx). I’d like to summarize the message from these meetings for you, which goes something like this…
“We love server virtualization! With VMware we’ve reduced our server footprint and are targeting more systems to virtualize.”
“We love converged Ethernet! With Cisco we’ve reduced our port counts and are targeting to have a single platform for voice, data, and user access.”
“Storage costs are out of control; can you help? Prior to virtualization we ran roughly 20%-30% of our servers on shared storage arrays. Virtualization is forcing every VM onto shared storage. What do we do?”
Colour me weird here, but consistently hearing customers facing the same challenge in meeting after meeting inevitably gets an old Sesame Street song stuck playing in my head…
One of these things just doesn’t belong,
Can you tell which thing is not like the others
By the time I finish my song?
In Order to Get Off the Ground this Cloud Needs to Lose Some Weight
It’s no secret: these escalating storage costs are stalling the deployment of fully (or mostly) virtualized data centers. Don’t take my word for it; note the number of technologies VMware has developed and supports in order to help customers reduce the costs associated with storage connectivity (iSCSI & NFS) and storage consumption (linked clones and VMDK thin provisioning).
As one of VMware’s key storage partners, NetApp is ‘all in’ when it comes to reducing the storage footprint for our customers. For today’s post we’re going to focus on the technical details of deploying thin provisioning, so you can feel confident in your decisions on when and where to leverage this technology.
Anatomy of a Virtual Disk
I can’t think of a better place to begin this post than by reviewing the three types of virtual disks available in vSphere, including their similarities, differences, weaknesses, and strengths.
– The Thick Virtual Disk
This is the traditional virtual disk format most of us have deployed with most of our VMs. This format preallocates the capacity of the virtual disk from the datastore at the time it is created, but it does not format (zero) the blocks of the VMDK at the time of deployment. This means that a write of new data must pause while the blocks required to store it are zeroed out. The operation occurs on demand, any time an area of the virtual disk which has never been written to is required to store data.
– The Thin Virtual Disk
This virtual disk format is very similar to the thick format with the exception that it does not preallocate the capacity of the virtual disk from the datastore when it is created. When storage capacity is required, the VMDK allocates storage in chunks equal to the size of the file system block. For VMFS this may be between 1MB & 8MB, and for NFS it will be equal to the block size on the NFS array. The process of allocating blocks on a shared VMFS datastore is considered a metadata operation and as such will place a SCSI lock on the datastore while the allocation operation completes. While this process is very brief, it does suspend the write operations of the VMs on the datastore.
Like the thick format, thin VMDKs are not formatted at the time of deployment, so data being written must also pause while the blocks required to store it are zeroed out. This operation occurs on demand, any time an area of the virtual disk which has never been written to is required to store data.
To summarize the zeroing and allocation differences between a thick and a thin virtual disk, just remember that both will suspend I/O when writing to new areas of disk which need to be zeroed, but before this can occur with a thin virtual disk it may also have to obtain additional capacity from the datastore.
– The Eager Zeroed Thick Virtual Disk
This virtual disk format is similar to the thick format in that it preallocates the capacity of the virtual disk from the datastore when it is created; however, unlike the thick and thin formats, an eager zeroed thick virtual disk actually formats (zeroes) all of its data blocks at the time of deployment. This virtual disk format does not include or require the on-demand allocation and zeroing processes.
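To make the differences concrete, here’s a minimal Python sketch that models all three formats. To be clear, this is an illustration of the behavior described above, not VMware code; the 1MB block size, 10GB disk size, and write pattern are hypothetical values chosen only to show when on-demand allocation and zeroing occur.

```python
# Illustrative model of the three vSphere virtual disk formats. All sizes and
# the write pattern are hypothetical; this only demonstrates when on-demand
# allocation and zeroing happen for each format.

BLOCK_MB = 1  # assume a VMFS datastore formatted with 1MB blocks

class VirtualDisk:
    def __init__(self, size_mb, preallocate, zero_at_creation):
        self.size_mb = size_mb
        self.preallocate = preallocate  # thick and eager zeroed thick preallocate
        self.zeroed = set(range(size_mb // BLOCK_MB)) if zero_at_creation else set()
        self.allocated_mb = size_mb if preallocate else 0  # capacity taken from the datastore

    def write(self, block):
        """Write to one block, reporting any on-demand work required first."""
        work = []
        if not self.preallocate and block * BLOCK_MB >= self.allocated_mb:
            work.append("allocate %dMB from the datastore (metadata update / SCSI lock)" % BLOCK_MB)
            self.allocated_mb += BLOCK_MB
        if block not in self.zeroed:
            work.append("zero the block before the write can complete")
            self.zeroed.add(block)
        return work or ["write proceeds immediately"]

disks = {
    "thick": VirtualDisk(10240, preallocate=True, zero_at_creation=False),
    "thin": VirtualDisk(10240, preallocate=False, zero_at_creation=False),
    "eager zeroed thick": VirtualDisk(10240, preallocate=True, zero_at_creation=True),
}

for name, disk in disks.items():
    print(name, "first write to block 0:", disk.write(0))
    print(name, "second write to block 0:", disk.write(0))
```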
Characteristics of Virtual Disks
– Thick and Thin Virtual Disks consume the same amount of storage on a storage array
This point usually catches most people by surprise, but it is true. If you deploy VMFS datastores on thin provisioned LUNs, the capacity consumed on the array by thin and thick VMDKs is the same. Remember, neither format zeroes its blocks until data is required to be stored.
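A quick back-of-the-envelope example may help here. The numbers below are hypothetical, and this assumes the array only allocates blocks which have actually been written.

```python
# Hypothetical example: a 40GB virtual disk holding 10GB of written data,
# deployed in a VMFS datastore which lives on a thin provisioned LUN.
provisioned_gb = 40
written_gb = 10

# What the datastore sees:
thick_vmdk_in_datastore_gb = provisioned_gb  # thick preallocates its full size
thin_vmdk_in_datastore_gb = written_gb       # thin allocates only what has been written

# What the array sees: only blocks which have actually been written (zeroed and
# filled) are allocated from the thin provisioned LUN, so both formats consume
# the same 10GB of array capacity.
array_gb_for_thick = written_gb
array_gb_for_thin = written_gb

print(thick_vmdk_in_datastore_gb, thin_vmdk_in_datastore_gb)
print(array_gb_for_thick, array_gb_for_thin)
```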
– Application and Feature Support Considerations
There are several use cases which require the virtual disks to be converted to either thick or eager zeroed thick. For example, VMs configured with MSCS or Fault Tolerance (note you cannot run MSCS with FT) are required to be eager zeroed thick. Now don’t fret this one; thick and thin virtual disks can be ‘inflated’ at any time to the eager zeroed thick format.
I would note that the inflation process is completed directly on the VMDK from within the datastore browser, and not as a part of the VM’s properties.
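If you prefer the command line to the datastore browser, the same inflate operation can be driven from the ESX host with vmkfstools. A rough sketch follows; the datastore path is hypothetical, and I’m recalling the --inflatedisk switch from memory, so please confirm it against vmkfstools --help on your host before relying on it.

```python
# Rough sketch: inflate a VMDK to eager zeroed thick from the ESX host.
# The path below is hypothetical and the --inflatedisk switch is from memory;
# verify with `vmkfstools --help` before using this in anger.
import subprocess

vmdk = "/vmfs/volumes/datastore1/myvm/myvm.vmdk"  # hypothetical VMDK path

rc = subprocess.call(["vmkfstools", "--inflatedisk", vmdk])
if rc != 0:
    raise SystemExit("vmkfstools returned %d; the disk was not inflated" % rc)
```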
Understanding Thin – From a Day to Day Perspective
– Thin is only thin on day one
As discussed, thin provisioned virtual disks reduce used storage capacity by not preallocating storage capacity from the datastore and storage array. As one would expect, the size of a VMDK will increase over time; however, the VMDK will grow to a greater capacity than the data which can be measured from within the GOS (guest operating system).
Many customers are surprised to discover that the VMDK grew beyond the capacity of the data which it is storing. The reason for this phenomenon is how deleted data is handled by the GOS file system. When data is deleted, the process merely removes the content from the active file system table and marks the blocks as available to be overwritten. The data still resides in the file system and thus in the virtual disk. This is why you can purchase undelete tools like WinUndelete.
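Here’s a toy model of the effect, using made-up numbers, showing why the thin VMDK only ever grows even though the guest deletes data. It optimistically assumes new writes reuse freed blocks first; in practice NTFS often writes to fresh regions, so the VMDK can grow even faster than this.

```python
# Toy model (hypothetical numbers): guest file system usage vs. thin VMDK size.
# The VMDK tracks the high-water mark of blocks ever written; the guest file
# system merely marks deleted blocks as reusable, so the VMDK never shrinks.

guest_used_gb = 0
vmdk_allocated_gb = 0

def guest_write(gb):
    global guest_used_gb, vmdk_allocated_gb
    guest_used_gb += gb
    # Optimistic assumption: freed blocks are reused before new blocks are touched.
    vmdk_allocated_gb = max(vmdk_allocated_gb, guest_used_gb)

def guest_delete(gb):
    global guest_used_gb
    guest_used_gb -= gb  # only the file system table changes; the data remains

guest_write(20)    # install the OS and applications
guest_write(15)    # copy a large data set into the guest
guest_delete(15)   # delete the data set from within the guest
guest_write(10)    # write new data

print("guest reports %d GB used" % guest_used_gb)            # 30 GB
print("thin VMDK has allocated %d GB" % vmdk_allocated_gb)   # 35 GB
```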
– Avoid Running Defrag Utilities inside of VMs
Many of us have operational processes in place which include the execution of file system defragmentation utilities on a regular basis. Speaking for NetApp, this process should not be required once you have migrated a system from direct attached storage to a shared storage array.
The recommendation to not run defrag utilities should be considered when deploying thin provisioned virtual disks, as the defragmentation process results in the rewriting of all of the data within a VMDK. This operation can cause a considerable expansion in the size of the virtual disk, costing you your storage savings.
Understanding Thin – From an Availability Perspective
– The Goal of Thin Provisioning is Datastore Oversubscription
As we discussed, only thin provisioned virtual disks are able to reduce the capacity consumed within a datastore, and as such thin is the only format with which one can oversubscribe the storage capacity of the datastore. On the surface this may sound like a very attractive option; however, it has a few limitations which you must know before you implement.
The challenge to oversubscribing a datastore is that the datastore, and all of its components (VMFS, LUNs, etc…), are static in terms of storage capacity. While the capacity of a datastore can be increased on the fly, this process is not automated or policy driven. Should an oversubscribed datastore encounter an out of space condition, all of the running VMs will become unavailable to the end user. In these scenarios the VMs don’t ‘crash’, they ‘pause’; however, applications running inside of VMs may fail if the out of space condition isn’t addressed in a relatively short period of time. For example, an Oracle database will remain active for 180 seconds; after that time has elapsed the database will fail.
Note: The out of space condition can occur even if the GOS reports free space within the VMDK, because the write may target a block which has yet to be allocated and zeroed from the datastore. To be clear, this condition can occur with virtual disks, LUNs, and network file systems.
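To put some (hypothetical) numbers behind this, the sketch below works out the oversubscription ratio of a datastore and a rough estimate of how long until an out of space condition pauses every VM on it.

```python
# Hypothetical oversubscription math for a single datastore.
datastore_capacity_gb = 500

vms = {
    "vm01": {"provisioned_gb": 100, "allocated_gb": 35},
    "vm02": {"provisioned_gb": 200, "allocated_gb": 60},
    "vm03": {"provisioned_gb": 300, "allocated_gb": 90},
}

provisioned_gb = sum(v["provisioned_gb"] for v in vms.values())  # 600 GB promised
allocated_gb = sum(v["allocated_gb"] for v in vms.values())      # 185 GB consumed

oversubscription = float(provisioned_gb) / datastore_capacity_gb  # 1.2x
free_gb = datastore_capacity_gb - allocated_gb                    # 315 GB

# If the thin VMDKs grow by a combined ~5 GB per day, this is roughly how long
# you have before the datastore fills and every VM on it is paused.
growth_gb_per_day = 5.0
days_until_full = free_gb / growth_gb_per_day

print("oversubscribed %.1fx, %d GB free, roughly %.0f days until full"
      % (oversubscription, free_gb, days_until_full))
```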
Ensuring Success with Thin Provisioned Virtual Disks
– Ensuring Storage Efficiency
Have I scared you away from deploying thin provisioned virtual disks? My apologies if I have; that isn’t my intention. With any technology there are trade-offs, and I wanted to ensure you were well informed. Now let’s focus on tackling some of the scary scenarios I have covered.
As we covered earlier, GOS file systems hold onto deleted data, and this behavior unintentionally expands the capacity of a thin VMDK. This occurs naturally as file systems age, and at an accelerated rate with the running of defrag utilities. You may be surprised to know that with a little bit of effort you can remove the deleted data from the VM file system (e.g. NTFS or EXT3) and reduce the size of the virtual disk.
In order to accomplish this feat there are two phases one must complete. The first is to zero out the ‘free’ blocks within the GOS file system. This can be accomplished by using the ‘shrink disk’ feature within VMware Tools or with tools like sdelete from Microsoft. The second phase is to use Storage VMotion to migrate the VMDK to a new datastore.
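As a sketch of the first phase for a Windows guest, the snippet below simply shells out to sdelete. It assumes sdelete.exe is already installed in the VM, and the drive letter is hypothetical; the zero-free-space switch is -z in recent sdelete releases (older builds used -c), so check sdelete /? before relying on it.

```python
# Phase one, run inside the Windows guest: zero the free space so the blocks
# holding deleted files become runs of zeros. Assumes sdelete.exe is on the
# PATH; -z is the zero-free-space switch in recent sdelete versions (older
# builds used -c), so verify with `sdelete /?` first.
import subprocess

drive = "C:"  # hypothetical drive to scrub
rc = subprocess.call(["sdelete.exe", "-z", drive])
if rc != 0:
    raise SystemExit("sdelete returned %d; free space was not zeroed" % rc)

# Phase two happens in vCenter: Storage VMotion the VM to another datastore
# with the destination format set to thin, which leaves the zeroed (deleted)
# blocks behind and shrinks the VMDK.
```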
You should note that this process is manual; however, Mike Laverick has posted the following guide which includes how to automate some of the components in this process. Duncan Epping has also covered automating parts of this process.
– Ensuring VM Availability
Before one places an oversubscribed datastore into production, one needs to deploy a mechanism that ensures the datastore never fills. By leveraging VMware alarms, Storage VMotion, and a little bit of scripting knowledge, we can create a datastore capacity monitor and automated migration tool which can ensure the availability of our VMs.
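Here is a bare-bones skeleton of what such a monitor could look like. The two helper functions are hypothetical stubs standing in for whatever you actually use (a vCenter alarm action, PowerCLI, or the vSphere SDK), and the 10% threshold and five-minute poll interval are arbitrary examples.

```python
# Skeleton of a datastore capacity monitor. The helpers are hypothetical stubs;
# in a real deployment they would call the vSphere SDK, PowerCLI, or be wired
# to a vCenter alarm action. The threshold and poll interval are arbitrary.
import time

FREE_SPACE_THRESHOLD = 0.10  # act when less than 10% of the datastore is free

def get_datastore_usage(datastore):
    """Stub: return (capacity_gb, free_gb) for the named datastore."""
    return 500.0, 40.0

def storage_vmotion_one_vm(datastore, destination):
    """Stub: Storage VMotion a VM off the filling datastore to the destination."""
    print("migrating a VM from %s to %s" % (datastore, destination))

def check(datastore, destination):
    capacity_gb, free_gb = get_datastore_usage(datastore)
    if free_gb / capacity_gb < FREE_SPACE_THRESHOLD:
        storage_vmotion_one_vm(datastore, destination)

if __name__ == "__main__":
    while True:
        check("datastore1", "overflow_datastore")  # hypothetical datastore names
        time.sleep(300)  # poll every five minutes
```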
Again we luck out here, as Eric Gray at vCritical has shared his expertise and skill by documenting a process to implement such a solution.
Remember There’s No Such Thing as a Free Lunch
In this section I have shared with you how the community of VMware experts has begun addressing the hurdles around deploying thin provisioned VMDKs. While these solutions are very solid, there is an aspect that appears to be overlooked: is there an impact on the storage array and its related activities?
As you have seen, we have been able to overcome some hurdles by implementing scripts that primarily leverage the capabilities of Storage VMotion. This is an area we need to dig into. As Storage VMotion copies data from one datastore to another, we need to highlight that the original datastore will still contain the original data of the VMDK. Remember, deleted data is still data. As with a GOS file system, the blocks are free but the storage array is still storing the deleted data. These blocks can be reused, but only as they are overwritten by another VM in the original datastore. These blocks are not returned to a global pool for reallocation in another datastore.
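A small worked example, with hypothetical numbers, of what the array sees during this shrink-and-migrate workflow:

```python
# Hypothetical array-side accounting for the shrink-and-migrate workflow.
# The source datastore holds a 35GB thin VMDK: 20GB of live data plus 15GB of
# blocks that only contain deleted (and now zeroed) files.

source_vmdk_gb = 35
live_data_gb = 20

# After Storage VMotion the destination copy is thin and only 20GB, but the
# blocks on the source datastore are not handed back to a global pool; the
# array keeps storing them until other VMs in that datastore overwrite them.
destination_vmdk_gb = live_data_gb
array_consumption_gb = source_vmdk_gb + destination_vmdk_gb

print("destination VMDK: %d GB" % destination_vmdk_gb)        # 20 GB
print("array still consuming: %d GB" % array_consumption_gb)  # 55 GB, not 20
```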
Also, these additional Storage VMotion processes do place extra load on the storage array, storage network, and ESX/ESXi hosts. Please consider scheduling this activity accordingly based on business demands. I realize this may not be possible if failure to immediately migrate a VM will result in multiple VMs becoming unavailable.
My final precaution: should any of the Storage VMotion migrated VMs be included as part of an SRM recovery plan, their migration to a new datastore will require them to be replicated again, in their entirety, to your DR location. So, as in the last paragraph, please schedule accordingly and ensure you have the bandwidth to complete these ‘re-baseline’ operations in whatever windows you may be assigned for these tasks.
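A rough, back-of-the-envelope estimate of that re-baseline window, using made-up VM sizes and link speed:

```python
# Back-of-the-envelope re-baseline estimate (all numbers hypothetical): how long
# re-replicating the migrated VMs in their entirety will take over the DR link.
vm_sizes_gb = [40, 60, 100]  # VMs that were Storage VMotioned to a new datastore
link_mbps = 100              # replication bandwidth to the DR site
efficiency = 0.7             # protocol overhead, competing traffic, etc.

total_gb = sum(vm_sizes_gb)
effective_mbps = link_mbps * efficiency

# GB -> megabits, divided by effective throughput, converted to hours
hours = (total_gb * 8 * 1024) / (effective_mbps * 3600.0)

print("re-baseline of %d GB takes roughly %.1f hours" % (total_gb, hours))
```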
In Closing
The storage savings technologies made available by VMware are truly revolutionary, as they deliver savings that were unheard of with physical servers and add capabilities not found in most storage devices. As with any technology, there are deployment considerations to weigh prior to use, and I hope we have covered most of them around thin provisioning.
As I stated at the beginning of this post, this is part 1 of 2. What I have covered today is the basics of how VMware’s thin provisioning works on any storage array from any vendor. Tomorrow I will post part 2, where I will cover technical enhancements, some in the form of the VAAI program and some proprietary to NetApp’s Data ONTAP. I will share how these offerings enhance the storage savings efforts of thin provisioning by either eliminating or greatly simplifying a number of the hurdles listed above.
Until tomorrow, good night, and remember, Virtualization Changes Everything!
This makes me want to go write a script to run weekly on all my VMs that kicks off the “shrink disk” feature in Tools.
I’m using a lot of thinprov vmdk’s within thin provisioned shared storage.
Nice writeup, Vaughn. 🙂
Disclosure EMCer here…
Minor FYI, Vaughn – it’s not a typo in the EMC slides you used; Thick and “Zeroedthick” are actually the same thing (those are the command line arguments for creating vmdks of different types; the GUI simply shows zeroedthick as “thick”). The three vmdk types (being explicit) are actually: thin, zeroedthick, and eagerzeroedthick.
Thanks for the update, Chad. I’ll correct. I want to be clear so readers don’t confuse eager zeroed thick with thick (aka zeroedthick).
I’d like to add some data as a vendor of defrag solutions. I agree that defrag of thin provisioned disks/volumes needs to be weighed against the benefits. Blind application or dismissal of defrag should not be undertaken without due diligence. Most proprietary third party defrag solutions apply advanced algorithms that minimize “movement” and thereby unnecessary expansion. That said, as storage technologies advance, so must the accompanying solutions. To that end, there are now file system drivers available that prevent most fragmentation from occurring at the file system level, built for these types of environments. That means no need to move data around after the fact, which is the factor that contributes to unnecessary growth of virtual disks. Please contact me and I will provide you NFRs.
Thank you for a wonderful blog post on VDI.
We ran into an issue similar to the one mentioned in your article with thin provisioned volumes.
“As discussed, thin provisioned virtual disks reduce used storage capacity by not preallocating storage capacity from the datastore and storage array. As one would expect, the size of a VMDK will increase over time; however, the VMDK will be of a greater capacity than the data which can be measured from within the GOS.
Many customers are surprised to discover that the VMDK grew beyond the capacity of the data which it is storing. The reason for this phenomenon is deleted data is stored in the GOS file system. When data is deleted the actual process merely removes the content from the active file system table and marks the blocks as available to be overwritten. The data still resides in the file system and thus in the virtual disk.”
The current concern is about how to get back the space occupied by deleted files. We tested “sdelete” and it does not seem to help.
“Storage vMotion” may be the right option, but there are close to 3000 desktops and it is a tough task to do.
Appreciate any thoughts and ideas from you on getting back the deleted space.
Along with that issue with “how to recover deleted space” within the vmdk, there is also the issue as to how to recover blocks that WAFL has no idea have been marked as “free”. I’m hoping for a Space Reclaimer for VMFS to be included in some sort of vStorage plug-in – maybe the Virtual Storage Console?
http://communities.netapp.com/message/18848#18848
There’s a lot going on in this space which I cannot comment on at the moment. If you have a NetApp partner or account team, maybe we can set up an NDA discussion?
Hi Vaughn,
I posted a response to your comment at http://vpivot.com/2010/02/12/windows-guest-defragmentation/
Please email me to discuss in detail.