I expect most of my readers are familiar with Transparent Page Sharing (TPS). For those in need of a refresher, TPS increases the overall performance capabilities of an ESX/ESXi cluster by ensuring efficient use of server memory. The TPS process achieves this efficiency by eliminating redundant virtual machine memory pages, remapping any duplicate reference back to a single page. The real ‘magic’ in TPS is its ability to transparently share the same pages (aka data) between multiple VMs.
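The remapping idea above can be illustrated with a toy model. This is purely an illustrative sketch, not how the hypervisor actually implements TPS: it assumes pages can be identified by a content hash, and shows how identical guest pages from different VMs collapse onto a single physical copy.

```python
import hashlib


class ToyHost:
    """Toy model of Transparent Page Sharing: identical guest pages
    are remapped to a single physical copy, keyed by content hash.
    (Illustrative only; real hypervisors scan and compare pages.)"""

    def __init__(self):
        self.physical_pages = {}  # content hash -> page data (one copy)
        self.page_tables = {}     # (vm, guest page number) -> content hash

    def write_page(self, vm, page_no, data):
        digest = hashlib.sha256(data).hexdigest()
        # Store the data only once; duplicate pages share the same copy.
        self.physical_pages.setdefault(digest, data)
        self.page_tables[(vm, page_no)] = digest

    def physical_pages_used(self):
        return len(self.physical_pages)


host = ToyHost()
# Three VMs booted from the same template hold an identical OS page.
for vm in ("vm1", "vm2", "vm3"):
    host.write_page(vm, 0, b"identical guest OS page")
host.write_page("vm1", 1, b"vm1-specific data")

print(host.physical_pages_used())  # 2 physical pages back 4 guest pages
```

Four guest pages are backed by only two physical pages; the three identical pages are transparently shared.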
Transparent Page Sharing is a key enabler in allowing the hypervisor to scale memory resources beyond the physical constraints.
This week I’d like to introduce you to Transparent Storage Cache Sharing (TSCS) and how it significantly changes the storage design and architecture in virtual infrastructures by allowing a single block of data to be accessed transparently by multiple virtual machines.
In this series, I will introduce you to the details of TSCS and review the benefits it delivers in a number of real-world use case scenarios.
A Bit of Background
TSCS is not a new NetApp technology; rather, it is a term my team finds very effective for communicating the performance gains provided by the dedupe awareness delivered in our storage platforms and Performance Acceleration Modules.
In the past we referred to these capabilities as ‘Intelligent Caching.’ In hindsight this was a poor choice of words: ‘IC’ lacked clarity around the benefits customers receive from NetApp arrays. It is also worth noting that every storage controller implements some form of advanced caching algorithm in an attempt to increase overall performance, and as such could be labeled as providing intelligent caching.
Why Transparent Storage Cache Sharing is Unique
TSCS allows a block of cached data to be accessed by multiple external requests for that data, or for a subset of it. These references can take many forms, ranging from VM images and user data sets to high-performance enterprise applications.
The enabling components of TSCS are the ability within Data ONTAP to deduplicate storage objects (files, LUNs, volumes) and to create zero-cost clones of storage objects (files, LUNs, volumes). These storage savings technologies get ‘parroted’ quite regularly by some of the vendors offering traditional storage array platforms. For the sake of this discussion I’d like to defer any comparisons around storage savings technologies to a future post, where we can give these technologies the attention required to discuss them in greater detail.
Suffice it to say, even if a traditional storage array platform provides storage savings capabilities, none of these arrays offer Transparent Storage Cache Sharing (aka dedupe-aware storage cache). As such, these legacy architectures do not provide a means to reduce the IOP requirements of the underlying disks.
You need to be very careful when enabling storage savings technologies on traditional array architectures. Doing so introduces many real-world scenarios in which the IOP requirements cannot be met by the underlying disks.
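To make the contrast concrete, here is a toy sketch of a cache keyed by logical block address, which is my simplified stand-in for a traditional array. The class and fingerprint names are hypothetical; the point is that even when data is deduplicated on disk, an address-keyed cache treats every VM’s read as unique, so identical data still generates repeated disk I/O.

```python
class ToyAddressCache:
    """Toy contrast case: a read cache keyed by logical block address.
    Ten VMs reading ten different addresses that happen to contain the
    same data still trigger ten disk reads and ten cache entries.
    (Illustrative sketch only, not any vendor's actual design.)"""

    def __init__(self, disk):
        self.disk = disk      # logical block address -> data
        self.cache = {}       # logical block address -> data
        self.disk_reads = 0

    def read(self, lba):
        if lba in self.cache:
            return self.cache[lba]
        self.disk_reads += 1          # every new address goes to disk
        self.cache[lba] = self.disk[lba]
        return self.cache[lba]


# Ten VMs each read their own copy of identical OS binaries.
disk = {lba: b"guest OS binaries" for lba in range(10)}
cache = ToyAddressCache(disk)
for lba in range(10):
    cache.read(lba)

print(cache.disk_reads)  # 10 disk reads to serve identical data
```

Ten requests for identical data produce ten disk reads, because nothing in the cache understands that the blocks are duplicates.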
This is why TSCS is very different from these legacy architectures. By providing multiple references to sub-file objects in cache, TSCS delivers a number of benefits.
These benefits are easier to understand when viewed in context, so we will review the impact of TSCS in each of these use cases as we progress through this series.
Traditional Use Case #1 – Serving Virtual Machines
Hosting a virtual infrastructure means the binaries accessed by each VM are treated as separate I/O requests from the perspective of the ESX/ESXi host, storage network, array cache, and array disks, even when the data being requested is identical.
In this oversimplified image the upper half represents an ESX/ESXi cluster; the lower half represents the storage array, including cache and disk. I’ve attempted to clarify the point of isolated I/O by identifying and numbering each I/O request.
As you can see, even though the I/O requests are largely identical, there is no logic on the storage platform to recognize the duplication, optimize the I/O transfer, and conserve array resources.
TSCS Use Case #1 – Serving Virtual Machines
As stated above, hosting a virtual infrastructure means the binaries accessed by each VM are treated as separate I/O requests from the perspective of the ESX/ESXi host and storage network; however, with TSCS the storage array cache recognizes when an I/O request is for an identical storage block, and in these cases the NetApp array can serve the redundant requests from the original data residing in cache.
In this image I’ve attempted to demonstrate the reduction in the number of I/O requests made to the disk subsystem. Eliminating the disk load of serving redundant data optimizes the I/O process, reduces the number of requests served by the disks, and increases the effective IOPS delivered per physical disk drive.
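The same toy-model approach can sketch the TSCS side of the picture. This is my own illustrative stand-in, not Data ONTAP’s actual implementation: it assumes each logical block carries a content fingerprint, so the cache can serve requests from different VMs for identical data out of one cached copy instead of going back to disk.

```python
class ToyDedupeCache:
    """Toy model of a dedupe-aware read cache: blocks are cached by
    content fingerprint, so redundant requests from multiple VMs are
    served from a single cached copy. (Illustrative sketch only.)"""

    def __init__(self, disk):
        self.disk = disk      # logical block address -> (fingerprint, data)
        self.cache = {}       # content fingerprint -> data
        self.disk_reads = 0

    def read(self, lba):
        fingerprint, data = self.disk[lba]
        if fingerprint in self.cache:
            return self.cache[fingerprint]  # cache hit: no disk I/O
        self.disk_reads += 1                # only unique data hits disk
        self.cache[fingerprint] = data
        return data


# Ten VMs each read their own logical copy of the same deduplicated block.
disk = {lba: ("os-block-fp", b"guest OS binaries") for lba in range(10)}
cache = ToyDedupeCache(disk)
for lba in range(10):
    cache.read(lba)

print(cache.disk_reads)  # 1 disk read serves all ten VMs
```

One disk read services all ten requests; in the earlier address-keyed model the same workload cost ten disk reads. That difference is the IOPS reduction described above.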
I believe scaling a resource beyond its physical limits is a key enablement of virtualization. What about you, do you share my view?
Wait a Minute, this Technology Sounds Familiar
Some will suggest there are similarities between the NetApp TSCS architecture and VMware’s Linked Clone technology. At a high level these observations are accurate, while also being very misleading. Linked Clones are an incredible enabling technology which delivers a tremendous amount of value on traditional storage arrays.
For purposes of this discussion, one needs to understand that VMs deployed as Linked Clones are not available for use with applications that require a persistent operating environment, such as virtual servers.
There are no such limits with NetApp’s TSCS architecture. We can serve virtual servers, virtual desktops, or any data set without restrictions. For more information on how we deliver this ‘storage magic,’ check out part 2 in this series, where we will discuss TSCS with User Data, Microsoft Exchange Server, and Oracle Database Server.