I expect most of my readers are familiar with Transparent Page Sharing (TPS). For those in need of a refresher, TPS increases the overall performance capabilities of an ESX/ESXi cluster by ensuring efficient use of server memory. It achieves this efficiency by eliminating redundant virtual machine memory pages, remapping every duplicate reference back to a single page. The real ‘magic’ in TPS is its ability to transparently share the same pages (aka data) between multiple VMs.
Transparent Page Sharing is a key enabler in allowing the hypervisor to scale memory resources beyond the physical constraints.
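For readers who think better in code, here is a minimal sketch of the idea behind content-based page sharing. It is a toy model and not VMware's implementation: the `SharedPageStore` class, the 4 KB page size, and the use of SHA-256 as the content fingerprint are all assumptions made for illustration, and write handling (copy-on-write) is left out entirely.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; assumed page size for illustration


class SharedPageStore:
    """Toy model of content-based page sharing (hypothetical, not ESX code)."""

    def __init__(self):
        self.pages = {}      # content digest -> the single stored copy
        self.mappings = {}   # (vm, guest page number) -> content digest

    def map_page(self, vm, page_number, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the page only if no identical page is already resident.
        self.pages.setdefault(digest, data)
        self.mappings[(vm, page_number)] = digest

    def read_page(self, vm, page_number):
        # Every duplicate reference resolves to the same stored page.
        return self.pages[self.mappings[(vm, page_number)]]


store = SharedPageStore()
kernel_page = b"\x90" * PAGE_SIZE
for vm in ("vm1", "vm2", "vm3"):
    store.map_page(vm, 0, kernel_page)                          # identical in every VM
    store.map_page(vm, 1, vm.encode().ljust(PAGE_SIZE, b"\0"))  # unique per VM

logical_pages = len(store.mappings)   # pages the VMs believe they own
physical_pages = len(store.pages)     # pages actually held in memory
print(logical_pages, physical_pages)
```

In this sketch three VMs map six pages between them, yet only four are stored, because the identical page is kept once and shared.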
This week I’d like to introduce you to Transparent Storage Cache Sharing (TSCS) and how it significantly changes the storage design and architecture in virtual infrastructures by allowing a single block of data to be accessed transparently by multiple virtual machines.
In this series, I will introduce you to the details of TSCS and review the benefits it delivers in a number of real-world use case scenarios.
A Bit of Background
TSCS is not a new NetApp technology; rather, it is a term my team finds very effective for communicating the performance gains provided by the dedupe awareness delivered in our storage platforms and Performance Acceleration Modules.
In the past we referred to these capabilities as ‘Intelligent Caching.’ In hindsight this was a poor choice of words. ‘IC’ lacked clarity around the benefits customers receive from NetApp arrays. Considering the term further, it is interesting to note that every storage controller implements some form of advanced caching algorithm in an attempt to increase overall performance, and as such could be labeled as providing intelligent caching.
Why Transparent Storage Cache Sharing is Unique
TSCS allows a block of cached data to be accessed by multiple external requests for this or a subset of said data. These references can take many forms, ranging from VM images and user data sets to high performance enterprise applications.
The enabling components of TSCS are the abilities within Data ONTAP to deduplicate storage objects (files, LUNs, volumes) and to create zero-cost clones of those same storage objects. These storage savings technologies get ‘parroted’ quite regularly by some of the vendors offering traditional storage array platforms. For the sake of this discussion I’d like to defer any comparisons around storage savings technologies to a future post where we can give them the attention required to discuss them in greater detail.
Suffice it to say, even if a traditional storage array platform provides storage savings capabilities, none of these arrays offer Transparent Storage Cache Sharing (aka a dedupe-aware storage cache). As such, these legacy architectures do not provide a means to reduce the IOP requirements of the underlying disks.
You need to be very careful when enabling storage savings technologies on traditional array architectures. Doing so introduces the possibility of many real-world scenarios where the underlying disks cannot meet the IOP requirements.
This is the reason why TSCS is very different from these legacy architectures. By providing multiple references to sub-file objects in cache, TSCS delivers the following benefits:
- Cache inefficiencies are eliminated, resulting in an increase in cache capacity for servicing additional workloads
- Objects remain in cache for a longer duration as these blocks are referenced by more than one entity (such as a VM, user, or app). The longer an item remains in cache, the more IO load is removed from the disk drives.
- The initial reference of an object by an entity effectively results in pre-loading the storage cache for subsequent requests made by other entities. In other words, the more references the greater the performance.
These benefits are easier to understand when viewed in context. As such we will review the impact of TSCS in all of these use cases as we progress through this series.
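The three benefits listed above can be sketched with a toy LRU read cache. This is a hedged illustration only and says nothing about how Data ONTAP is actually built: the `BlockCache` class, the `run` helper, and the workload (four VMs reading the same six blocks twice through an eight-entry cache) are assumptions chosen to make the effect visible. The only difference between the two runs is the cache key: a shared physical block id when the cache is dedupe aware, or a per-VM key when it is not.

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU read cache (hypothetical model, not Data ONTAP internals)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.disk_reads = 0

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)     # each hit refreshes recency
            self.hits += 1
            return
        self.disk_reads += 1                  # miss: the block comes from disk
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


def run(dedupe_aware, vms=4, shared_blocks=6, capacity=8):
    """Return (disk_reads, cache_hits) for two passes over identical data."""
    cache = BlockCache(capacity)
    for _ in range(2):
        for vm in range(vms):
            for block in range(shared_blocks):
                # Dedupe aware: identical blocks share one physical key.
                # Otherwise: each VM's copy occupies its own cache entry.
                key = ("phys", block) if dedupe_aware else (vm, block)
                cache.read(key)
    return cache.disk_reads, cache.hits


print("dedupe-aware cache:", run(True))
print("per-VM cache keys: ", run(False))
```

With these assumed numbers, the dedupe-aware run goes to disk 6 times for 48 requests, because the first VM's reads pre-load the cache for everyone else and the shared blocks never age out. The per-VM run goes to disk for all 48 requests, because 24 distinct entries cycle through an 8-entry cache.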
Traditional Use Case #1 – Serving Virtual Machines
When hosting a virtual infrastructure, the binaries accessed by each VM are treated as separate I/O requests from the perspective of the ESX/ESXi host, storage network, array cache and array disk, even if the data being requested is identical.
In this oversimplified image the upper half represents an ESX/ESXi cluster; the lower half represents the storage array, including cache and disk. I’ve attempted to clarify the point of isolated I/O by identifying and numbering each I/O request.
As you can see, even though the I/O requests are mainly identical, there is no logic available on the storage platform to understand the request, optimize the I/O transfer, and conserve array resources.
TSCS Use Case #1 – Serving Virtual Machines
As stated above, hosting a virtual infrastructure requires the binaries accessed by each VM to be considered a separate I/O request from the perspective of the ESX/ESXi host and storage network; however, with TSCS the storage array cache recognizes when an I/O request is for an identical storage block, and in these cases the NetApp array can serve the redundant requests from the original data residing in cache.
In this image I’ve attempted to demonstrate the reduction in the number of I/O requests made to the disk subsystem. Eliminating the load placed on the disk drives by redundant data optimizes the I/O process, reduces the number of I/O requests served by disks, and increases the number of IOPs served per physical disk drive.
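The difference between the two use cases can also be traced request by request. The sketch below is an assumption-laden illustration rather than a description of the array: the `serve_requests` function, the count of eight VMs, the `dedupe_map` dictionary standing in for deduplication metadata, and the block id `pbn-100` are all hypothetical. The cache is unbounded here so that only the sharing effect shows up.

```python
def serve_requests(requests, block_map=None):
    """Trace where each read is served from.

    requests  : iterable of (vm, logical_block) tuples
    block_map : optional dict mapping (vm, logical_block) to a physical
                block id, standing in for dedupe metadata. When None,
                every logical block is treated as its own physical block.
    Returns a list of (vm, logical_block, source) tuples.
    """
    cache = set()   # unbounded cache, to isolate the sharing effect
    trace = []
    for vm, lbn in requests:
        key = block_map[(vm, lbn)] if block_map is not None else (vm, lbn)
        source = "cache" if key in cache else "disk"
        cache.add(key)
        trace.append((vm, lbn, source))
    return trace


vms = [f"vm{i}" for i in range(1, 9)]   # eight VMs, an assumed count
requests = [(vm, 0) for vm in vms]      # each VM reads its own block 0

# Hypothetical dedupe metadata: every VM's block 0 is one physical block.
dedupe_map = {(vm, 0): "pbn-100" for vm in vms}

traditional = serve_requests(requests)
shared = serve_requests(requests, dedupe_map)

disk_reads_traditional = sum(src == "disk" for _, _, src in traditional)
disk_reads_shared = sum(src == "disk" for _, _, src in shared)
print(disk_reads_traditional, disk_reads_shared)
```

Under these assumptions the traditional path sends all eight requests to disk, while the shared path sends only the first; the other seven are answered from the block the first VM already pulled into cache.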
I believe scaling a resource beyond its physical limits is a key enabler of virtualization. What about you, do you share my view?
Wait a Minute, this Technology Sounds Familiar
Some will suggest there are similarities between the NetApp TSCS architecture and what is available with VMware’s Linked Clone technology. At a high level these observations are accurate, while also being very misleading. Linked Clones are an incredible enabling technology which delivers a tremendous amount of value to traditional storage arrays.
For purposes of this discussion, one needs to understand that VMs deployed as Linked Clones are not available for use with applications which require a persistent operating environment like virtual servers.
There are no limits with NetApp’s TSCS architecture. We can serve virtual servers, virtual desktops, or any data set without restrictions. For more information on how we deliver this ‘storage magic’ check out part 2 in this series where we will discuss TSCS with User Data, Microsoft Exchange Server, and Oracle Database Server.
Trey Layton says
Nice post Vaughn
Vaughn quick question regarding the actual Caching algorithms you use. Within your customer base do you see a frequency of Read Cache being maxed out? All these technologies sound really cool but from what I understand the algorithm would have to be predictive enough to take advantage of 1) the larger Cache size and 2) the DeDuped footprint in cache which theoretically would allow you to store even more data in Read cache. Is there a new caching algorithm in a new version of ONTAP? Or was the algorithm already very good at predictive read cache, so much so that when diagnosing performance problems in the past you witnessed cache being 75-100% utilized, and thus all these technologies would alleviate that? Point being that the SW needs to be good enough at pulling the right blocks into cache before being able to take advantage of larger DeDuped Cache sizes.
Aaron Chaisson says
Vaughn could you break this down a bit more? I’m not saying that you don’t have some secret sauce, but the way you described this is not clearly differentiated from how read cache works in general. Also, your pictures, though interesting at first glance, don’t explain what the array is doing differently as compared to the linked clone use case. In the case of VMware Linked clones, VMs 2-8 would point to VM1 which would point to a single cache instance and a single storage location, even on such “traditional” arrays as you refer to. Whether the snapshot/linkedclone/snapclone whatever you want to call it happens at the array level or the VMware level, the resulting impact on the back end drives, the array cache and the effective cache hit rate would be relatively the same given an equivalent amount of read cache. Read cache performance is all about fall through rates (how long a block stays in cache before it is timed out) and the likelihood of any given IO requesting a block that is still in cache. Obviously the longer the fall through rate the more likely that you will have a higher hit rate. The way to improve fall through rate is by either increasing locality of reference and reusing common blocks (helped by using linked clones or array replicas) or by blindly increasing read cache to basically just hold more stuff before it falls out of cache … not a bad strategy, but cache isn’t cheap.
If you are doing something special, cool, but I’m still not clear as to what that is and honestly … I would like to know.
Mike Slisinger says
Aaron, the key difference here is that TSCS provides this shared block caching to any deduplicated dataset. So while VMware linked clones have a rather specific use case, TSCS will work with any VM’s including permanently provisioned virtual servers.
Of course, this does not even take to account different types of datasets but I don’t want to steal Vaughn’s thunder…
Jonas Irwin says
I don’t see this as unique, unless I’ve missed something. This seems more like a post about why deduplication helps minimize cache overhead for read centric workloads. More specifically, for reads of the same blocks from many hosts, at the same time. There are lots of other great implementations of dedupe in the market that behave in an even more sophisticated manner... specifically, at a variable length, sub-block level.
Great post! It seems as if cache is where it is at for the near future. Having these large layers of cache (ie. PAM) enables completely new use cases for Storage. If cache like this is available in the VSeries (which I should know but can’t recall) basically that can pop into any storage environment as a transparent storage accelerator. With that said do you feel like the current cache strategy is a stop gap between complete SSD storage at higher capacities? (Looking years out not months) Yes, I am asking for a look into the crystal ball. Thanks for the post!
John Martin says
@Jonas – “There are lots of other great implementations of dedupe …”
Lots? I can think of about two other dedup implementations that deserve the epithet of “great”, and neither of these is suitable for primary workloads, certainly not mission critical ones. As far as I’m aware, most of the variable block dedup caching/readahead algorithms are optimised for single threaded “sequential” reads of logically contiguous (though physically discontiguous) datasets, not “reads of the same blocks from many hosts”.
@RDANIE01 “the algorithm would have to be predictive enough …”
This is where readsets come in really handy, Alex McDonald has blogged about them in the past.
Getting the most out of cache involves a collection of technologies working together, including readsets, deduplication, and write optimised data layouts. It’s the “little details” that end up making a big difference.
Consulting Systems Engineer
Vaughn Stewart says
@John – thanks for taking on these questions. I owe you one.
@Aaron – check out part 2 in the series. After you do, shoot me your questions.
@Jonas – yes cache helps reads, and as one sees with larger arrays, the more cache the better the performance… and this is what’s great about TSCS. We can provide more usable cache capacity beyond the physical capacity and as such provide greater performance. TSCS reduces disk requests for objects with blocks in common with another object, whether stored in the same or separate LUNs and VMDKs.
Some called it ‘magic cache’ 😉
@Creedom2020 – yes PAM and TSCS are available for 3rd party arrays via the vSeries.