Last week I published the post ‘Data Compression, Deduplication, & Single Instance Storage’ to raise awareness of these different types of storage savings technologies and to provide some guidance as to where each is best suited to be deployed.
I’d like to thank all who chimed in to share their thoughts and knowledge in the comments section of that post. I believe we can chalk up the discussion as a win for the readers.
It appears the last post led a few to revisit data deduplication on their NetApp arrays, as I received a number of emails and direct messages in which the sender was (pleasantly) surprised to discover that Data ONTAP 7.3.3 adds data compression to the list of integrated storage savings technologies. (The storage features available on a system can be viewed via the license command when connected to the FAS console.)
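As a quick illustration, here is roughly what checking for the deduplication (A-SIS) license looks like from the console. This is a sketch with an abbreviated, hypothetical hostname and output; the exact feature names and codes listed vary by controller and release:

```
fas1> license
                 a_sis XXXXXXX    <- deduplication (A-SIS) is licensed
                  cifs XXXXXXX
                   nfs XXXXXXX
            ...
```

If a feature appears in this list, it is licensed and available to enable on the system.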
Like ‘dedupe’, ‘compression’ will be supported with any NetApp controller, any data set, and any storage protocol, including SMB (CIFS), NFS, FTP, HTTP, FC, FCoE, and iSCSI. Beyond this flexibility, compression can be combined with data deduplication for both SAN and NAS data sets.
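To give a feel for how combining the two looks in practice, here is a hypothetical console sketch based on the sis command family used in 7-Mode releases; the volume name is made up, and the exact syntax for toggling compression may differ in the 8.0.1 release, so treat this as illustrative only:

```
fas1> sis on /vol/projects              # enable deduplication on the volume
fas1> sis config -C true /vol/projects  # also enable compression (later 7-Mode syntax; may differ in 8.0.1)
fas1> sis start -s /vol/projects        # process the existing data in the volume
```

The point is simply that both savings technologies operate on the same volume, whether that volume backs NAS shares or SAN LUNs.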
NetApp is leading storage innovation. No other storage vendor can provide this level of storage savings.
Availability of Data Compression
One of the questions many have asked is, ‘Why hasn’t NetApp announced data compression in ONTAP?’ This is a great question, which I felt was easier to answer with the community than to reply to individually.
The official release targeted to support data compression is the 8.0.1 release of Data ONTAP. I have spoken with both the Product Manager for compression and NetApp’s own Dr. Dedupe (aka Larry Freeman), and both confirmed that there is no active pre-release program available for those interested in data compression with 7.3.3. Many NetApp customers are familiar with our pre-release program, known as PVR.
One item I can share: when Data ONTAP 8.0.1 ships, data compression will be a no-cost (free) software license, just as data deduplication is today.
For Those Planning To Enable Data Compression
For those of you considering enabling data compression with VMware, Hyper-V, KVM, etc., I’d suggest you hold off on your plans until after you have validated the technology on unstructured and archival data sets.
Some of you may be asking, ‘Why such a conservative recommendation?’
To be frank, we didn’t target the use of data compression at virtual machines, as our customers are already receiving tremendous savings from data deduplication along with performance gains via Flash Cache (formerly PAM) and its inherent TSCS capabilities. Customers commonly realize savings of 50% to 70% (and sometimes greater) in virtual server environments, and around 95% in virtual desktop deployments.
Because of the success we have had with VMs, we directed our engineering efforts toward data sets where we wanted to deliver greater savings. For this reason, I would suggest enabling data compression in areas like home directories, engineering data sets, data archives, external BLOB storage with SharePoint Server, etc.
My guidance to you would be to ‘dedupe’ every data set today. Doing so reduces primary and secondary storage requirements, along with the bandwidth needed for disk-to-disk replication. There’s literally no reason not to enable this capability. With the release of 8.0.1, data compression will be generally available and will provide additional storage savings with your unstructured data sets.
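For anyone who hasn’t enabled dedupe yet, the workflow on a 7-Mode console is short. A sketch with a hypothetical volume name:

```
fas1> sis on /vol/home_dirs         # enable deduplication on the volume
fas1> sis start -s /vol/home_dirs   # -s scans existing data, not just new writes
fas1> sis status /vol/home_dirs     # check the progress of the scan
fas1> df -s home_dirs               # report space saved by deduplication
```

After the initial scan completes, deduplication runs on a schedule against newly written data, and df -s gives you the ongoing savings figure worth sharing with your team.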
When we ship 8.0.1 and you begin to adopt data compression, please share your results! We love the feedback.
For those of you still uncertain about how much of the information around data deduplication, TSCS, and data compression is fact and how much is fiction, I’d suggest registering to attend VMworld 2010, as we will have many of these technologies on display, along with performance validations and technical demonstrations.