Deduplication Guarantee from NetApp – Fact or Fiction?


I stumbled across these blog posts this morning.

NetApp’s ‘Shining’ Moment – its Capacity Guarantee Program &

NetApp’s ‘Shining’ Moment – its Capacity Guarantee Program follow up

These blogs attempt to discuss the details of the NetApp 50% Virtualization Guarantee Program. This program is one of the many means by which NetApp is helping customers reduce their storage expenditures and carbon footprint by reducing the amount of storage required to run their virtual infrastructures.

The blog’s author, Jim Haberkorn, made the point that the guarantee program includes a requirement that compares NetApp RAID-DP against traditional storage systems running RAID 10.

I believe the point Jim is trying to make in his posts is, ‘RAID-DP requires roughly half the physical disks of RAID 10, so one should conclude that the NetApp data deduplication technology works poorly and thus the guarantee program is worthless.’

Let’s remove RAID 10 and investigate how well dedupe works

Let us take RAID 10 out of the discussion and review customer experiences with VMware running on dedupe in their production environments. A one-minute search on Google returns the following examples, which I would ask you to weigh when considering whether NetApp can reduce your storage footprint.

Example 1: Duke Institute for Genome Sciences: 83% storage savings with VMware

Example 2: Virginia Credit Union: 78% storage savings with VMware

Example 3: Burt’s Bees: 50% storage savings with VMware

Examples 4, 5, 6 & 7: Multiple customers replying to Scott Lowe’s blog post: 64%, 83%, 40+%, and 70% storage savings respectively with VMware, as reported in the replies section of Scott’s blog.

The results are in

Customers are commonly saving 50% – 80% of their production storage requirements for VMware with NetApp deduplication. These savings were calculated based on the reduction of storage on the NetApp array. There is no comparison with RAID 10 in these calculations, and the best part is that all of these arrays are protected from double disk failures with RAID-DP.

The world used to run on RAID 5

I commend Jim for reading the fine print of the NetApp storage savings guarantee program. If your Virtual Infrastructure is running on RAID 5 and you believe you have the appropriate level of data protection, then of course this guarantee looks outrageous.

RAID 5 was designed to be the lowest-cost form of data redundancy for individual servers. This low level of data protection works well if the data being protected belongs to a single server; however, VMware builds multi-node clusters and clouds of hundreds of virtual machines.

Personally I would never deploy more than a single virtual machine on a form of RAID that did not provide protection from a double disk failure. The risk is just not worth the cost savings, but I digress…

Back to the guarantee

So we began by discussing that the NetApp guarantee is based around RAID 10 and that Jim felt this requirement fell into the category of ‘a bait and switch tactic’. I’ll give Jim that NetApp is being very conservative with their guarantee program. NetApp has always been a conservative company, and that makes an enormous amount of sense from a vendor who provides data storage and protection solutions.

I’d ask you, the reader, to consider the examples that I have listed above, examples whose savings are calculated without any consideration of RAID 10, and decide for yourself. Who is more believable: customers using dedupe, or NetApp competitors who don’t offer dedupe?

Maybe Jim or others who have replied to his blog can state how HP is working to reduce their customers’ storage footprint. While you may not like a conservative guarantee, where’s the guarantee from the rest of the storage industry?

Vaughn Stewart
http://twitter.com/vStewed
Vaughn is a VP of Systems Engineering at VAST Data. He helps organizations capitalize on what’s possible from VAST’s Universal Storage in a multitude of environments including A.I. & deep learning, data analytics, animation & VFX, media & broadcast, health & life sciences, data protection, etc. He spent 23 years in various leadership roles at Pure Storage and NetApp, and has been awarded a U.S. patent. Vaughn strives to simplify the technically complex and advocates thinking outside the box. You can find his perspective online at vaughnstewart.com and in print; he’s coauthored multiple books including “Virtualization Changes Everything: Storage Strategies for VMware vSphere & Cloud Computing”.


11 Comments

  1. We have 24 VMs on NFS storage, and none of them have been properly aligned, yet we are getting 55% dedupe on it. We would have needed to buy another shelf in order to hold all those VMs, and therefore are very happy with the results. We would get significantly more dedupe, but we keep quite a few snapshots in history and our change rate is not minimal. Props to NetApp for a solid dedupe product for primary storage.

  2. Why don’t you offer the guarantee on competitors’ RAID 5 + one disk, then? Obviously RAID 10 is going to offer far superior performance to RAID-DP, so it’s not a fair comparison.

  3. @James, we do offer the guarantee for competitors’ RAID 5+1; it’s called the V-Series 35% guarantee.
    The reason we chose to compare vs RAID-10 is that we have to compare against a configuration with similar performance and reliability characteristics (i.e. we’re not going to compare vs a bunch of 2TB SATA disks with no RAID protection).
    You’re right, comparing vs RAID-10 isn’t really fair, but not for the reasons you state.
    First, for random I/O workloads, especially write-intensive ones, RAID-DP consistently outperforms RAID-10 on a spindle-for-spindle basis on the same class of storage controller. This has been shown over and over again in many published benchmarks like SPC-1, SPEC-SFS, and Exchange MSRP (not really a benchmark, but still useful for this purpose). If you’re really interested, post a reply in these comments, and I’ll update this with all the appropriate links.
    Second, for typical medium to large VMware environments (40+ spindles), RAID-DP provides 5×9’s of availability all the way down to the LUN. RAID-10 barely manages 3×9’s (there’s an IBM whitepaper on this; again, I can post the link if you’re really interested).
    So, while RAID-10 doesn’t perform as well, isn’t as reliable, and has much worse utilisation than RAID-DP, it’s the closest thing the rest of the industry has. As unfair as it is to us to compare RAID-DP to vastly inferior RAID-10 technology, it’s still the best comparison there is.
    Regards
    John Martin

  4. Hmmm. I thought A-SiS is the best feature ever created on a filer. $1.5B was just for the heck of it? At least you now have great, purpose-built de-dupe software. Happy spinning.

  5. Hi
    I’m sorry I’m late with my posting, but I didn’t even know about this blog until Alex, a co-worker of yours, pointed me to it on his blog. BTW, I’m the ‘Jim’ you mention in your blog.
    Your blog response to my blog was professional and clear, but it missed the point of why I was criticizing the NetApp 50% capacity guarantee program. Now, it is a common debating tactic to re-state your opponent’s position in a slightly different way, thereby shifting it to an issue you feel comfortable defending. No hard feelings about that, but that’s essentially what you did. You turned my attack on the NetApp marketing program into an attack on the NetApp dedupe technology. And then you proceeded to defend the dedupe technology.
    Here is my criticism of the NetApp 50% dedupe capacity guarantee program:
    1) It compares the competitor’s RAID-1 against NetApp’s RAID-DP. If you do the math, (14+2 RAID stripe vs. RAID-1) that means that of the 50% guarantee 43% is achieved due to forcing a comparison vs. RAID-1. Just to make it clear: What I’m saying here is that of the capacity advantage implied by NetApp to be the result of dedupe, 86% is achieved simply due to RAID levels without any dedupe.
    2) NetApp excludes competitors’ RAID-5 from the comparison because it is not robust enough.
    3) It also excludes competitors’ RAID-DP because it’s not fast enough. Are you getting the picture here?
    4) It requires that 90% of the deduped data be extremely dedupe friendly. Here is a cut and paste from the NetApp program document: “There must be no more than 10% of the following data types under the program: images and graphics, XML, database data, Microsoft Exchange data, and encrypted data. Excludes workloads with high-performance requirements that require spindles; to be determined by SE/PS during sizing.” That doesn’t leave much, does it?
    5) Bottom line: Strip away the self-serving RAID-1 requirement, and NetApp is only willing to guarantee customers a 7% capacity advantage over non-deduped arrays even while deduping a blatantly dedupe-friendly data set.
    6) Finally, even with all these caveats, NetApp further ensures it will never lose money on this program by requiring customers to purchase extra professional services before they can qualify.
    Now, if you dispute my facts about the program, please speak up. The program details can be found here: http://media.netapp.com/documents/wp-7053.pdf
    The issue here is not whether NetApp’s dedupe works or not. The issue is: Why would any company come out with a program like this?
    Best regards,
    Jim
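
Jim’s arithmetic in points 1 and 5 can be checked with a short sketch. It assumes, as his comment does, a 14+2 RAID-DP group compared against RAID-1 mirroring, and it assumes the 50% figure is measured as raw capacity relative to the RAID-1 baseline. The function and variable names are illustrative and do not come from the NetApp program document.

```python
def raw_per_usable(data_disks: int, redundancy_disks: int) -> float:
    """Raw capacity needed per unit of usable capacity in one RAID group."""
    return (data_disks + redundancy_disks) / data_disks


# RAID-1 mirrors each data disk; 14+2 is the RAID-DP group size quoted in the comment.
raid1 = raw_per_usable(1, 1)      # 2.0x raw per usable
raid_dp = raw_per_usable(14, 2)   # about 1.14x raw per usable

guarantee = 0.50                              # headline saving vs the RAID-1 baseline
raid_only_saving = 1 - raid_dp / raid1        # saving from the RAID level alone
share_from_raid = raid_only_saving / guarantee
residual_points = guarantee - raid_only_saving

# Assumption of this sketch (not stated in the comment): the same gap expressed
# as a fraction of the RAID-DP footprint, i.e. what dedupe itself must remove.
dedupe_needed = residual_points / (raid_dp / raid1)

print(f"saving from RAID level alone: {raid_only_saving:.0%}")
print(f"share of the 50% guarantee:   {share_from_raid:.0%}")
print(f"left for dedupe (points):     {residual_points:.0%}")
print(f"dedupe needed on RAID-DP:     {dedupe_needed:.1%}")
```

Under those assumptions the first three values come out at 43%, 86%, and 7%, matching the comment. The fourth value, 12.5%, is not from the comment: it restates the same 7-point gap as a fraction of the RAID-DP footprint.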

  6. Jim,
    You are correct, I did change the discussion point to one which is of greater concern to customers and prospects.
    Run VMware, Hyper-V, or Xen Server on NTAP dedupe and your storage footprint will be greatly reduced.
    Don’t take my word for it. Like you, I work for a vendor, so instead Google it! Google VMware + NetApp + dedupe and only read the comments posted to blogs on the subject.
    Now, do we still want to break down the details of a marketing campaign? I believe you are guilty of promoting against a technology which isn’t available with LeftHand solutions.

  7. Yes, NetApp’s 50% Virtualization Guarantee really works. Here’s the output from our VMware implementation:
    Filesystem             used     saved    %saved
    /vol/vmstorexxxxx0/    354GB    344GB    49%
    /vol/vmstorexxxxx1/    498GB    174GB    26%
    /vol/vmstorexxxxx2/    128GB    246GB    66%
    /vol/vmstorexxxxx3/    183GB    215GB    54%
    /vol/vmstorexxxxx4/    92GB     166GB    64%
    /vol/vmstorexxxxx5/    47GB     40GB     46%
    /vol/vmstorexxxxx6/    49GB     118GB    71%
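
The %saved column in the output above is consistent with dividing the space saved by the space the data would occupy without dedupe (used + saved). That formula is an assumption inferred from the figures, not something the commenter states. The sketch below recomputes each row, plus an overall figure, from the values as posted.

```python
# (used_gb, saved_gb) per volume, copied from the comment's output.
volumes = {
    "/vol/vmstorexxxxx0/": (354, 344),
    "/vol/vmstorexxxxx1/": (498, 174),
    "/vol/vmstorexxxxx2/": (128, 246),
    "/vol/vmstorexxxxx3/": (183, 215),
    "/vol/vmstorexxxxx4/": (92, 166),
    "/vol/vmstorexxxxx5/": (47, 40),
    "/vol/vmstorexxxxx6/": (49, 118),
}


def percent_saved(used_gb: float, saved_gb: float) -> float:
    """Assumed formula: saved space as a share of the pre-dedupe size."""
    return 100 * saved_gb / (used_gb + saved_gb)


for name, (used, saved) in volumes.items():
    print(f"{name:<22}{percent_saved(used, saved):.0f}%")

# Capacity-weighted figure across all seven volumes.
total_used = sum(used for used, _ in volumes.values())
total_saved = sum(saved for _, saved in volumes.values())
overall = percent_saved(total_used, total_saved)
print(f"{'all volumes':<22}{overall:.0f}%")
```

Each recomputed row rounds to the percentage shown in the comment. Weighted by capacity, the seven volumes together come out at about 49% saved, since the largest volume has the lowest ratio.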
