VCE-101: Deduplication: Storage Capacity and Array Cache



Welcome to the first session in a series of blog posts entitled ‘Storage 101 – Virtualization Changes Everything’.

Data deduplication was originally introduced in Data ONTAP 7.2 in July of 2006 as a technology that enhanced the NetApp Disk-to-Disk (D2D) backup offerings. Coincidentally, VMware released VI3 around the same time, in June of 2006, and shortly thereafter customers began using these two technologies together to virtualize their server and storage footprints.

For today’s post we’re going to dive into the realm of storage efficiency and discuss data deduplication and its impact on production storage capacity and storage array cache, specifically within a virtual data center.

The Format of this Series

As I stated when I decided to kick off these posts, I want to share these topics with a wide audience. To do so, each session will begin with a high-level overview of the technology and its value, followed by the technical details.



Overview

It’s no secret that one of the top two challenges with server virtualization is the cost associated with storage. Virtualizing and consolidating servers from direct-attached storage onto shared storage arrays dramatically increases the capacity required on a much more expensive form of storage media.

Eliminating this hurdle will allow customers to virtualize more of their infrastructure.

By enabling data deduplication, customers typically reduce their production storage requirements for VMware by 50% – 70%. These cost savings can literally finance further virtualization projects such as implementing DR with Site Recovery Manager or virtualizing a desktop environment.

Data deduplication has a pervasive effect within an infrastructure. Reducing the production footprint means less storage is required for backup, DR, and archival purposes. In addition, dedupe provides WAN acceleration when replicating a data set, as any block already stored on the destination won’t be resent just because a second VM contains the same blocks.

NetApp has extended this technology to also include the storage array cache. With Intelligent Caching, deduplicated data sets can actually outperform non-deduplicated data sets, because the effective size of the storage controller’s cache is multiplied many times over. There’s more on this in the technical details section below.

Data deduplication and Intelligent Caching are the cornerstones of NetApp’s storage efficiency technologies. Together they reduce your storage costs, your replication bandwidth requirements, and your DR and backup archive media requirements, all while increasing array performance. Please don’t take my word for it; Google it! (I’d suggest skipping the vendor sites and reading blog posts and reader comments.) Alternatively, take us up on our 50% storage savings guarantee: if we can’t deliver the savings, we’ll provide the additional storage you require at no cost.

Let’s Get into the Technical Details

NetApp storage arrays, and enterprise-class arrays virtualized by NetApp vSeries controllers, address storage quite differently from traditional (or legacy) storage array architectures. The difference is that NetApp abstracts the data being served from the physical storage, the disk drives themselves.

This abstraction begins by grouping the physical disks (and their available IOPS) into aggregates, which are composed of one or more RAID groups. An aggregate is divided into one or more Flexible Volumes (FlexVols). The FlexVol is the layer where WAFL is implemented, and it is WAFL that enables pointer-based functionality such as Snapshots; replication; LUN, volume, and file clones; and others, including dedupe.
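
For readers who think in code, here is a minimal Python sketch of that containment hierarchy. It is purely my own illustration (the class and disk names are made up), not ONTAP internals: disks group into RAID groups, RAID groups form an aggregate, and FlexVols are carved out of the aggregate, with WAFL living at the FlexVol layer.

# Illustrative model only (not ONTAP code): disks -> RAID groups -> aggregate -> FlexVols.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class RaidGroup:
    disks: List[str]                      # e.g. ["0a.16", "0a.17", ...]

@dataclass
class Aggregate:
    name: str
    raid_groups: List[RaidGroup]
    flexvols: List["FlexVol"] = field(default_factory=list)

    def create_flexvol(self, name: str) -> "FlexVol":
        vol = FlexVol(name=name, aggregate=self)
        self.flexvols.append(vol)
        return vol

@dataclass
class FlexVol:
    name: str
    aggregate: Aggregate
    # WAFL operates at this layer: files and LUNs are pointer maps to 4 KB
    # blocks, which is what enables snapshots, clones, and dedupe.
    inode_table: Dict[int, int] = field(default_factory=dict)

aggr1 = Aggregate("aggr1", raid_groups=[RaidGroup(disks=[f"0a.{i}" for i in range(16, 32)])])
datastore = aggr1.create_flexvol("vmware_datastore1")
print(aggr1.name, [vol.name for vol in aggr1.flexvols])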

I believe the best place to start is by briefly explaining a NetApp Snapshot.

With all storage arrays, data is stored in blocks; with WAFL, however, the logical blocks are abstracted from the physical blocks, which allows Data ONTAP to manipulate the access and presentation of a data object. Creating a snapshot is one of these manipulations. A snapshot preserves the state of data at a point in time by creating an inode map that is identical to the production inode table and by locking the blocks referenced by that snapshot.

When the production data is modified, the snapshot is unaffected: new data is written to free space, deleted data remains on disk because it is still part of the snapshot, and the production inode table is updated.
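
Here is a toy Python model of that pointer behavior, a simplification of my own rather than WAFL itself: taking a snapshot copies the pointer table, new writes land in free blocks, and the blocks the snapshot references are left untouched.

# Toy pointer-based snapshot model (a simplification, not actual ONTAP code).
class Volume:
    def __init__(self):
        self.blocks = {}        # physical block id -> data
        self.inode_table = {}   # logical block number -> physical block id
        self.snapshots = {}     # snapshot name -> frozen copy of the inode table
        self._next_phys = 0

    def write(self, lbn, data):
        # New and changed data always lands in free space (a new physical block);
        # only the active inode table is repointed, so blocks referenced by a
        # snapshot are never overwritten.
        phys = self._next_phys
        self._next_phys += 1
        self.blocks[phys] = data
        self.inode_table[lbn] = phys

    def snapshot(self, name):
        # A snapshot is just a frozen copy of the pointer map; no data is copied.
        self.snapshots[name] = dict(self.inode_table)

    def read(self, lbn, snapshot=None):
        table = self.snapshots[snapshot] if snapshot else self.inode_table
        return self.blocks[table[lbn]]

vol = Volume()
vol.write(0, "original data")
vol.snapshot("hourly.0")
vol.write(0, "modified data")
print(vol.read(0))                       # "modified data" (production view)
print(vol.read(0, snapshot="hourly.0"))  # "original data" (snapshot view)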

inodes.jpg

This concept is very similar to the database concept of a view.

The storage array stores two versions of the data, the current state and the previous state, but does not duplicate any of the blocks that are common between the two versions.

Dedupe is Similar to Snapshots in Reverse

Take a typical VMware environment where you have consolidated a number of VMs into a datastore. Any 4 KB block of data that is stored in more than one location, whether within a single VM or across all of the VMs in the datastore, can be reduced to a single 4 KB instance.

inode-traditional-m.jpg
Block level representation of two VMs and the content within their VMDKs

In a manner very analogous to the Snapshot model, dedupe allows multiple VMs to share the underlying storage blocks and consume only the storage required by their unique blocks.

inode-post-m.jpg
Block level representation of two VMs and the content within their VMDKs with dedupe enabled
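
If it helps to see the mechanics, here is a rough Python sketch of the idea; it is an illustration only, not how ONTAP implements dedupe. Each 4 KB block is fingerprinted, unique content is stored once, and duplicate blocks become additional references to the block already on disk.

# Toy block-level dedupe: fingerprint each 4 KB block, store unique content
# once, and count references for duplicates. Illustration only.
import hashlib

BLOCK_SIZE = 4096

class DedupedDatastore:
    def __init__(self):
        self.store = {}       # fingerprint -> block data (stored once)
        self.refcount = {}    # fingerprint -> number of logical references
        self.vmdk_maps = {}   # vmdk name -> list of fingerprints (its block map)

    def write_vmdk(self, name, data):
        fingerprints = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.store:
                self.store[fp] = block            # first copy is written to disk
            self.refcount[fp] = self.refcount.get(fp, 0) + 1
            fingerprints.append(fp)               # duplicates become pointers
        self.vmdk_maps[name] = fingerprints

    def logical_bytes(self):
        return sum(len(fps) * BLOCK_SIZE for fps in self.vmdk_maps.values())

    def physical_bytes(self):
        return len(self.store) * BLOCK_SIZE

ds = DedupedDatastore()
guest_os = b"0123456789abcdef" * 2560        # 40 KB stand-in for shared guest OS content
ds.write_vmdk("vm1.vmdk", guest_os + b"unique data for VM1".ljust(BLOCK_SIZE, b"\x00"))
ds.write_vmdk("vm2.vmdk", guest_os + b"unique data for VM2".ljust(BLOCK_SIZE, b"\x00"))
print("logical :", ds.logical_bytes(), "bytes")
print("physical:", ds.physical_bytes(), "bytes")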

Here are some examples of where redundant data exists within your VMware environment:

Redundancy within a VM:

• In Windows, DLLs are stored in multiple locations, including their operational location, the windows\system32\dllcache folder, any hotfix or service pack uninstall folders, and redundantly within various application folders.

Redundancy within a datastore:

• There are multiple VMs running the same OS, hotfixes, service packs, and enterprise management applications such as antivirus and SNMP monitoring tools.

• There can be multiple instances of the same application running on multiple VMs.

• It is common to have different applications, running on different VMs, that were written in the same programming language, such as C++, and that call the same DLLs in their operation.

• Within every VMDK there is free (allocated but unused) space, and all of these 4 KB NTFS and EXT3 free blocks are identical.

• Every VM contains data that has been deleted but still resides in the guest OS file system and VMDK (this is the space where undelete tools operate).

Block level versus File level deduplication

I just want to point out that block-level dedupe is significantly different from file-level dedupe. As I stated in the previous section, block-level means any 4 KB component within an object can be deduplicated. With file-level dedupe, savings are only available when identical files are stored in more than one location. Given this, file-level deduplication is effectively useless with VMware or any virtualization technology, as an active VM is never identical to another active VM.
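
A quick sketch makes the difference concrete (illustrative Python, not any vendor’s implementation): two VMDKs that share most of their content but differ by even a single block gain nothing from file-level dedupe, while block-level dedupe collapses everything except the unique blocks.

# Contrast file-level and block-level dedupe on two nearly identical files.
import hashlib

BLOCK_SIZE = 4096

def file_level_unique(files):
    # Whole-file fingerprints: only byte-for-byte identical files dedupe.
    return len({hashlib.sha256(data).hexdigest() for data in files})

def block_level_unique(files):
    # Per-4 KB fingerprints: any duplicate block dedupes, even across files
    # that are not identical as a whole.
    unique = set()
    for data in files:
        for i in range(0, len(data), BLOCK_SIZE):
            unique.add(hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest())
    return len(unique)

shared = bytes(range(256)) * 160          # 40 KB of content common to both VMs
vm_a = shared + b"A" * BLOCK_SIZE         # each VM adds one unique block
vm_b = shared + b"B" * BLOCK_SIZE

print("file-level :", file_level_unique([vm_a, vm_b]), "files stored (no savings)")
print("block-level:", block_level_unique([vm_a, vm_b]), "blocks stored, versus 22 logical blocks")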

Intermission

Still with me? I warned you these posts were going to contain a fair amount of technical content.

Can You Really Reduce the Number of Disk Drives?

A physical disk drive delivers two resources: capacity and I/O performance. The concept of reducing the number of spindles (or disks) required to serve a data set should raise the following question…

“If I reduce the number of disks in a data set, don’t I reduce the maximum IOPS available to that data set?”

The answer is yes: removing disk drives will negatively impact your performance, unless your storage cache is also dedupe aware.

I believe these three images summarize the evolution of storage arrays and the implementation of dedupe.

trad-storage.jpg
VMware on Traditional Legacy Storage Arrays

dedupe.jpg
VMware on NetApp Dedupe Storage

dedupe+IC.jpg
VMware on NetApp Dedupe Storage with Intelligent Cache
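
To see why a dedupe-aware cache changes the math, here is a toy Python model of the concept; it is my own illustration, not the ONTAP implementation. Because the cache is keyed by physical block, logical reads from many VMs that dedupe to the same physical block are all served from a single cached copy, and only the first read ever touches disk.

# Toy dedupe-aware cache: entries are keyed by physical block, so logical
# blocks from many VMs that dedupe to one physical block share one entry.
class DedupeAwareCache:
    def __init__(self):
        self.cache = {}          # physical block id -> data
        self.disk_reads = 0
        self.cache_hits = 0

    def read(self, physical_block, disk):
        if physical_block in self.cache:
            self.cache_hits += 1
        else:
            self.disk_reads += 1                  # only the first reader goes to disk
            self.cache[physical_block] = disk[physical_block]
        return self.cache[physical_block]

# 100 VMs whose guest OS blocks all dedupe down to the same 10 physical blocks.
disk = {pb: f"os block {pb}" for pb in range(10)}
cache = DedupeAwareCache()
for vm in range(100):
    for pb in range(10):                          # each VM reads its OS blocks
        cache.read(pb, disk)

total = cache.cache_hits + cache.disk_reads
print(f"disk reads: {cache.disk_reads}, cache hit ratio: {cache.cache_hits / total:.1%}")
# prints: disk reads: 10, cache hit ratio: 99.0%

A cache keyed only by logical address would need a separate copy of those same blocks for every VM to achieve a comparable hit ratio.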

When we plan to deploy I/O-intensive applications, like Oracle, Exchange, or VDI, we must complete I/O sizing exercises to ensure the storage design can meet the performance requirements.

Consider the points of concern raised in this VDI sizing example (a rough calculation is sketched after the list):

• Assume each VDI user needs 5 IOPS (5 to 25 IOPS per user is a normal range)

• How many spindles needed for 1,000 users?

• How many spindles needed if you use NetApp dedupe and achieve 100:1 storage capacity savings?

• What percentage of IOPS can be served by the array cache?

• What IOPS do I need for I/O storms such as boot, login, offline sync, and antivirus scans?
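
Here is that back-of-the-envelope math sketched in Python. The per-disk IOPS figure and the cache hit ratio are assumptions for illustration only, not sizing guidance; real sizing must also account for disk type, RAID penalties, read/write mix, and the I/O storms listed above.

# Rough VDI spindle math; the per-disk IOPS and cache hit ratio below are
# illustrative assumptions, not sizing guidance.
import math

users = 1000
iops_per_user = 5                 # steady-state figure from the example above
iops_per_disk = 180               # assumed rough figure for a single 15K RPM drive

required_iops = users * iops_per_user
spindles = math.ceil(required_iops / iops_per_disk)
print(f"{required_iops} IOPS requires roughly {spindles} spindles for performance alone")

# Dedupe shrinks the capacity requirement dramatically, but if the cache is not
# dedupe aware the IOPS requirement does not shrink with it; that gap is what a
# dedupe-aware (Intelligent) cache closes.
cache_hit_ratio = 0.98            # in line with the array output shown below
disk_iops = required_iops * (1 - cache_hit_ratio)
print(f"with a {cache_hit_ratio:.0%} cache hit ratio, only ~{disk_iops:.0f} IOPS reach the disks")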

The data below is the sysstat output from a mid-tier NetApp FAS3000 series controller simultaneously running 1,024 deduplicated virtual desktops with Intelligent (dedupe-aware) Cache. During this test the array averaged approximately 200 MB/s of I/O output with roughly 5 MB/s of disk reads.

sysstat.jpg

The data below is the ‘stats show’ output from the same array under the same I/O load. I’d highlight that the controller consistently achieves a 98% – 99% cache hit ratio. If you run VMware on a traditional legacy storage architecture, I’d suggest you ask your storage admin for your cache hit ratio. I think you might be very surprised at the cache hit ratios you are not achieving.

show-stats.jpg

This output clearly shows how Intelligent Cache can hold a single copy of the data (OS, application binaries, etc.) required to run tens or hundreds of VMs.
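
As a rough sanity check on the throughput figures quoted above (an approximation, since the quoted numbers are averages):

# If the array serves ~200 MB/s while reading only ~5 MB/s from disk, the
# remainder is coming from cache, consistent with the 98% - 99% hit ratio.
served_mb_s = 200
disk_read_mb_s = 5
print(f"~{(served_mb_s - disk_read_mb_s) / served_mb_s:.1%} of the data served came from cache")
# prints: ~97.5% of the data served came from cache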

Transparent Technology

Dedupe is an integrated component of Data ONTAP available with NetApp FAS, vSeries, and IBM N series arrays. It works with data served over any storage protocol (FC, FCoE, iSCSI, NFS, CIFS, etc.), and unlike some backup-based dedupe technologies, it does not require capacity-based management servers or external tools.

Dedupe is transparent, meaning its use is unknown to the application set or end user. It can be enabled on a dataset-by-dataset basis, and it is not a one-way operation: dedupe can be disabled on any dataset at any time, provided you have enough storage capacity to write all of the data back out to its logical size.

A Storage Hypervisor?

All storage vendors sell their arrays in tiers, and these tiers are typically defined by performance and addressable (physical) storage capacity. The storage virtualization provided by data dedupe and Intelligent Caching is breaking this selling model, as these technologies allow the capacity of the data being served to exceed the physical storage capacity limits of the storage controllers.

In addition, the amount of Intelligent Cache can be modularly increased within the mid-tier and high-end FAS array models via Performance Acceleration Modules. Today these modules are available in 16 GB increments.

I’d expect to see many more advancements in this area in the future.

The ability to abstract the data being served from the physical limits of the disk drives that house it leaves me wondering whether Data ONTAP should be referred to as a Storage Hypervisor. I’ll ponder this thought, and any comments on the concept, for possible further discussion.

In Closing

As I stated in the overview, data dedupe and Intelligent Caching are the cornerstones of NetApp’s storage efficiency technologies; they will reduce your storage costs, bandwidth requirements, and power consumption while increasing the overall performance of the storage array.

I hope you have found the information I’ve shared informative and in support of the claims made in the overview. I look forward to your feedback.

This wraps up our first session of ‘Storage 101 – Virtualization Changes Everything’. Please check the syllabus for session updates as they are subject to change.

Vaughn Stewart
http://twitter.com/vStewed

