Many of you have been asking when NetApp & VMware would update TR-3697 (the multiprotocol performance comparison paper covering VI3) to include vSphere. It is my pleasure to share that as of this morning we have jointly released a new report, TR-3808.
The paper compares the performance of FC-, iSCSI-, and NFS-connected storage, along with the gains made in protocol optimization in vSphere as compared to VI3.
What’s Covered in the Report?
TR-3808 compares the scaling of NetApp FC-, iSCSI-, and NFS-connected shared datastores. VMware agreed to test general purpose shared datastores, as they comprise more than 80% of all datastores currently deployed.
The concept of a general purpose datastore represents the 80% in the 80:20 rule regarding storage with VMware. I tend to discuss the rule during my customer briefings and public presentations. The remaining 20% of the rule is what I refer to as High Performance Datasets. Both of these constructs were discussed in the ‘Multivendor Post to help our mutual NFS customers using VMware’, which was co-authored by Chad Sakac.
Maybe I’ll dedicate a post to the 80:20 rule in the near future…
The Test Bed
In the testing we deployed an 8-node ESX cluster connected to a FAS3170 over 1 GbE, 10 GbE, and 4 Gb FC links. We used the Iometer benchmark (which many consider the industry standard) to generate our workload, which consisted of 4K and 8K request sizes with a 75% read / 25% write mix and a 100% random access pattern. The testing was run at various loads based on the number of running virtual machines, starting with 32, then 96, and finally 160 (executing 128, 384, and 640 total outstanding I/Os respectively). Each VM ran the Iometer dynamo configured to generate four outstanding I/Os.
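For reference, the load points scale linearly with VM count. Here's a minimal sketch (the parameter names are my own; only the values come from the test bed above) showing how the totals in parentheses are derived:

```python
# Illustrative restatement of the TR-3808 workload described above.
# Parameter names are hypothetical; the values are from the test bed.
IOMETER_WORKLOAD = {
    "request_sizes_bytes": [4096, 8192],  # 4K and 8K requests
    "read_pct": 75,                       # 75% read / 25% write mix
    "random_pct": 100,                    # 100% random access pattern
    "outstanding_ios_per_vm": 4,          # each VM's Iometer dynamo runs 4 OIOs
}

for vm_count in (32, 96, 160):
    total_oio = vm_count * IOMETER_WORKLOAD["outstanding_ios_per_vm"]
    print(f"{vm_count} VMs -> {total_oio} total outstanding I/Os")
# 32 VMs -> 128, 96 VMs -> 384, 160 VMs -> 640
```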
The Performance Results
As you can see, there is very little difference in performance among the storage protocols when running VMware on NetApp. This first graph is one of many showing I/O throughput.
In these results I’d like to point out that our FC performance results tend to be much higher than what is witnessed when running FC-connected datastores on storage arrays that implement per-LUN I/O queues. This is a legacy storage array architecture holdover that is still present in most of the current array platforms within the storage industry.
For more on per-LUN I/O queues, see my post ‘vSphere Introduces the Plug-n-Play SAN’.
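To see why per-LUN queues matter at this scale, consider a back-of-the-envelope sketch. This is a toy model, not a measurement; the queue depth of 32 is a hypothetical figure chosen purely for illustration:

```python
# Toy model: how a per-LUN I/O queue caps concurrency on a shared datastore.
# per_lun_queue_depth=32 is an illustrative assumption, not a vendor spec.
def in_flight_ios(vm_count, oio_per_vm=4, per_lun_queue_depth=32):
    requested = vm_count * oio_per_vm
    # Only per_lun_queue_depth I/Os can be outstanding against the LUN at
    # once; the remainder wait in the queue, adding latency.
    return min(requested, per_lun_queue_depth), requested

for vms in (32, 96, 160):
    served, requested = in_flight_ios(vms)
    print(f"{vms} VMs: {requested} I/Os requested, {served} in flight at the LUN")
```

An array without this per-LUN bottleneck can keep all of the requested I/Os in flight, which is consistent with the FC results described above.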
Here are some results measuring CPU utilization. These are a bit scary at first because the numbers are relative. So, if a workload over FC utilized 10% of the ESX CPU, and the same workload over NFS utilized 14.5% of the ESX CPU, then the relative difference is that NFS uses 145% of the CPU required by FC.
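In other words, the relative numbers are simply a ratio. Here's the math from the example above made explicit (the 10% and 14.5% figures are the illustrative values from that example, not measured results):

```python
# Relative CPU utilization: NFS vs. FC, using the example figures above.
fc_cpu_pct = 10.0    # ESX CPU consumed by the workload over FC
nfs_cpu_pct = 14.5   # ESX CPU consumed by the same workload over NFS
relative_pct = nfs_cpu_pct / fc_cpu_pct * 100
print(f"NFS uses {relative_pct:.0f}% of the CPU required by FC")  # 145%
```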
Overall, our tests showed that vSphere delivered performance comparable to ESX 3.5 (as measured in IOPS) with greater protocol efficiency across all test configurations: while total throughput was relatively flat, ESX CPU utilization was significantly reduced.
In our tests, vSphere consumed approximately 6% to 23% less ESX host CPU resources when using either FC or NFS, depending on the load. With the iSCSI initiator, we found that vSphere consumed approximately 35% to 43% less ESX host CPU resources compared to ESX 3.5.
What’s Not Covered in the Report?
Obviously, what is not covered in the current report is testing the performance of High Performance Datasets. These are I/O-intensive applications like databases, email servers, etc. These types of datasets typically don’t share datastores, don’t share physical spindles, and may write data at a block (or page) size greater than the 4KB and 8KB sizes we see in shared datastores.
Larger block sizes, combined with the absence of shared LUN queues and spindles, make for some very interesting results. Unfortunately, we weren’t able to come to a consensus on the dataset and access profile to test in time to publish TR-3808 this month.
While TR-3808 doesn’t address this type of working set, there is a VMware report comparing FC, iSCSI, and NFS performance with a 16,000-seat Exchange 2007 configuration in vSphere.
Interestingly enough, every NetApp storage protocol outperformed the same test conducted earlier on an EMC CLARiiON array with Fibre Channel connectivity to the ESX hosts. I should point out that some of the performance gain in the NetApp configuration is a direct result of the enhancements in vSphere and would also apply to EMC.
Wrapping this Post Up
I share the opinion of many of our customers when I state that running VMware on a natively multiprotocol storage array provides the best means to scale and manage a virtual datacenter, as every storage protocol provides a unique value relative to its use case. For some examples, see the 80:20 rule or some of the storage-related details in ‘VCE-101 Thin Provisioning Part 2 – Going Beyond’.
The engineering teams at VMware and NetApp have ensured that customers will always receive the highest performance available. We are nearing the end of another set of tests that will provide additional data regarding performance with storage-saving technologies such as data deduplication, thin provisioning, and our Performance Acceleration Module (PAM). You’ll see it here as soon as it’s live (trust me, this one is my pet project).