I/O Efficiency and Alignment – the Cloud Demands Standards


I was surprised to read in the relatively recent EMC report, VMware ESX Server Optimization with EMC® Celerra® Performance Study, that when it comes to aligning the partitions inside of virtual machines, EMC has diametrically opposed recommendations depending on the storage protocol and storage array one deploys with VMware. I have to say that such recommendations raise a red flag, one that I believe needs some additional light shed on it.

[Image: cover of the EMC performance study (document 300-006-724)]



A one-off design for a single array architecture in the cloud?

I don’t have all of the details that I’d like to have in order to explain why EMC would promote alignment for VMFS deployments on iSCSI, FC, and FCoE but not promote alignment for deployments of NFS on Celerra. Maybe Chad, Chuck, Zilla, or another representative from EMC can jump in here. In fairness, I realize my comments may be harsh, and I don’t want to misrepresent the Celerra technology.

The cloud requires standards

What I do know is that the cloud gives customers the flexibility to dynamically migrate VMs between various physical servers and storage platforms as their needs dictate, and this last capability spans multiple storage protocols. How cloud friendly does the Celerra NFS solution sound if you have to maintain separate VM templates based on whether they run on VMFS or NFS?

Who wants to have to convert a VM in order to take advantage of advancements in storage technologies, application support tools, or to be able to leverage the cloud to move a VM to an off-site location or external service provider who may not be running Celerra NFS?


If you are a cloud services provider, would you support a separate physical infrastructure for VMs created on NFS-connected Celerra storage arrays? I don’t think so.

The market needs standards in order to make the cloud a reality. We should all be asking EMC to share their plans to eliminate this one-off design with Celerra and NFS. If I were in a position to select infrastructure architectures for my cloud deployment, I would have to rule out Celerra until this issue was resolved.

A bit of background for those interested in the how and why

With virtualization I often state that everything old is new again, and this mantra applies to partition alignment. Years ago, when one deployed a system on a SAN, one had to properly set the starting partition boundary on the LUN prior to installing the operating system in order to ensure optimal I/O between the operating system and the storage array. Storage arrays were later enhanced to remove the need for this manual process by allowing LUNs to be provisioned based on the operating system that would be installed on them. In this manner the storage array vendor set the starting partition boundary at the time of storage provisioning, thus streamlining the OS installation process.

Fast-forward to today

Now we are in the age of deploying virtual machines, and we have the same issue, but instead of having it with LUNs we have it with virtual disks (VMDKs, VHDs, etc.). What is almost universally agreed upon by storage vendors, server virtualization vendors, and operating system vendors is that for optimal I/O efficiency the partitions within the virtual disk need to be aligned to the underlying storage array. NetApp, along with VMware, Citrix, and Microsoft, has jointly published Technical Report 3747 specifically to address this issue.

As I am much more familiar with Windows than Linux, I’ll speak to the specifics of alignment with Windows as the GOS. Past versions of the Windows operating system, specifically NT, 2000, 2003, and XP, reserve the first 32,256 bytes of the disk for the boot sector (that is, 63 sectors of 512 bytes each, a value that is not evenly divisible by 4,096). Customers virtualizing systems running these operating systems end up with a VM whose local file system isn’t in alignment with the underlying storage array. Thus we have a reprise of the old LUN issue.

The storage and virtualization industry has repeatedly communicated a recommended starting partition offset for Windows-based virtual machines: a value that is evenly divisible by 4,096 bytes (4KB) and greater than 32,256. These recommendations are commonly either 32,768 (32KB) or 65,536 (64KB). Note that either value is acceptable, as each is large enough to store the boot sector and evenly divisible by 4KB.
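
To make the arithmetic concrete, here is a minimal sketch in Python (assuming the 4KB array block size discussed throughout this post) that checks whether a given starting offset lands on a block boundary:

```python
# A minimal sketch of the offset arithmetic above; the 4 KB block size is
# the array block size this post assumes throughout.
KB = 1024

def is_aligned(offset_bytes, block_size=4 * KB):
    """An offset is aligned when it falls exactly on a block boundary."""
    return offset_bytes % block_size == 0

# Legacy Windows offset (63 sectors x 512 bytes = 32,256 bytes) versus the
# recommended 32 KB / 64 KB offsets and the newer 1 MB default.
for offset in (32_256, 32 * KB, 64 * KB, 1024 * KB):
    status = "aligned" if is_aligned(offset) else "misaligned"
    print(f"{offset:>9,} bytes -> {status}")
```

Running it shows the legacy 32,256-byte offset is misaligned, while 32KB, 64KB, and the 1MB default used by newer Windows releases (discussed below) all pass.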

Here’s a brief list of industry references on this point…

IBM: Storage Block Alignment with VMware Virtual Infrastructure

EMC: Celerra IP Storage with VMware Virtual Infrastructure

Dell: Designing and Optimizing SAN Configurations

EMC: CLARiiON Integration with VMware ESX Server

Vizioncore: vOptimizer Pro FAQ

Now I would be remiss if I left out that it is also critical to align the starting VMFS partition to the underlying blocks of one’s storage array, but as this is already addressed by most modern storage array vendors, I will stick to the discussion at hand, which is alignment of the VM’s partition.
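
For those who want the layering spelled out, here is a hedged sketch in Python of the idea: a guest file system is aligned end to end only if the byte offsets contributed by each layer (datastore file system start, VMDK placement, guest partition start) add up to a multiple of the array block. All values below are purely illustrative:

```python
# A hedged sketch of the "stacked layers" idea. Each layer (datastore file
# system start, VMDK placement, guest partition start) contributes a byte
# offset; the guest file system is aligned end to end only when the sum of
# those offsets is a multiple of the array block. All values illustrative.
ARRAY_BLOCK = 4096

def end_to_end_aligned(layer_offsets, block=ARRAY_BLOCK):
    """True when the stacked offsets land on an array block boundary."""
    return sum(layer_offsets) % block == 0

# A datastore the array has already aligned (offset 0 relative to a block
# boundary), with the guest partition at the legacy versus recommended offset:
print(end_to_end_aligned([0, 32_256]))  # False -- misaligned guest
print(end_to_end_aligned([0, 65_536]))  # True  -- aligned guest
```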

Microsoft advances the cause

In the most recent versions of the Windows operating system, Vista, 2008 Server, and 7, Microsoft has removed the need to manually align the partition, as the default starting offset is 1 MB (1,048,576 bytes, which is evenly divisible by 4KB).

Score one for the guys in Redmond!

There is no need to make any change to the partitions of VMs created from a clean install of one of these operating systems. Note this is not the case for systems upgraded to one of these releases.

This also leads me to ask “Um EMC guys, does deploying Windows Vista, 2008 Server, and 7 as a VM via NFS on a Celerra array change your recommendations around alignment?” From what is contained within the performance report it appears that these deployments might suffer. Can you clarify for us?

So why does alignment matter?

Alignment has a direct effect on how hard a storage array has to work to service the I/O requests of the virtual machines it hosts. Now you may think that the majority of the VMs you have deployed are not very demanding in terms of their I/O requests, but remember that the storage array has to serve the aggregated requests of all of the VMs deployed. As your virtualization footprint expands, so does the associated I/O load. This is where I/O inefficiency can rear its ugly head.
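
To illustrate with a back-of-the-envelope sketch in Python: assuming a 4KB array block and a 4KB guest read (both illustrative values), a misaligned partition makes every such request straddle two blocks instead of one:

```python
# A back-of-the-envelope sketch of the extra backend work misalignment
# causes; the 4 KB array block and 4 KB request size are illustrative.

def blocks_touched(io_offset, io_size, block=4096):
    """Count how many array blocks a single I/O request spans."""
    first = io_offset // block
    last = (io_offset + io_size - 1) // block
    return last - first + 1

# A 4 KB read at the start of an aligned (64 KB) versus a legacy
# (32,256-byte) partition:
print(blocks_touched(65_536, 4_096))   # 1 block
print(blocks_touched(32_256, 4_096))   # 2 blocks -- double the backend I/O
```

Across thousands of requests from dozens of VMs, that near-doubling of backend block touches is exactly the aggregated inefficiency described above.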

How to tackle the alignment challenge

First, update any VM templates running an operating system that doesn’t avoid this situation on its own, so that their partitions are aligned. This relatively simple exercise will stop you from deploying future VMs that are unaligned. Next, tackle your most I/O-demanding VMs.

BTW – if you really need performance out of a VM you may want to consider upgrading the virtual disks on these systems to the eagerzeroedthick format for some additional performance gains.

As a side note, if you do this on a storage array virtualized by NetApp’s Data ONTAP (this includes NetApp FAS arrays, IBM N-Series arrays, and traditional legacy arrays virtualized by the NetApp V-Series) you can have the highest virtual disk performance while consuming less storage than what is available with a thin-provisioned virtual disk on a traditional legacy storage array architecture. Psst – this is one of the many values of virtualizing your storage infrastructure.

Once you’ve addressed these systems, the question remains: should I optimize my existing VMs? This is a fair question, and the answer is “it depends.” You can either align your VMs (this will cause a small service disruption for each) or you can offset the I/O inefficiency by adding more disks on the back end. While all storage vendors would love to sell you more storage, over the long haul it might be a wiser investment to align the VMs.

In addition, I’d like to ask anyone distributing applications as virtual appliances to ensure that the file system within the appliance is properly aligned along a 4KB boundary.

There are many tools that can help you correct this challenge; my personal favorites are MBRAlign, along with vOptimizer Pro and vConverter from Vizioncore. For clarification, MBRAlign and vOptimizer Pro can address the alignment of previously deployed VMs; however, if your infrastructure still has servers to convert from physical to virtual, vConverter can ensure that the migration results in a properly aligned partition.

As always, VCE, or Virtualization Changes Everything (except for some old issues we used to have that were solved but are now back again).

Vaughn Stewart
http://twitter.com/vStewed
Vaughn is a VP of Systems Engineering at VAST Data. He helps organizations capitalize on what’s possible from VAST’s Universal Storage in a multitude of environments including A.I. & deep learning, data analytics, animation & VFX, media & broadcast, health & life sciences, data protection, etc. He spent 23 years in various leadership roles at Pure Storage and NetApp, and has been awarded a U.S. patent. Vaughn strives to simplify the technically complex and advocates thinking outside the box. You can find his perspective online at vaughnstewart.com and in print; he’s coauthored multiple books including “Virtualization Changes Everything: Storage Strategies for VMware vSphere & Cloud Computing”.

18 Comments

  1. At first, when I read the EMC PDF, I thought that perhaps it was old info. But now I see that it’s from April 2009 – this is relatively hot off the press.
    I hope EMC answers the question that you’re posing because I’m sure their customers will be asking as well, especially in mixed-environment shops where there may not be a single storage standard. Do I need to keep different templates for my Celerra NFS versus my NetApp NFS/FC/iSCSI or HDS FC storage? Inquiring minds want to know . . .

  2. MBRAlign isn’t officially supported by NetApp. This is disappointing. We would love to align our 500 guests but we aren’t going to walk blindly off a cliff with an unsupported tool to do it.
    What are the chances this will become supported and make it into best practices?

  3. MBRalign and MBRscan are now part of NetApp’s ESX Host Utilities Kit v5.1 and as such are officially supported.
    The newly released Kit also introduces support for vSphere and continues to support previous ESX versions.

  4. BK – Nick is spot on here. The versions in the EHU 5.1 and the upcoming VSC are officially supported.
    If you need additional assistance, the NetApp global support center can also help you and your team.
    Thanks for the feedback.

  5. Vaughn – thanks for investigating. Digging into this with the Celerra team.
    I agree – there can be only one standard guest-level VM alignment, as customers need to be able to know that moving from one storage platform to another won’t require re-aligning.
    Errors (as you well know from the old NFS disable lock episode) can be made – we are all human after all – but it’s important to catch and correct them if indeed it is an error.
    Will get you and everyone else an update.

  6. Chad – thanks for jumping in to address this issue from the EMC perspective. Customers should be able to have all features on all arrays over any protocol, and one-off configs prevent this from happening. Sometimes change requires a little effort (a la correcting the old NFS locking issue).

  7. IMHO, if you are going to make a statement like “I don’t have all of the details that I’d like to have in order to explain why EMC would promote alignment for VMFS deployments on iSCSI, FC, and FCoE but not promote alignment for deployments of NFS on Celerra”, then why not wait for an answer *before* you post?
    I have been researching this question (do I, or don’t I, align VMs on Celerra NFS) and my studies are leaning towards no. You can, but it does not buy you anything. The whole concept is block alignment. There are no “blocks” in the NFS storage exported to vSphere. It is a share. Furthermore, the MS article that explains the reasoning behind alignment in the OS says that the underlying disks need to be multi-disk arrays before it is necessary.
    I, for one, am waiting (patiently?) for someone from EMC to explain (down in the nitty-gritty weeds of the bits and bytes) why not (or why).

  8. @Michael – I don’t understand why you’re not swayed by the recommendations of Microsoft, Citrix, VMware & NetApp on this topic. Are you aware that EMC has revisited their data and as a result have changed their recommendations regarding GOS file system alignment?
    The previous EMC recommendation on alignment over NFS was simply a mistake. It happens, and to their credit they fixed it. Check out the updated EMC TechBook for VMware on Celerra (H5536):
    http://www.emc.com/collateral/hardware/technical-documentation/h5536-vmware-esx-srvr-using-celerra-stor-sys-wp.pdf

  9. Maybe I’m thick-headed, but this makes no sense to me. The MS article (http://support.microsoft.com/kb/929491) states “This issue may occur if the starting location of the partition is not aligned with a stripe unit boundary in the disk partition that is created on the RAID.” This means that it only matters on block-level devices where the logical disk is striped over multiple physical disks. NFS is a share. There are no physical blocks to align with.
    I am one of those individuals that needs the underlying reasons for things explained.
    The VMware article on the subject does an excellent job of explaining why this matters at all (http://www.vmware.com/pdf/esx3_partition_align.pdf) and on page 5 (with pretty pictures) you can see the blocks and how they can be misaligned. But NFS takes away the bottom two layers and thus, in my logic, the need for alignment.
    Can you point me to a NetApp paper that explains *WHY*? I would very much like to read it.

  10. Michael,
    The difference between NFS and VMFS is where the file system exists. With VMFS the file system is local to all nodes connected to the shared LUN. With NFS the file system resides on the NFS server. For a Data ONTAP powered array (FAS, V-Series, IBM N-Series) the file system of NFS datastores is WAFL (4KB blocks), and for EMC Celerra the file system is UFS (with the block size selected by the storage admin).
    In the case of VMFS, WAFL, UFS, etc., these file systems must align to the underlying physical storage boundaries.
    In the case of virtual machines, their local file systems (NTFS, EXT3, etc.) must align to the file system of the datastore (i.e. WAFL, UFS, VMFS).
    Lack of alignment results in very poor disk I/O utilization. Put another way, a storage array can have to work twice as hard to serve the workload when VMs are misaligned. Such an oversight leads to underutilized hardware assets and more hardware required to run the operation than is needed.

  11. You telling me that NFS is different from VMFS is somewhat making my point. I fully understand alignment and the need for it in block-based storage.
    However, I found a NetApp paper on the Citrix site (http://www.citrix.com/site/resources/dynamic/partnerDocs/BestPracticesforFileSystemAlignmentinVirtualEnvironments.pdf) entitled “Best Practices for File System Alignment in Virtual Environments” and it states it this way…
    1) (This is from section 3, file system alignment) “For optimal performance, the starting offset of a file system should align with the start of a block in the next lower layer of storage.” What the authors are stating is that each layer needs to be aligned with the adjacent layer. But in NFS, because it is a share, there is no lower storage block to align to.
    2) (This is from Table 1 under 2.1, VMware) When using NFS on VMware, there is only one layer, the Guest OS. NFS is the ONLY one to have a single layer. There is nothing to align it to.
    In essence, the guest OS cannot be aligned with a share. It is not block storage. There is nothing to align it to. Aligning a GOS on NFS would not hurt, but it does not help.
    If you think (and obviously you do) I am mistaken, then please point me to a paper that tells me why? Everything I have read substantiates that NFS does not require GOS alignment.

  12. @Michael – I understand why one would think NFS is a single layer; however, it is not. VMFS has a block size which the ESX/ESXi hosts are aware of. NFS has a block size that the NFS server is aware of; even though this value is unknown to the ESX/ESXi hosts, it still exists.
    The layers are: the LUN, the hosting file system (VMFS or WAFL), and the GOS file system.
    All three layers must be aligned. Failure to do so unnecessarily stresses the storage array, requiring more hardware to meet the workload than would be required with aligned VMs.
    The hardware abstraction provided by server virtualization allows for the non-disruptive migration of VMs between VMFS and NFS file systems. If one aligns the GOS file systems, one ensures the best I/O performance for all VMs, on all storage arrays, on both VMFS and NFS.

  13. Yes, I absolutely see the benefit from a standards perspective. Being able to migrate a VM between storage systems (easily done now with vSphere and storage vMotion) without altering the performance has huge benefits.
    As for the performance, I may have had an epiphany. If alignment is necessary in a physical, single disk system, then it would still be (obviously) necessary from a VM on NFS. I am still looking into that as I was under the impression (from the MS article I mentioned earlier) that alignment was only necessary with some sort of RAID (aka multi-disk) configuration.
    As for aligning to NFS, this still seems like it would be dependent on how the NFS alignment was done. If, for example, the OS running the NFS (the SAN) auto-aligned each file with the beginning of a physical underlying block, then any GOS/VM would be automatically aligned. Right?
    Thank you for taking the time with me on this.
    -M
    P.S. The idea that NFS is a single layer was from the NetApp article I mentioned. And it is not that I was suggesting NFS was a single layer, but that NFS removed the blocks from the equation. This is pretty much what the article said. But then it went on to say that one should still align all GOSes.

  14. @Michael – now you’re starting to get it!
    When using VMFS, you need to select the correct LUN type. With NetApp this is a VMware LUN type. When selecting this type the storage controller ensures that VMFS is aligned.
    With NFS on NetApp, the file system, WAFL, is already aligned.
    When we create a VMDK it is stored correctly, in aligned blocks.
    The challenge is ensuring that the file system created inside of the VMDK, by the GOS, is aligned. Historically OS installations did not align partitions to array block boundaries. This is why storage arrays have LUN types: by having LUN types we could force the alignment; however, with VMDKs, storage arrays don’t have any means to make such adjustments.
    So we must manually align the starting partition offset for all legacy OSes, like Windows NT through 2003. Note that 2008 is aligned by default. Alignment is also an issue with Linux distributions, but I don’t know which distributions and versions have addressed this issue.

  15. So as it turns out, I was right to begin with. There is no performance gain from aligning a GOS running on VMware over NFS. I have had several conversations with different individuals, both online and off, and it comes down to this: NFS separates the blocks from the VM and exposes the vDisk to the GOS as a single drive. There is no advantage to aligning a single drive. So the only reason (and it *is* a good one) to align the GOS drives is compatibility with other, block-based, storage arrays.

  16. Michael,
    I’m sorry, but you are incorrect. It is critical to align the partitions on all virtual disks for any GOS that does not provide proper alignment.

  17. @J.Ross – Thanks, I am aware of Paragon’s Alignment Tool (PAT); however, I’ve yet to use it, either as a standalone tool or as part of their Hard Disk Manager suite. I’d like to know more about its capabilities, especially in regard to job management and reporting. If anyone has more to share, I’m all ears…
