From the title of this post you may be surprised to find that I’m a huge fan of hyper-converged infrastructures (HCI). Frankly, they fill a market need that for years customers have either struggled to address with storage arrays or, worse, ignored altogether. HCI is a perfect solution for retail locations, bank branches, militarized and mobile emergency response vehicles, etc. – all of which require data center services on a scale that is too small to justify the cost of a storage array.
The HCI market has the potential to be huge; however, will it replace enterprise storage arrays? Simply put, yes and no.
I ask you to read on before you or another reader posts a comment calling me a “hater” or citing the “Law of Instrument” (aka, if all you have is a hammer…).
Hyper-Converged Infrastructure 101
I believe HCI should be viewed as a form of converged infrastructure, one composed solely of server hardware (and top-of-rack network switches). This software-defined model leverages a virtualization layer to provide shared data services built from local storage, a distributed file system, data mirroring mechanisms, and Ethernet networking.
All persistent storage media, including disk and flash, inherently suffer media errors. These imperfections require the storage system to provide data protection in the form of RAID, mirroring, erasure coding, etc., to ensure data availability in the event of a component failure. The HCI architecture implements data mirroring on a per-virtual-disk basis. HCI vendors do not speak in terms of uptime the way storage vendors do (i.e. five nines of availability); instead they speak in terms of Failures To Tolerate (FTT). I think it is fair to compare the setting of FTT=1 to RAID 5 on an array (four nines of availability) and FTT=2 to RAID 6 (five nines of availability).
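To make the capacity cost behind that comparison concrete, here is a minimal Python sketch of my own, assuming mirroring-based protection for FTT (as described above) and parity RAID on the array side; the RAID group sizes are assumed values chosen purely for illustration.

```python
# Illustrative only: raw capacity consumed per usable TB under mirroring-based
# FTT versus parity RAID. The RAID group sizes below are assumptions.

def mirror_overhead(ftt):
    """FTT=n with mirroring keeps n+1 full copies of every virtual disk."""
    return ftt + 1  # raw TB consumed per usable TB

def raid_overhead(data_disks, parity_disks):
    """Parity RAID protects a whole disk group instead of keeping full copies."""
    return (data_disks + parity_disks) / data_disks

print(f"FTT=1 mirroring : {mirror_overhead(1):.2f}x raw per usable TB")
print(f"FTT=2 mirroring : {mirror_overhead(2):.2f}x raw per usable TB")
print(f"RAID 5 (7+1)    : {raid_overhead(7, 1):.2f}x raw per usable TB")
print(f"RAID 6 (6+2)    : {raid_overhead(6, 2):.2f}x raw per usable TB")
```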
Low Storage Utilization Challenges HCI as a Storage Platform
The storage industry refers to the storage model in an HCI as a shared-nothing architecture. This definition loosely means the architecture is distributed and each node is independent and self-sufficient. The HCI platform provides some truly unique and beneficial capabilities in areas like granularity of deployment and scaling, per-VM data availability and service policies, and consolidation of operations; however, the trade-off of HCI is low storage utilization, a direct consequence of the shared-nothing architecture.
Consider the example below (based on the EMC VSPEX BLUE).
The largest area of capacity loss in the HCI model is the data mirroring and node rebuild reserve required to support the number of failures to tolerate. The overhead of each varies as the cluster scales. For example, set FTT=1 on a four-node HCI and usable capacity is cut in half, and one must reserve 25% of the capacity to support rebuilding the data in the event of a single node failure.
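To put rough numbers to that four-node example, here is a back-of-the-envelope sketch; the per-node raw capacity is an assumed figure for illustration, not a VSPEX BLUE specification.

```python
# Back-of-the-envelope sketch of the four-node example above.
# The per-node raw capacity is an assumed value, not a vendor spec.

nodes = 4
raw_per_node_tb = 4.0   # assumption for illustration
ftt = 1                 # failures to tolerate; mirroring keeps ftt+1 copies

raw_tb = nodes * raw_per_node_tb
rebuild_reserve_tb = raw_tb / nodes              # hold one node's worth free (25%)
usable_tb = (raw_tb - rebuild_reserve_tb) / (ftt + 1)

print(f"Raw capacity         : {raw_tb:.1f} TB")
print(f"Node rebuild reserve : {rebuild_reserve_tb:.1f} TB ({1/nodes:.0%})")
print(f"Usable with FTT={ftt}    : {usable_tb:.1f} TB ({usable_tb/raw_tb:.0%} of raw)")
```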
Again, consider the example below (based on the EMC VSPEX BLUE).
As one adds nodes to an HCI deployment, the overhead of the node rebuild reserve shrinks, which is good; however, the increase in raw storage capacity also increases the number of media errors. With such capacity increases, one needs to revisit resiliency, as the setting of FTT=1 does not protect against media errors encountered during a disk/node failure and the subsequent data rebuild. Hopefully this example helps express why I view FTT=1 in the same vein as RAID 5 – appropriate for small deployments but fundamentally lacking for deployments at scale.
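Extending the earlier sketch, the snippet below shows the trend: the rebuild reserve shrinks as a fraction of the cluster while raw capacity, and with it the expected number of media errors, keeps climbing. The per-node capacity is again an assumed figure.

```python
# Illustrative trend as the cluster grows: the rebuild-reserve percentage
# falls, but total raw capacity (and media-error exposure) keeps rising.
# Per-node capacity is an assumed value.

raw_per_node_tb = 4.0
ftt = 1

for nodes in (4, 8, 16, 32):
    raw_tb = nodes * raw_per_node_tb
    reserve_fraction = 1 / nodes
    usable_tb = raw_tb * (1 - reserve_fraction) / (ftt + 1)
    print(f"{nodes:2d} nodes: raw {raw_tb:6.1f} TB, "
          f"rebuild reserve {reserve_fraction:4.0%}, usable {usable_tb:6.1f} TB")
```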
It would be remiss of me not to mention that HCI architectures vary from vendor to vendor, and some are more efficient than the platform I used as the example in this post. Consider the larger perspective: shared-nothing architectures trade off hardware efficiency in order to deliver value in other areas. Some offer means to reclaim part of that loss through data reduction technologies.
Hyper-Converged Infrastructures Help Customers
I don’t think we should include hyper-converged infrastructures in storage infrastructure discussions. Maybe the marketers who gave us product names like ‘Virtual SAN’ and marketing messages like ‘No SAN’ should be commended for injecting HCI into the storage conversation… I guess they did their jobs rather well. With that said, it falls on the shoulders of the technically inclined to understand and position the advantages and trade-offs of HCI based on deployment requirements.
Customers have significant pain points delivering highly available, performant, and cost-effective storage. Until recently, storage platforms have not advanced in lockstep with compute and networking, which has led many to seek innovation and new storage technologies.
It is my opinion that HCI is an ideal means to provide storage services in environments where the inclusion of a low-end storage array is simply a financial challenge. The requirements of this market are very different from what is expected of the core storage infrastructure within a data center – which is the crux of this post. I view all-flash array architectures as more capable and thus appropriate for next-generation storage infrastructure, in part because leading AFA platforms deliver effective storage capacity that is greater than the raw capacity of the platform. This benefit directly translates into cost savings. Comparing the storage utilization of an HCI with that of an AFA should carry the same weight as considering fuel efficiency when purchasing an automobile. Would your purchase decision be influenced if, between two otherwise similar vehicles, one got 5 miles to the gallon of gasoline and the other 50?
I understand my views sound like vendor bias – this is a fair critique – but I’m not alone in my view. Web-scale architecture 2.0, aka data center disaggregation, is composed of racks of compute and shared all-flash storage. It sounds like a new take on an old model, but let’s save the data center disaggregation discussion for a future post.
I fully expect HCI to displace a portion of the low-end storage array market. Face it, disruption is good for customers. Kudos to the hyper-converged infrastructure vendors; they have helped customers with a long-standing challenge and as a result have created a net-new market. I’m excited to see where these platforms may take us.
What are your thoughts on HCI? Where does it fit well today and where may it go in the future? All comments welcome.
Disclosure: I work for Coho Data…
For a similar viewpoint on this topic, see our CTO, Andy Warfield’s take on hyperconvergence from last year… http://www.cohodata.com/blog/2014/04/08/hyperconvergence-is-a-lazy-ideal/
I guess I’ll be the first HCI person to respond here (SimpliVity in this case).
I’m not going to question the calculations on the utilization of space since it’s not my platform you used as an example, but I think the point you were trying to make is that there’s a lot of capacity waste within HCI platforms. The ability to logically utilize a capacity greater than the raw physical capacity is not unique to AFAs. In fact, there is at least one HCI vendor that provides this level of data efficiency. (One guess who, because there’s no respect for you if it takes two!) In the end, does it really matter how much waste is inherent in the design if the customer can get the needed capacity, without over-purchasing compute, and at a reasonable price?
Back to the title of your post. I see no backing up of the claim that HCIs should not be considered storage arrays. HCIs provide shared storage, some level of data protection (obviously depends on the implementation, and up to the customer to decide the necessary level required) and a level of performance that (depending on implementation) rivals some storage arrays. Did I miss something? (possible, it is midnight as I type this)
I also believe your definition of the market segment that HCI fits into is too narrow, and I think my friends at Nutanix would probably concur. This is likely an “agree to disagree” point, and may explain why you’re at Pure and I’m at SimpliVity. 🙂
Brian,
Thanks for posting your views. In fairness my examples are drawn from one HCI platform and are not representative of the entire industry.
At present I don’t see AFAs and HCIs competing for the same market; that’s not to say this may not change in the future. Utilization is only one element to consider when positioning a platform as a storage infrastructure. I guess the key may lie in our definitions of infrastructure. I believe ubiquity is key. This means supporting all applications, Fibre Channel and Ethernet fabrics, physical and virtual servers, etc. Will the HCI platforms expand into these areas? They may, and at that time we can revisit this thread. Until then I am a huge fan of HCI for all of the value it can deliver in small environments, and I’ll hang my hat on AFAs for data center needs at scale.
Thanks for sharing your perspective. It adds to the conversation.
— cheers,
v
Vaughn:
The example proffered above applies primarily to VSAN-based HCI systems. I will let the Nutanix guys speak for themselves in terms of how their storage architecture differs, but from what I know it doesn’t follow the example above exactly. I know that SimpliVity doesn’t follow the model above. You also need to take into account that most of the more entrenched HCI vendors leverage dedupe and compression in their storage models, which provides for far greater storage utilization rates. Some utilize RAID in their data protection schemes, some do not. At this point, each of the vendors playing in the space today follows a different storage architecture path.
More to the point, what is a storage array? It’s simply software running on dedicated hardware. HCI follows that model, but removes the extra costs associated with a dedicated piece of disparate hardware for a single function. When you look at the legacy stack of equipment in most data centers, what you find are commodity x86 resources that are essentially running a modified Linux kernel; your own Pure Storage system falls into that category. HCI sees that there is no reason to place those resources into their own tin boxes, and essentially virtualizes those functions. What you are left with is a building block that accounts for the storage, compute, and in some cases data services in a single appliance, instead of 3-8 appliances.
This approach is not without its challenges, and frankly, I don’t believe that HCI works at true scale, as the TCO value prop breaks apart once you get to around 16 nodes. But for customers looking to provision fewer than, say, 750-1000 mid-weight virtual machines, HCI is probably one of the best solutions available.
Gabriel,
Well stated. As I mentioned in my reply to Brian (above), I think the difference in perspective may lie in one’s view of infrastructure. There’s no denying that the disk-based storage architectures originating from the mid-1990s and early 2000s are struggling to address modern data center needs. As such, innovative platforms like AFAs and HCIs have emerged to address the modern needs of IT organizations.
I think we can agree that it’s a great time to be in IT.
–cheers,
v
This discussion, in part, reminds me of the old FUD on NetApp SAN: “It’s not a SAN. It’s an emulated SAN (or LUNs).” It always used to make me crack up, especially given the first word in the LUN acronym. Anyway, is HCI a storage array? Who cares? The message I plug into in your blog is what value it delivers. If it delivers rock-solid data services to a market, I don’t think the market really cares a) whether or not it’s flash-based; b) whether or not it uses dedupe, compression, or RAID; c) the size of the snap reserve. None of that stuff matters. That’s plumbing. All the customer cares about is that when they turn the faucet, water comes out. (A lot of money in that. Just ask Kohler.) In the end – and I think this is what you’re getting to – customers buy technology for economic reasons, not technology reasons. They don’t buy it because it is or isn’t HCI, is or isn’t an AFA, is or isn’t a hybrid array. Customers, I’ve found, are far less fascinated by vendors splitting technical definition hairs and far more concerned about how we can help them make money, save money, or both. Could be soup cans and kite strings or *emulated* soup cans and kite strings – they really don’t care. Well, some like to spin the propeller hard, but the “my RAID is better than your RAID” discussions have caused more spontaneous naps than Mother Goose.
Was an interesting debate! ^^
This is how EMC used to attack NetApp, but this time it is getting attacked in the same way.
(Not only EMC, but also VMware, Dell, Fujitsu, and so on.)
But all of us know that capacity is not everything.
Nothing is entirely wrong and nothing is entirely right. People think differently, and this is a time for conversation.
From here… is HCI only hybrid storage? Simply put, it is a server!
I want to hear from the server vendors too.
Great post, Vaughn.
I think HCI has its place in the infrastructure. It will definitely cannibalize a portion of the low-end SAN market. It may not offer the same levels of storage efficiency, but it may be “good enough” for some customers, and an onramp to scalable, shared storage. Everyone is scrambling for share in that market, as well. Nutanix, VSPEX, SimpliVity, HP, Scale Computing, and EVO:RAIL, among many contenders. So, the trend is real.
I do wonder how long it will be before the HCI market gets commoditized. Three years? HCI is certainly in line with the move away from specialized infrastructure, and toward interchangeable, modular, commodity hardware presented as a shared resource pool. However, as with all simple solutions, abstraction and management will be the key to operationalizing it. And cost/value will determine whether it is an effective solution.