Disruptive Technology: Business Analytics on Flash


What if your applications could run 10X faster?

What if your rack space could be reduced by 96%?

What if you could accelerate your business 16X?

Guess what: with flash, you can have all of these benefits without changing your applications or work processes!

I joined Pure Storage because I believe flash storage will have a seismic impact on the datacenter. I plan to share examples of how Pure Storage is delivering such transformations for our customers. Today’s post is the first in what I expect to be a long-running series…

Recently we helped a customer accelerate their business analytics by reducing the time required for an analytics process from 11 days to less than 8 hours! Allow me to elaborate so that you can truly appreciate the impact of this story. This customer’s analytics process consists of a batch job that runs for 19 hours and must be run 11 times to compile the data. This means the business could not begin analyzing and interpreting the output until 11 business days after the start of the process.

By simply replacing their legacy storage architecture with a Pure Storage FlashArray, they were able to reduce the duration of each batch run from 19 hours to 43 minutes. To clarify, we reduced the time required to run the report from 1,140 minutes to merely 43! As a result, all 11 iterations now complete in under 8 hours, and the analytics process has been reduced from 11 days to 1!
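For readers who want to check the math, here is a minimal sketch of the arithmetic behind those figures. It uses only the numbers quoted above; the variable and function names are mine, and it assumes the 11 iterations on flash run back to back with no idle time between them.

```python
# Figures quoted in the post; names are illustrative only.
ITERATIONS = 11
LEGACY_MINUTES_PER_RUN = 19 * 60   # 19 hours per batch run on the legacy array
FLASH_MINUTES_PER_RUN = 43         # per batch run on the FlashArray


def total_hours(minutes_per_run: int, iterations: int = ITERATIONS) -> float:
    """Total runtime in hours, assuming the iterations run back to back."""
    return minutes_per_run * iterations / 60


legacy_total_hours = total_hours(LEGACY_MINUTES_PER_RUN)
flash_total_hours = total_hours(FLASH_MINUTES_PER_RUN)
per_run_speedup = LEGACY_MINUTES_PER_RUN / FLASH_MINUTES_PER_RUN

print(f"Legacy run: {LEGACY_MINUTES_PER_RUN} minutes")
print(f"Legacy total: {legacy_total_hours:.1f} hours of batch time")
print(f"Flash total: {flash_total_hours:.2f} hours of batch time")
print(f"Per-run speedup: {per_run_speedup:.1f}x")
```

On the legacy array, a 19-hour job effectively meant one iteration per business day, which is where the 11 days came from. At 43 minutes per run, all 11 iterations fit inside a single working day.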

[Figure: analytics batch runs]

In addition to this gain, the customer has been able to drastically increase the number of users and concurrent batch and ad-hoc reports, from 3 users each running 2 reports to over 50 users. Increasing the number of concurrent reports from 6 to 100 is a gain of more than 16X in processing.
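The 16X figure comes from a simple ratio, sketched below. The function name is mine, and the "after" side assumes each of the 50 users still runs 2 reports at a time, which is what the quoted total of 100 concurrent reports implies.

```python
def concurrent_reports(users: int, reports_per_user: int) -> int:
    """Number of reports in flight at once."""
    return users * reports_per_user


reports_before = concurrent_reports(3, 2)    # 3 users x 2 reports each
reports_after = concurrent_reports(50, 2)    # assumption: still 2 reports per user
concurrency_gain = reports_after / reports_before

print(f"Concurrent reports: {reports_before} -> {reports_after}")
print(f"Gain: {concurrency_gain:.1f}x")
```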

Flash: Defining Disruptive Technology

In many ways this customer was like many of you. They made wise investments in the best-of-breed technology provided by their existing storage vendor. In this instance the customer invested in a six-engine ‘scale-out’ storage array with a hybrid configuration of approximately 1,000 HDDs and 100 SSDs.

This configuration, designed solely for performance, required 8 data center floor tiles to operate. Specifically, this behemoth comprised one 24” cabinet for the 6 data engines and five 30” cabinets for the disk and solid-state drives. All of this was replaced by 8RU (rack units) of FlashArray.


These results demonstrate the impact Pure Storage is delivering daily. Beyond the acceleration of analytics, the customer received a massive reduction in datacenter resources spanning rack space, power and cooling. This was on top of a massive reduction in storage acquisition and operational costs (I forgot to mention that the FlashArray was installed in under 15 minutes).

The All-Flash Enterprise? Maybe the Flash-Powered Economy!

Flash can deliver astonishing performance gains without any other changes in the environment. When you can deliver flash at the price of spinning disk, you introduce a new paradigm to customers. They can accelerate their business and reduce time to market while reducing their costs. I’m beginning to wonder if we should be referring to the All-Flash Enterprise as the Flash-Powered Economy.

We moved mountains (of hardware and time) for this customer. Contact us, or one of our VAR partners, and let’s discuss what we can do for you.


  1. Vaughn, excellent post! BTW, you are going to have to do one of two things, or both. One, re-write Virtualization Changes Everything with Flash … second a new book titled Flash Changes Everything… and that one I will discuss with you! 😉

    • Thanks John. Mike and I are working on what we do next. I think an update to the book is most likely as it allows us to address the broad storage market and leverage our combined market knowledge.

  2. V- As a Pure Pre-sales guy I’ve seen these types of transformations take place all up and down the west coast since I joined Pure ~8 months ago. It’s remarkable and sometimes it seems almost too good to be true, but it really is true.

    At one of my customers, PeopleFinders, who do background checks and other services (case study coming out very shortly), I was astounded at what we were able to do for them. They had a hellacious ETL process of merging 16B+ rows across several linked SQL Servers that actually took 2 months to complete so they could then monetize the resulting data sets. If they were lucky, they could run it 6 times a year, but running it more frequently meant materially more revenue and put them in a MUCH better position relative to their competitors. Unfortunately, on their traditional scale-out disk array, the job would often get to the final hours after nearly the full 2 months of running and then crap out due to I/O waits, and the customer would be forced to restart the job, cross their fingers, and wait another 2 months. This wasn’t experimental data btw, it was result sets they needed to run their entire company and deliver accurate results to the tens of thousands of customers they serve every day. At that time it was sitting on a scale-out array like you alluded to above, but it didn’t have the pretty blue lights depicted in your graphics 🙂 rather it was covered in “eye catching” bright yellow colors that ironically bear a striking resemblance to the color found on Yield signs around the neighborhood where I live.

    I wasn’t sure what to expect but we dropped an FA-420-23 in there, made no tweaks except maxing out the queue depths to 256, re-ran the job, sat back and wondered what frankly would happen. I knew we would be faster but it was impossible for me to quantify to what degree beforehand so I just said, “let’s give it a go and see what happens!” So, while the job was crunching away, I was watching the array over our cloud monitoring system (port opened and permissions allowed via our RA controls) and noticed it was reaching some insane numbers in both the bandwidth and IOPS categories.

    Net/net we were all shocked when the job went from 2 months down to 4 hours and 15 minutes on Pure. The customer was blown away and the CEO declared we have changed his business forever. They are literally now trying to push said scale-out disk array, which is taking up 60% of their power and cooling costs and 4 racks, out into the alleyway behind their building. We tried to find an aftermarket for it but nobody seemed to want to pay more than $8K for the entire thing, which was insulting to the CEO. What is unfortunate is they’ve sunk nearly $1.2M (recently) into it and now they view it as nothing more than a boat anchor. We achieved a 3.2:1 data reduction which quickly climbed to slightly above 10:1 after they spun off a few read/write performance free clones (for more new ways to do parallel processing) thereafter. Needless to say we are doing all of this at a fraction of the acquisition cost (we are actually approaching SATA costs) of the traditional competitor and delivering very material savings in the form of massive environmental cost reduction.

    If they had stayed on the spinning disk path, they would have been forced to move into a colo and buy more controllers and disk (perhaps some expensive SLC flash to deal with old school parity RAID algorithms too) because they were simply running out of everything – rack space, power, and cooling – and bursting at the seams. Pure changed everything for them and the need for the new colo has now been deferred indefinitely.

    In my career I can’t say that I’ve ever seen anything quite like that, and I’m both proud and blessed to be at the forefront of this revolution, with this particular technology stack (Purity). There’s nothing quite like it.
