There has been a great deal of focus on high performance storage. A part of this focus and even hype has been around IOPS with some vendors claiming millions upon millions. At some point all of the IOPS claims become just useless noise.
While IOPS numbers may provide some value, keep in mind that what you test in the lab may have very little applicability in the real world. And what the vendors often claim will most likely be even further afield from the realities of your environment.
Keep in mind that the number of IOPS you get is based on the block size of your applications. Since block sizes vary, it is difficult to predict what your real-world IOPS requirements will be. For example, one application may use a 4K block size, another 8K blocks, and another 64K blocks, and so on.
You are probably already aware that most of the IOPS numbers that vendors tout are based on small block sizes. IOPS testing with a 4K block size will produce twice the number of IOPS as 8K blocks and 16 times the number of IOPS as 64K blocks, and so on. One vendor I was aware of that claimed millions of IOPS was using a 1K block size to achieve those very high numbers, even though that was not reflective of the real world.
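To see why, here is a rough back-of-the-envelope sketch (the bandwidth figure is an assumption chosen for illustration, not a measurement of any particular system): for a fixed amount of bandwidth, the achievable IOPS scales inversely with block size.

```python
# Rough illustration only: for a fixed bandwidth budget,
# IOPS = bandwidth / block size, so smaller blocks inflate the IOPS number.
ASSUMED_BANDWIDTH_MBPS = 1000  # hypothetical sustained bandwidth of 1 GB/s

for block_kb in (1, 4, 8, 64):
    iops = ASSUMED_BANDWIDTH_MBPS * 1024 / block_kb
    print(f"{block_kb:>2}K blocks -> {iops:,.0f} IOPS")

# The output shows 4K delivering twice the IOPS of 8K and 16 times that of
# 64K, with 1K blocks producing a headline number four times larger still.
```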
I often get the question: is a head-to-head IOPS comparison between storage systems, using the same workloads, a valuable way to determine which one performs better? And the answer is, of course, not necessarily. Some storage systems may work really well with one workload but behave differently when multiple mixed workloads are thrown at them. Contention for resources can greatly impact performance for one or all of the applications accessing the storage system, and some architectures react better than others in mixed-workload environments. It is also essential to understand that most performance tests are done under optimal conditions. I suggest running a performance test and then failing a drive. How well does the storage system perform when it needs to do a RAID rebuild at the same time it is doing read/write operations? How well does it perform when it is also running a replication process? Remember, the backend can also impact primary I/O performance, and again, some storage systems are better architected than others.
Testing latency can often be a more accurate predictor of what you will experience in real-world environments. IOPS depends on block size, which is variable and therefore ultimately unpredictable. Latency, however, is more of a constant and therefore more predictable. Storage latency simply means the amount of time it takes for the storage system to respond back to the application. Getting sub-millisecond response times can reduce query times significantly.
Analogy: IOPS, Latency and Throughput
I've created a simple analogy that explains IOPS, Latency and Throughput. You ordered something online and it takes two days for that box to be delivered to you. Inside that box is one large item. You then order something else online and it also takes two days for that box to arrive at your door. It is the same size box but within it are 100 small items. In this analogy the Latency is 2 days, which is the same for both boxes. The IOPS is the number of items that were delivered to you. In the first box it was just one item and in the second box it was 100 items. So the latency was exactly the same but the IOPS were drastically different!
In both cases you got your boxes within the same amount of time but the IOPS varied based on what was being sent to you.
This brings up throughput. Using the same box analogy, throughput is the number of boxes the truck can deliver at one time. Assuming that all of the boxes are the same size, the truck can carry 200 boxes at full capacity. Therefore your throughput, in this analogy, is 200 boxes. Again, the number of items (IOPS) within those boxes will vary greatly based on the size of those items (block size).
Latency is 2 days, maximum throughput is 200 boxes and IOPS is an unpredictable variable.
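Translated back into storage terms (a simplified model with illustrative numbers; real systems also depend on queue depth and parallelism): throughput is simply IOPS multiplied by block size, while latency is a separate dimension altogether.

```python
# Simplified model: throughput = IOPS x block size.
# Latency is independent: two workloads can move the same amount of data
# with wildly different IOPS counts.
def throughput_mbps(iops: float, block_kb: float) -> float:
    return iops * block_kb / 1024

print(throughput_mbps(iops=100_000, block_kb=4))  # ~391 MB/s with small blocks
print(throughput_mbps(iops=6_250, block_kb=64))   # ~391 MB/s with large blocks, far fewer IOPS
```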
Caching and Storage Performance
Caching is also important for many storage systems in order to provide good performance. However, how much it helps will depend on how cache-friendly your application workloads are. The industry-wide consensus is that you should generally get a 60-70% cache hit ratio, so the chances are you will benefit from caching.
I believe that cache partitioning is a good feature because it can address the mixed-workload contention issue that I discussed above. I also like caches that allow you to pin data into the cache so that you can ensure high performance for a particularly hungry application. Additionally, the larger the cache memory, the more likely you are to get cache hits, because more data is kept in the cache pool.
Some storage systems support only read cache, and typically reads are the majority of your I/Os. However, you will obviously be creating writes as well (otherwise there would be nothing to read), and a write-back cache can be quite beneficial. Write-back caching is a bit more complicated for vendors to implement and not everyone has this feature, so be sure to ask whether a storage system supports it.
Storage performance testing with a cache should be done mindfully. If you are getting 100% cache hits during your testing then you are getting a skewed view of that storage system's performance; it is improbable that you will get 100% cache hits in the real world. I do think it is important to do performance testing with caching turned on and not just against the disk drives, since caching will play a role in improving performance. As a general rule of thumb, assume a 60-70% cache hit ratio with about 60-70% of the I/Os being reads. You may have specific types of workloads that fall outside of these parameters, but these recommendations are pretty reflective of the broad market.
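A rough way to see why the hit ratio matters so much (the latency figures below are assumptions chosen for illustration, not measurements of any particular system): the effective latency an application sees is the weighted average of the cache and backend response times.

```python
# Illustrative only: effective latency under a given cache hit ratio.
# Both latency figures are assumptions, not vendor measurements.
CACHE_LATENCY_MS = 0.2    # assumed cache (DRAM/flash) response time
BACKEND_LATENCY_MS = 8.0  # assumed spinning-disk response time

def effective_latency_ms(hit_ratio: float) -> float:
    return hit_ratio * CACHE_LATENCY_MS + (1 - hit_ratio) * BACKEND_LATENCY_MS

for hit_ratio in (1.0, 0.65, 0.0):
    print(f"{hit_ratio:.0%} hits -> {effective_latency_ms(hit_ratio):.2f} ms average")

# A 100% hit rate looks spectacular in the lab; at a more realistic 65%
# hit ratio the average response time is dominated by the misses.
```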
The Importance of Storage Performance
Storage performance is not an exact science since there are so many variables. Interestingly, all of the current attention on performance will help to solve many problems for the entire market...eventually.
The goal is to take performance off the table as an issue by creating architectures with enough headroom that the storage system will never be the bottleneck. We will get to a point where you won't have to worry about storage performance testing because storage systems will be such speed demons that they will handle anything you throw at them. Such systems do exist today, but you typically have to pay a premium for them. Performance is important, but what are you willing to pay for it?
In the end it is the combination of price and performance that will alter the landscape forever, not performance alone. The good news is...this shift is occurring even as we speak.
The storage landscape has gone through a number of transformative stages in the last two decades. EMC and NetApp emerged in the first stage, challenging the status quo and essentially creating the storage networking industry. They made storage that was less expensive and relatively easier to manage, and delivered intelligence to protect and manage data. The second stage saw innovators such as 3Par, Compellent, Data Domain, EqualLogic, and Isilon, all with unique architectures and technologies that, among this very diverse group, delivered far easier management, incredible scalability, capacity optimization, and intelligent tiering.
These two stages were good for customers (i.e., the end-users of these products and solutions). Innovation and competition drove down pricing and created a wide range of features for greater performance, ease of management, scalability, reliability, recoverability, infrastructure optimization, and more.
We are now at the threshold of a third transformative stage of the evolution of storage, one that is focusing very heavily on performance.
Performance Isn’t Everything!
While this next stage of storage is focused on providing transparent and cost-effective performance, all of the other capabilities introduced in the previous transformative stages are also requisite. NAS and SAN support, data protection, replication, capacity optimization, highly virtualized storage, and a bevy of data management software features and capabilities, combined with performance, also define this next stage, because each stage is progressive and all technological advances must be integrated if end users are to achieve maximum value.
Performance For Everyone
You can debate the merits of performance, but the bottom line is that worrying about it should just be taken off the table. Once upon a time, storage was complex, unreliable, inefficient, hard to scale, and expensive—and all of those issues have been addressed. It is now time to remove performance as a challenge and add it to that list of storage issues that no one ever has to think about again.
There are studies that show that only between five and ten percent of all I/Os require high performance using some flash-based technology. But which five to ten percent? When do they need that higher performance? And if that small percentage of I/Os is vying for resources that other less performance-driven I/Os are consuming, will that impact the business? This is why performance should be a requisite just like ease of use, scalability, reliability, and other storage system features. This is why performance is so critical.
If you have an unlimited budget, then performance has always been within your grasp. But in this next transformative stage of storage, it is essential that performance is commoditized, which can be defined as "the movement of a market from differentiated to undifferentiated price competition and from monopolistic to perfect competition" (Wikipedia).
This is essentially what is happening in storage: performance is being addressed within different layers and offerings in the marketplace. And, over time, the preferred methods and solutions will become the mainstays in every data center. We are at the point in the process where the market is just beginning to shake out the winners and the losers. What will eventually emerge are performance solutions priced for consumption by nearly everyone, regardless of their IT budgets.
The reason there is so much attention focused on storage performance right now is that we are at the threshold of delivering the technology at a cost point that removes performance as an issue for everyone, in every environment, and for every application, so that no one ever has to think about performance again. This will impact profitability, user satisfaction, and operational efficiency.
Here is what is going on with Microsoft and Hyper-V:
- From a technology perspective, Microsoft Hyper-V is finally a viable alternative to VMware ESX. It has analogous functionality as well as some competitive advantages. It is being used for mission-critical applications by a large number of customers, and this will continue to grow.
- VMware finally has a serious challenger, and they don't get much bigger and more powerful than Microsoft. This is good for users: competition raises the bar on every level, including pricing, service and technology. Microsoft is a real market threat to VMware and has the resources, brand and muscle in the enterprise to change hearts and minds at the C-level. Further, Microsoft has leverage that no other company has in terms of the breadth of everything it offers.
- Interestingly, Microsoft is getting something right that VMware has failed to do, and that is delivering a compelling virtualized storage stack. SMB 3.0 provides unique capabilities, and VMware has nothing that even comes close.
- Microsoft still has some significant challenges, most notably System Center. vCenter is very easy to use and there are hundreds of thousands of IT folks trained on it. System Center is a bit of a bear and rather complex.
Incumbency is hard to overcome, in part because customers have invested time and money becoming experts as well as optimizing and tuning their environments. Ignorance is another challenge: there is a large IT population out there that Microsoft has to educate, and if they aren't willing to listen...
VMware is a sales and marketing machine, and they are focused on one thing: virtualization. Microsoft has a huge portfolio of applications and solutions and is therefore far less focused.
VMware has stumbled a bit over the last few years, falling short of coming out with the NEXT BIG THING in the virtual data center. However, software-defined networking and NSX are very compelling and will give Microsoft something else to address going forward. NSX isn't a slam dunk…yet… and VMware has a long road ahead driving it into the market, but it is compelling and has the potential to change the landscape.
- I am associated with the Virtualization Technology Users Group (VTUG.com), which consists of over 5,000 virtualization professionals, and we are seeing some real traction with Hyper-V. We are talking to customers considering a multi-hypervisor strategy, and some are actually sweeping the floor and replacing VMware with Hyper-V. The motivation for this has largely been pricing, with huge savings. Of course pricing isn't everything, but if all other things are equal (or close enough to equal), cost will naturally be a major factor. Another very important factor is the channel. System integrators, VARs and resellers are all selling VMware, making it difficult to compete. Bringing Hyper-V to their customers creates another conversation and opportunity for the channel, and I am seeing some channel partners really embrace Hyper-V as a way to differentiate themselves.
So what happens in the virtualization wars going forward? A large segment of VMware's customer base will remain very loyal to them. However, Microsoft will continue to gain market share, winning greenfield opportunities, being adopted as part of multi-hypervisor strategies, and completely replacing VMware in some cases. Microsoft is shooting for full-blown domination of the market, much as NT crushed Novell. I am skeptical that this will happen, certainly not any time soon. However, I do believe Microsoft could eventually have the most installations while VMware continues to generate the most revenue. VMware therefore needs to focus on how to get more revenue from its loyal base. EMC has done a great job doing this with its storage customers, and I am certain VMware will take a page from its book.
Additionally, there will be large pockets of KVM (e.g. Sears with 10,000 and eBay with over 40,000 KVM installations) within the data center and Amazon EC2 outside the data center.
Microsoft is on fire. I have rarely seen them so aggressive, focused and productive. In terms of virtualization, they are acting more like a startup (in some ways) than a massive software company. They are developing real intellectual property in record time. It is actually a bit inspiring to see such a large company firing on all cylinders to be a market leader. And talk about a war worth fighting: the virtual data center and the cloud are the future of IT.
I have been working with the Virtualization Technology Users Group (VTUG). VTUG is an independent virtualization users group with over 4,000 end-user members that focuses on all platforms including ESX, Hyper-V, KVM and EC2. The 10th Annual VTUG Summer Slam event is taking place in Brunswick, Maine and will have more than 1,000 IT professionals and over 50 vendors. Keynotes include technical evangelists from VMware and Microsoft. This is an excellent event, and they end the day with a huge lobster bake that also includes ribs and chicken! And the live music is cool too.
VTUG is holding a pre-event boot camp the day before Summer Slam in Maine. The focus is on Hyper-V for the VMware expert. It is a hands-on lab and is about six hours long, so it will go in-depth. This is a first-come, first-served opportunity. The boot camp is free, so take advantage of it.
It is interesting how many storage vendors participate at Summer Slam. VTUG is running a program called "Bring Your Storage Peer". And if you do there are four chances to win some cool prizes. Learn more about the "Bring Your Storage Peer" program.
VTUG is branching out its events to other parts of the country with Silicon Valley Slam on August 1. It will take place at the DoubleTree in San Jose. VMware, Microsoft and Amazon will be speaking at this event.
VTUG is now conducting webcast events as well. The first upcoming webcast is Hyper-V for the VMware Expert. Let's face it, Hyper-V is becoming a contender in the virtualization arena. The more you know about it the better it is for you.
All of these events are free to IT end users.
I'll be at all of them - I hope to see you there.
I take my hat off to Gartner for creating their Magic Quadrants! A marketing and industry coup! Companies and organizations actually pay attention to them - both end users and vendors. Gartner's competitors have tried to create alternatives but to no avail. However, as Spider-Man's Uncle Ben once said, "With great power comes great responsibility" - and this superduper analyst firm's analysis lacks the rigor, intellect and insight worthy of its role in the industry.
The new Gartner Storage MQ is based on the following proposition:
"Improvements in scalability, availability, performance and functionality of midrange storage systems have blurred the boundaries between network-attached, midrange and high-end storage systems."
I am not sure what this really means. Does it mean that companies using the EMC VMAX or HDS VSP should now take a look at Dell EqualLogic or NetApp FAS as viable one-for-one alternatives? And should these same customers consider using NFS or CIFS in place of FC or iSCSI since, as the above statement implies, NAS and SAN scalability, performance and functionality are pretty much the same? That is a big statement to make - that NAS and SAN are interchangeable and fundamentally synonymous with one another. Perhaps that is not what Gartner is saying, but it isn't very clear. And what about the reverse use case - should IT professionals seriously consider high-end storage systems to replace midrange NAS or SAN storage solutions? Does the IBM DS8000 now compete with HP LeftHand or Nexsan NST?
Gartner also states that no extra credit is given to vendors that serve all market segments. Okay, so market segments don't matter? Then why are they talking about market segments in their analysis:
Dell Caution: "Lack of presence in the traditional high-end storage market and its relative lack of success in the fast-growing NAS and object storage market segments are limiting its appeal as a storage vendor."
So it would seem that you get no extra credit for serving all market segments but get penalized if you don't ("lack of presence in traditional high-end storage") or if you aren't particularly successful at it ("relative lack of success in the fast-growing NAS and object storage market segments").
Unless you are NetApp - "NetApp is focused on improving FAS and V Series' competitiveness in large mission-critical environments. Early signs are promising, but this is still a work in progress that requires ongoing improvements in marketing, sales, product features and professional services."
It seems that NetApp, in the second-highest position on the MQ, doesn't really need a high-end storage system to earn its top spot, only a solution that is on its way there. It doesn't appear to matter that NetApp has been trying to compete in the high-end storage space for nearly a decade and has failed to do so - and yet it looks like they get points for saying they are almost able to. I am not trying to put down NetApp - they are a great company - but the Gartner analysis is contradictory.
It is also very curious that HP was the visionary leader in storage in 2011 and is in fifth place in 2012. How do you rationalize such an extreme gap within 12 months? We may never know, since Gartner doesn't feel the need to explain itself.
And the whole "niche player" label is simply a misnomer. The vendors in this box aren't focused on a niche but on the broad storage market. This has always been an issue with the MQ, but we all just tralala along and ignore the fact that these vendors aren't niche players!
I could go on and on about the puzzling analysis that Gartner has put forth. But there is a single point in all of this: the Gartner Storage MQ premise is fundamentally flawed, subjective and inconsistent. It attempts to paint with a very broad brush things that require detailed analysis. Subjective analysis can be valid and useful but its arguments and rationale need to be able to stand up to rigorous scrutiny. This does not.
What is needed is an analysis based on the real world focusing on both traditional and emerging storage solutions that is segmented and with more detailed information. It is hard to imagine anyone - especially end user customers - getting any real value out of the Gartner Storage MQ.