Storage Performance: Important Things to Consider
There has been a great deal of focus on high-performance storage. Part of this focus, and even hype, has been around IOPS, with some vendors claiming millions upon millions. At some point, all of the IOPS claims become just useless noise.
While IOPS numbers may provide some value, keep in mind that what you test in the lab may have very little applicability in the real world. And what the vendors often claim will most likely be even further afield from the realities of your environment.
Keep in mind that the number of IOPS you get is based on the block size of your applications. Since block sizes are variable, it is difficult to predict what your real-world IOPS requirements will be. For example, one application can have a 4k block size, another 8k blocks, another 64k blocks, and so on.
You are probably already aware that most of the IOPS numbers that vendors tout are based on small block sizes. IOPS testing with a 4k block size will yield twice the number of IOPS as 8k blocks and 16 times the number of IOPS as 64k blocks, and so on. One vendor that I was aware of that claimed millions of IOPS was using a 1k block size to achieve really high numbers, even though it was not reflective of the real world.
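The inverse relationship between block size and IOPS can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical array that sustains a fixed 1 GiB/s of bandwidth; the bandwidth figure is a placeholder, not a real product spec:

```python
# Sketch: why small blocks inflate IOPS numbers at a fixed bandwidth.
# 1 GiB/s is an assumed, illustrative figure.
BANDWIDTH_BYTES = 2**30  # 1 GiB/s

def iops_at(block_size_bytes: int) -> int:
    """IOPS achievable at the fixed bandwidth for a given block size."""
    return BANDWIDTH_BYTES // block_size_bytes

for kb in (1, 4, 8, 64):
    print(f"{kb:>2}k blocks -> {iops_at(kb * 1024):,} IOPS")
```

At the same bandwidth, 4k blocks produce exactly twice the IOPS of 8k blocks and 16 times the IOPS of 64k blocks, which is why a 1k-block benchmark can make the headline number look enormous.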
I often get the question: is doing a head-to-head IOPS comparison between storage systems using the same workloads valuable for determining which one performs better? And the answer is, of course, not necessarily. Some storage systems may work really well with one workload but act differently when multiple mixed workloads are thrown at them. Contention for resources can greatly impact performance for one or all applications accessing the storage system, and some architectures react better than others in mixed-workload environments. It is also essential to understand that most performance tests are done under optimal conditions. I suggest running a performance test and then failing a drive. How well does the storage system perform when it needs to do a RAID rebuild at the same time it is doing read/write operations? How well does it perform when it is also running a replication process? Remember, the backend can also impact primary I/O performance and, again, some storage systems are better architected than others.
Testing latency can often be a more accurate predictor of what you will experience in real-world environments. IOPS depends on block sizes, which are variable and therefore ultimately unpredictable. Latency, however, is more of a constant and therefore more predictable. Storage latency simply means the amount of time it takes for the storage system to respond to the application. Getting sub-ms response times can reduce query times significantly.
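Latency and IOPS are two views of the same queue: by Little's Law, sustainable IOPS equals the number of outstanding I/Os divided by the per-I/O latency. A quick sketch, with assumed illustrative numbers rather than measurements from any real device:

```python
# Sketch of Little's Law for storage: IOPS = outstanding I/Os / latency.
# The latency and queue-depth values below are assumed for illustration.
def iops_from_latency(latency_ms: float, queue_depth: int = 1) -> float:
    """IOPS sustainable when each I/O completes in latency_ms milliseconds."""
    return queue_depth * 1000.0 / latency_ms

# A sub-millisecond (0.5 ms) device at queue depth 1:
print(iops_from_latency(0.5))      # 2000.0
# A 5 ms device needs 10 outstanding I/Os just to match it:
print(iops_from_latency(5.0, 10))  # 2000.0
```

This is also why latency is the better predictor: a slow device can match a fast one on an IOPS benchmark simply by queueing more I/Os, but every individual request still waits ten times longer.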
Analogy: IOPS, Latency and Throughput
I've created a simple analogy that explains IOPS, Latency and Throughput. You ordered something online and it takes two days for that box to be delivered to you. Inside that box is one large item. You then order something else online and it also takes two days for that box to arrive at your door. It is the same size box but within it are 100 small items. In this analogy the Latency is 2 days, which is the same for both boxes. The IOPS is the number of items that were delivered to you. In the first box it was just one item and in the second box it was 100 items. So the latency was exactly the same but the IOPS were drastically different!
In both cases you got your boxes within the same amount of time but the IOPS varied based on what was being sent to you.
This brings up throughput. Using the same box analogy, throughput is the number of boxes being delivered that fit in the truck. Assuming that all of the boxes are the same size, the truck can carry 200 boxes at full capacity. Therefore your throughput, in this analogy, is 200 boxes. Again, the number of items (IOPS) within those boxes will vary greatly based on the size of those items (block size).
Latency is 2 days, maximum throughput is 200 boxes and IOPS is an unpredictable variable.
Caching and Storage Performance
Caching is also important for many storage systems in order to provide good performance. However, it will depend on how cache-friendly your application workloads are. Industry-wide, the consensus is that you should generally get a 60-70% cache hit ratio, so the chances are you will benefit from caching.
I believe that partitioning is a good feature for caching because it can address the mixed workload contention issue that I discussed above. I also like caches that allow you to pin data into the cache so that you can ensure high performance of a hungry application. Additionally, the larger the cache memory the more likely you will get cache hits because it will keep more data in the cache pool.
Some storage systems support only read cache and typically reads are the majority of your I/Os. However, you will obviously be creating writes as well (otherwise there would be nothing to read) and a write-back cache can be quite beneficial. Write-back caching is a bit more complicated to implement for the vendors and not everyone has this feature so be sure to ask whether their storage systems support it or not.
Storage performance testing with a cache should be done mindfully. If you are getting 100% cache hits during your testing, then it is giving you a skewed view of that storage system's performance; it is improbable that you will get 100% cache hits in the real world. I do think it is important to do performance testing with caching turned on, and not just to the disk drives, since caching will play a role in improving performance. As a general rule of thumb, assume a 60-70% cache hit ratio with about 60-70% of the I/Os being reads. You may have specific types of workloads that fall outside of these parameters, but my recommendations are pretty reflective of the broad market.
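The rule of thumb above can be turned into a simple weighted-average calculation for expected read latency. The cache and backend latency figures here are assumed placeholders, not vendor data:

```python
# Sketch: how the cache hit ratio shapes average read latency.
# The 0.1 ms cache and 5 ms backend latencies are assumed for illustration.
def effective_latency_ms(hit_ratio: float,
                         cache_ms: float = 0.1,
                         backend_ms: float = 5.0) -> float:
    """Weighted average read latency for a given cache hit ratio (0.0-1.0)."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms

# The rule-of-thumb 60-70% hit range, plus the unrealistic 100% lab case:
for hr in (0.6, 0.7, 1.0):
    print(f"{hr:.0%} hits -> {effective_latency_ms(hr):.2f} ms")
```

Running this shows why 100% hits skew a benchmark: the jump from a 70% hit ratio to 100% cuts average latency far more than the jump from 60% to 70%, so an all-cache test bears little resemblance to production behavior.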
The Importance of Storage Performance
Storage performance is not an exact science since there are so many variables. Interestingly, all of the current attention on performance will help to solve many problems for the entire market...eventually.
The goal is to take performance off the table as an issue by creating architectures with enough headroom that the storage system will never be the bottleneck. We will get to a point where you won't have to worry about storage performance testing because storage systems will be such speed demons that they will handle anything you throw at them. Such systems do exist today, but you typically have to pay a premium for them. Performance is important, but what are you willing to pay for it?
In the end, it is the combination of price and performance that will alter the landscape forever, not performance alone. The good news is... this shift is occurring even as we speak.