Why did we take disk drives out of servers? Now we rely on traditional disk array storage and surround it with data center compute. The industry doubles processing power every 12 to 18 months, and Gigabit Ethernet and InfiniBand now provide 10-40Gb/s of bandwidth with very low latencies. So we have plenty of compute and we are swimming in bandwidth, yet the storage array hasn't evolved at even 1/100th of that pace. The lack of IO is killing the applications.
Now, in 2012, it seems clear that solid state storage will be the solution that balances the network-compute-storage triangle: it provides the IO necessary not only to virtualize the easy applications, but to tackle the hard, latency-sensitive, IO-bound applications that have been fine-tuned to run on dedicated servers.
The easiest way for incumbent array vendors to get in the game is to use solid state in the HDD form factor (SSD) and plug it into the traditional storage array. Yes, you can take a Fibre Channel or SAS SSD, place it in an enterprise disk array, and most likely see a performance increase, but nowhere near the datasheet expectations of the SSD. The main issue is latency. The big array has multiple layers of controllers (RAID, dedup, shelf controllers, etc.) to make the "array of unreliable HDDs" appear reliable in the first place. These layers will always reduce the potential offered by the SSD, and in most cases only a few slots in a shelf can be filled with SSDs before the shelf controller is saturated. This obviously creates a strange-looking array if you tried to fill it with all SSDs: 80 percent of the slots would sit empty because the array is maxed out at the limits of the performance it was designed to support. At the same time, the price paid for these SSDs from the array vendor is much higher than the street prices you will find online. It's all about protecting those margins.
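The shelf-controller bottleneck is easy to see with a back-of-the-envelope calculation. The numbers below are assumed, illustrative figures (not any vendor's specs): a shelf controller that sustains about 40,000 IOPS, SSDs that each deliver about 20,000 IOPS, and a 24-slot shelf.

```python
import math

# Illustrative, assumed numbers -- not vendor specs.
shelf_controller_iops = 40_000   # what the shelf controller can sustain
ssd_iops = 20_000                # what a single SAS/FC SSD can deliver
slots_per_shelf = 24

# How many SSDs it takes to saturate the shelf controller
ssds_at_saturation = math.ceil(shelf_controller_iops / ssd_iops)
empty_slots = slots_per_shelf - ssds_at_saturation

print(f"SSDs before the shelf controller saturates: {ssds_at_saturation}")
print(f"Slots left empty: {empty_slots} of {slots_per_shelf} "
      f"({100 * empty_slots / slots_per_shelf:.0f}%)")
```

With these assumed figures the controller saturates after just two SSDs, leaving over 90 percent of the shelf empty, the same ballpark as the 80 percent figure above.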
It is like buying the engine from a race car and putting it in a minivan: sure, it will go faster, but not much faster; the minivan just isn't built to handle it.
Let's look a little deeper at why those SSDs don't really fit in today's storage arrays. First, all of these devices use disk protocols:
- SAS (Serial Attached SCSI)
- Fibre Channel (really Fibre Channel Protocol for SCSI)
- Perhaps SATA (very slow)
Some of these newer interfaces may be faster than the older ones, but they are still disk interfaces: they limit bandwidth and add latency when used with SSDs that are capable of so much more.
Stepping back a bit: if you only put a handful of SSDs in a disk array, they are usually used as a cache or tier. This can be a valid strategy, but one that assumes a very good cache hit rate. In a simple case where a single cache miss adds 10ms of latency, it only takes a 1% miss rate to cut the performance of a single-threaded application in half.
If you take a look at the specs of any enterprise storage array, it will usually have more front side bandwidth (facing the hosts) than it has backside bandwidth (facing the disks). This is a perfectly good choice for these systems since they all have large DRAM caches that serve data to users and stage writes into the disks. The disks are rarely run at full utilization even when under high user load. In fact, I recall being told the difference between a consumer RAID controller and an enterprise array was the consumer device had more bandwidth available to the disks than to the host while the enterprise array had more bandwidth from the cache to the users than it did to the disks. While this was probably a good design for disk systems I will show you that it’s not true for flash, and as a result the legacy disk array simply is the wrong place to put flash storage.
Remember that the reason you were considering putting SSDs in the storage array is that the existing cache isn't performing and the disks are running at full utilization. By implication this means the disks are running out of IOPS, not bandwidth, so there should still be capacity for pre-fetching into the SSDs. Or maybe not? If the access patterns were such that the IOs weren't pre-fetchable in the first place, then you are still going to take a large latency hit at the start of every access while the data is loaded into the SSDs. If you are lucky you will be performing large accesses that amortize some of that cost, but you must be reading multiple megabytes to overcome the pre-fetch cost. And if you are reading multiple megabytes sequentially, then you probably didn't need the SSDs in the first place.
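A quick break-even calculation shows where "multiple megabytes" comes from. The numbers are assumed and purely illustrative: a 10ms pre-fetch penalty to stage data into the SSD, roughly 100 MB/s streaming straight from disk, and roughly 500 MB/s streaming from the SSD once staged.

```python
# Illustrative, assumed numbers:
prefetch_penalty_s = 0.010  # one 10 ms disk access to stage data into the SSD
disk_mb_per_s = 100.0       # streaming the data straight from disk
ssd_mb_per_s = 500.0        # streaming from the SSD tier once staged

# Break-even transfer size N solves: penalty + N/ssd == N/disk
break_even_mb = prefetch_penalty_s / (1 / disk_mb_per_s - 1 / ssd_mb_per_s)
print(f"Break-even access size: {break_even_mb:.2f} MB")
```

Under these assumptions the access has to exceed roughly a megabyte before staging through the SSD beats reading straight from disk, and any real-world per-IO overhead only pushes that break-even point higher.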
It is also likely that your SSD, like some PCIe cards, has its own internal RAID, because it is possible for entire blocks to fail and, on rare occasion, even entire NAND die. Despite all the fancy feature names and white papers predicting ominous consequences if you don't use a specific vendor's drives with their Super Extra Special Enterprise Strength Enhanced Storage Enabled Safety Engine, aka (SE)^5 (tm), recovering from failed flash blocks is little different from remapping failing disk sectors, which disk drives have been doing for decades. At Violin, we have a whole host of mechanisms designed to ensure every piece of data you give us is protected as one would expect from a piece of enterprise storage equipment.
But back to the beginning: the SSD has no idea whether it is being used in a RAIDed array or not, so it must implement its own internal RAID to recover from such failures and provide the data protection required for enterprise storage. So, like the PCIe card, when you use an SSD in a RAIDed storage array you are paying twice for RAID.
Another issue is that disks have only a limited amount of remapping space, and most array controllers are designed to handle a head crash, where all the bad sectors are in the same place. They aren't designed to deal with a drive where thousands, or even millions, of bad sectors can suddenly appear at random. Unlike a head crash, when a flash block, or worse a whole flash die, fails, it will contain sectors with logical block addresses spread over the whole address range.
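A toy model makes the scattering obvious. Assume (purely for illustration) that logical blocks are striped round-robin across 16 NAND dies; when one die fails, every 16th LBA goes bad, spread across the entire address range rather than clustered the way a head crash clusters them.

```python
# Toy model (assumed layout): LBAs striped round-robin across NAND dies.
num_dies = 16
total_lbas = 1_000_000
failed_die = 5  # arbitrary die for the example

# Every LBA that landed on the failed die is now a bad sector.
failed_lbas = [lba for lba in range(total_lbas) if lba % num_dies == failed_die]

print(f"Bad sectors: {len(failed_lbas)}")
print(f"First few:   {failed_lbas[:4]}")            # scattered, not contiguous
print(f"Span:        LBA {failed_lbas[0]} .. {failed_lbas[-1]}")
```

One die failure in this model produces tens of thousands of bad sectors spanning the whole logical address space, which is exactly the failure pattern a head-crash-oriented remapping design never anticipated.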
As the PCIe card is to the server, the SSD is really just an expensive way to try to boost the cache performance of an enterprise disk array, giving the incumbents an incremental introduction to the capabilities of solid state while maintaining their existing profit margins and data center footprint.
If one were to take the next logical step, the flash Memory Array (Violin 3000 & 6000) is a "ground up" design that aggregates flash in the most economical and reliable way, providing a path to low-latency "memory speed" performance in a footprint measured in 3RU (rack units) rather than multiple racks of short-stroked hard disks.
Putting SSDs in legacy arrays is just like putting a race car engine in a minivan: the only bang you are likely to get for your buck is the sound of the under-built transmission exploding the first time you step on the gas.