Insights, Best Practices and Forward Thinking from the Customer Facing Team, Solution Architects and Leaders on Extreme Performance Applications, Infrastructure, Storage and the Real-World Impact Possible

NAND: A Brief Background

by VIOLIN SYSTEMS on October 7, 2013

NAND flash (solid state) storage is the new rage in IT. As only a new technology can, it is quickly becoming the focal point of new solution architectures. Where once we had to manage hundreds or thousands of disk drives, dozens of LUN groups, many shelves, RAID types, unit allocation, hot spots, complex software and tiering, we can now place all of our data into one, or a small few, all-flash arrays and receive amazing speed with little to no tuning or advanced planning. Even for systems with a moderate I/O workload this new technology can be cheaper once software elimination, power reduction and administration time are factored in. But not all flash storage solutions are the same.

NAND flash is a somewhat new technology and it brings with it a new set of terminology, a new set of pros and most certainly a new set of cons.  It is the new gold rush and new flash-based storage vendors are popping up by the dozens.

Reads and writes to flash are incredibly fast due to parallelization. Each data block written to a flash device is broken down to the bit level and striped across many flash chips. The more chips and controllers that handle this load, the more truly parallel the I/O and therefore the lower its overall latency. This places a huge emphasis on both the amount of NAND flash in a device and the overall controller logic.
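The effect of parallelism on latency can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the round-robin layout, the chunk size and the per-chunk time are hypothetical and do not describe any particular vendor's controller.

```python
import math


def stripe(data: bytes, num_chips: int, chunk_size: int) -> list[list[bytes]]:
    """Assign fixed-size chunks of `data` to chips in round-robin order."""
    lanes: list[list[bytes]] = [[] for _ in range(num_chips)]
    for offset in range(0, len(data), chunk_size):
        chip = (offset // chunk_size) % num_chips
        lanes[chip].append(data[offset:offset + chunk_size])
    return lanes


def estimated_latency_us(data_len: int, num_chips: int,
                         chunk_size: int, chunk_time_us: float) -> float:
    """Idealized model: chips work in parallel, so latency is the number of
    sequential rounds the busiest chip performs times the per-chunk time."""
    chunks = math.ceil(data_len / chunk_size)
    rounds = math.ceil(chunks / num_chips)
    return rounds * chunk_time_us


if __name__ == "__main__":
    # Assumed figures: a 4 KiB write, 512-byte chunks, 100 microseconds per chunk.
    for chips in (1, 3, 8):
        print(chips, "chips:", estimated_latency_us(4096, chips, 512, 100.0), "us")
```

In this simplified model, latency falls as chips are added until every chunk has its own chip, which is why both the chip count and the controller's ability to keep the chips busy matter.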

NAND flash, as a technology, is incapable of in-place updates or deletes. Any update is in fact written to a wholly new location and the old location is marked as garbage. To free up this garbage space, the whole flash block containing it must be erased (electrically drained of its stored charge). This operation is up to 30x slower than a read or write. The controller logic surrounding garbage collection is one of the top differentiators between flash vendors and one of the reasons why performance can slow down over time. Any time a write must wait on a garbage collection process it is called Emergency Garbage Collection, and the resulting drop in performance is known as the “Write Cliff.”
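The out-of-place update and block-level erase behavior can be sketched with a toy model. This is a minimal illustration under stated assumptions, not any vendor's controller logic: the class name `ToyFlash`, the greedy victim choice and the in-memory relocation of surviving pages are all hypothetical simplifications.

```python
GARBAGE = object()  # sentinel marking a page whose contents are stale


class ToyFlash:
    """Toy model: updates go to a new page; space is reclaimed a block at a time."""

    def __init__(self, num_blocks: int, pages_per_block: int):
        self.pages_per_block = pages_per_block
        # A page is None (erased), GARBAGE (stale) or a (logical_page, data) tuple.
        self.blocks = [[None] * pages_per_block for _ in range(num_blocks)]
        self.mapping = {}                      # logical page -> (block, page)
        self.erase_count = [0] * num_blocks
        self.stalled_writes = 0                # writes that had to wait for an erase

    def _find_free(self):
        for b, block in enumerate(self.blocks):
            for p, page in enumerate(block):
                if page is None:
                    return b, p
        return None

    def garbage_collect(self) -> bool:
        """Erase the block holding the most garbage, keeping its valid pages."""
        victim = max(range(len(self.blocks)),
                     key=lambda b: self.blocks[b].count(GARBAGE))
        if self.blocks[victim].count(GARBAGE) == 0:
            return False                       # nothing to reclaim
        survivors = [pg for pg in self.blocks[victim]
                     if pg is not None and pg is not GARBAGE]
        self.blocks[victim] = [None] * self.pages_per_block  # whole-block erase
        self.erase_count[victim] += 1
        for p, (lpn, data) in enumerate(survivors):
            self.blocks[victim][p] = (lpn, data)
            self.mapping[lpn] = (victim, p)
        return True

    def write(self, lpn: int, data: str) -> None:
        old = self.mapping.pop(lpn, None)
        if old is not None:                    # an update: the old copy becomes garbage
            self.blocks[old[0]][old[1]] = GARBAGE
        slot = self._find_free()
        if slot is None:                       # the "emergency" case: write waits on erase
            self.stalled_writes += 1
            if not self.garbage_collect():
                raise RuntimeError("device full: no garbage to reclaim")
            slot = self._find_free()
        self.blocks[slot[0]][slot[1]] = (lpn, data)
        self.mapping[lpn] = slot

    def read(self, lpn: int) -> str:
        b, p = self.mapping[lpn]
        return self.blocks[b][p][1]
```

A write that finds no erased page increments `stalled_writes`, which is the situation the text calls Emergency Garbage Collection. A real controller would try to erase blocks in the background so that this path is rarely taken.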

To avoid Emergency Garbage Collection, each flash vendor pre-allocates a certain amount of flash, to which all writes (either new data or forwarded updates) are directed when the array is first deployed. This reserved flash is hidden from the applications and absorbs new writes while other flash is in garbage collection mode. The amount of pre-allocated flash, along with the speed of the garbage collection process, determines how fast and for how long a workload can write before the device throttles down. Once again, the amount of flash deployed and the controller logic are key to the overall speed and I/O profile. Only flash managed at a single awareness level counts here: multiple SSDs in a single chassis are unaware of each other, so their spare capacity cannot be pooled.
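The relationship between reserved flash, write rate and garbage collection speed can be put into a rough back-of-the-envelope calculation. The function below is a deliberately simplified sketch: it assumes constant rates and a single pool of spare capacity, and its name, parameters and example figures are hypothetical.

```python
def seconds_until_write_cliff(spare_gb: float,
                              write_gb_per_s: float,
                              reclaim_gb_per_s: float) -> float:
    """Estimate how long sustained writes can continue before the reserved
    flash is exhausted, assuming constant write and reclaim rates."""
    net_fill_rate = write_gb_per_s - reclaim_gb_per_s
    if net_fill_rate <= 0:
        # Garbage collection keeps up with the workload: no cliff in this model.
        return float("inf")
    return spare_gb / net_fill_rate


if __name__ == "__main__":
    # Assumed figures: 200 GB reserved, 2.0 GB/s written, 1.5 GB/s reclaimed.
    print(seconds_until_write_cliff(200, 2.0, 1.5))
```

Under these assumptions, either a larger reserve or a faster garbage collection process pushes the cliff further out, which matches the two factors named above.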

Any time a write occurs, the operation itself can disturb geometrically adjacent cells, so error correction must be run not only on the current I/O but also on any other possibly affected bits of data. This makes spreading the data out very important, and it allows for additional acceleration if these processes run over larger blocks of awareness and use custom versions of data protection (normal RAID 5, for example, does not account for the geometric locality of bits). Therefore the more time a vendor has spent developing custom algorithms, the more likely it is to have arrived at a faster and more stable implementation.

And finally, wear leveling is a huge factor in the life span and performance of NAND flash storage devices. There is a limit to the number of times each cell can be written to and erased, and poor algorithms for data distribution, error correction or wear leveling can lead to quick deterioration of usable space or an inferior speed profile.
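The idea behind wear leveling can be shown with a small simulation. This is only a sketch of one simple policy (always reuse the least-erased block) compared against a naive policy that reuses the same block every time; the function name and parameters are hypothetical, and real controllers use more elaborate schemes.

```python
def simulate_erases(num_blocks: int, cycles: int, leveled: bool) -> list[int]:
    """Return per-block erase counts after `cycles` erase operations."""
    erase_counts = [0] * num_blocks
    for _ in range(cycles):
        if leveled:
            # Wear-leveled: pick the block that has been erased the fewest times.
            block = min(range(num_blocks), key=lambda b: erase_counts[b])
        else:
            # Naive: always reuse the first block.
            block = 0
        erase_counts[block] += 1
    return erase_counts


if __name__ == "__main__":
    print("naive:  ", simulate_erases(4, 100, leveled=False))
    print("leveled:", simulate_erases(4, 100, leveled=True))
```

With the naive policy one block absorbs every erase and reaches its endurance limit first, while the leveled policy spreads the same work evenly, so no single block wears out ahead of the rest.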

All of this wraps up into a brief summary: the amount of flash at each awareness level and the logic used to perform these tasks are among the largest differentiators between flash deployments.