I have gotten quite a few questions lately about the types of flash we do and do not use here at Violin: SLC and MLC as opposed to Enterprise MLC (eMLC). There are a lot of vendors out there trying to convince people that they need eMLC parts or Enterprise MLC SSDs, and that regular MLC just can't cut it in the data center.
In order to explain why we at Violin Systems can use commodity MLC in the data center, we need to explore the differences between regular MLC and eMLC so that we can understand what it means to call a flash part or an SSD "Enterprise." We will also look at what people think are the major differences between "consumer/commodity" flash usage and "Enterprise" usage that would make them need eMLC.
In the next series of blog posts, I will compare hypothetical MLC and eMLC parts, and commodity and enterprise SSDs, to show how the “e” version gets some advantage over a commodity-grade part, and how a Violin flash Memory Array can get the same advantage while using commodity parts. I will also show how we can get certain advantages that eMLC and E-SSDs simply can’t achieve.
There are a number of properties that one can use to describe flash, or flash storage. Some of these properties are fundamental properties of the flash itself, some are properties of the controller, or of the hierarchy of controllers in an array, but most are blended properties of some or all of them.
One of the most important things to understand about eMLC flash is that there are many different kinds of "enterprise" MLC flash; however, in most cases there is little or no difference in the actual flash cell of an MLC part and an eMLC part. The difference between an MLC die and an eMLC die may be in how many flash cells are on the die, or exactly how the internal controller operates the die. The difference between a regular MLC SSD and an Enterprise MLC SSD may be that the Enterprise MLC SSD uses eMLC parts, or it may be that the controller in the Enterprise SSD uses regular MLC but in such a way as to extract more life from the regular MLC parts. From here on, for compactness I will use eMLC to mean eMLC parts or SSDs/PCIe cards, etc., using “enterprise” controllers.
In order to understand the difference between commodity MLC and enterprise MLC, it is necessary to understand what happens when flash is used up to the limits of its datasheet. Let’s use your washing machine as an example. It likely has a one-year warranty, but that does not mean it is almost certain to fail after 366 days. Applying that analogy to MLC, when a block of MLC flash with a datasheet life of 3K program/erase (P/E) cycles reaches 3,001 P/E cycles, the only difference that you as a user of the flash might notice is that it will have gotten a little faster.
Yes, I said faster. One of the interesting properties of flash is that as it "wears out," the process of programming the flash happens faster than it did when it was brand new. This is because the physical process of flash "wearing out" is a result of electrons getting stuck in the walls of the cells, causing the cells to become slightly, permanently programmed, so it takes less time to fully program them. Erasing the flash, on the other hand, takes longer, while read times are largely unaffected. Violin flash Memory Arrays hide the effect of writes and erases. This means the older your flash gets, the faster it is, as opposed to other systems where the older it gets, the larger latency spikes get.
Flash doesn't magically fall over and stop working at the end of its datasheet "lifetime." The datasheet lifetime isn't about when the part stops working; it is about what the flash manufacturer warrants the part for, based on JEDEC specifications for data retention under a certain testing profile at certain given temperatures, as well as the maximum number of blocks that will have gone "bad," for certain very specific definitions of the word "bad."
What is the practical upshot of all this? It is that the behavior of flash is actually a function of, and a trade-off between, the effective density, performance, data retention and endurance of the flash part. If the trade-offs and behaviors that matter to you are different from those in the part spec, you may be able to get performance and behaviors from the part that are also different from the part spec.
These posts need a few caveats and explanations up front:
- I am not saying that there are no use cases in the data center or elsewhere that need eMLC *parts* or Enterprise SSDs, simply that we at Violin don’t need to use eMLC parts.
- Violin has worked under NDA at one time or another with basically every flash vendor on the planet, so I am not going to use actual flash datasheet numbers for my examples. All the hypothetical flash parts in these examples really are just that: hypothetical. The exact numbers shown and the mechanisms described are "accurate" but not "actual", or as they say in the movie biz: All flash parts appearing in this blog post are fictitious. Any resemblance to parts, real, or vapor, on any past or future roadmap, currently in production or EOLed is purely coincidental.
- I am not telling you anything you could not find out if you read lots and lots of publicly available datasheets, research papers and conference presentations, etc. I am just pulling it all together in one place and showing how it fits together.
- The key takeaway from these posts should be that, because we build the whole system starting from individual flash parts and put them under multiple levels of different controllers, we can achieve results with commodity MLC that are equal to or better than what others can only achieve with eMLC.
- There is no implication that Violin is currently using or will use all of the methods I am going to describe in these posts. Nor am I going to give a complete list of all the ways that regular MLC can be made to work in a data center. We do not need to use all the methods I will discuss to achieve our goals today, and in the future if we need to do more we might use methods other than what I am going to describe here.
Let’s consider several pairs of hypothetical flash datasheets, and I will explain all the different ways that they might come about. For the purposes of this discussion, I will consider both MLC and eMLC parts to be offered by the same manufacturer, from the same process technology. That is, the actual flash die is basically the same size and composition for both parts.
Example 1: “The parts look the same but aren’t”
|Parameter|MLC|eMLC|
|---|---|---|
|Retention at Rated P/E Cycles|1 year|1 year|
|Required ECC|30 bits (per 1K)|30 bits (per 1K)|
|Program Time (avg)|1.5 ms|1.5 ms|
|Erase Time (avg)|5 ms|5 ms|
|Minimum Blocks per Die|4096|4096|
This pair of datasheets differs only in the rated number of program/erase cycles. There are a couple of ways for this to occur. One is that the MLC and eMLC parts are actually different parts. The fabs are always experimenting with different layouts and designs each time they switch to a new process technology. Because of the nature of chip fabs, where wafers come in fixed sizes and there are always some number of defects on the wafer, the bigger the die is:
- The fewer dies will fit on the wafer
- The greater the chance of there being a defect somewhere on a given die
So the smaller the dies are, the more working dies the fab will get from the wafer. The fraction of the dies on a wafer that work is called the yield: http://upload.wikimedia.org/wikipedia/commons/0/03/Wafer_die%27s_yield_model_%2810-20-40mm%29_-_Version_2_-_EN.png
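To make the die-size effect concrete, here is a minimal sketch in Python. It rests on two textbook approximations that are assumptions of this example rather than anything taken from a fab: a gross dies-per-wafer estimate with an edge-loss term, and a simple Poisson defect model in which yield falls off exponentially with die area. The function names, the 300 mm wafer, the die areas and the defect density are all hypothetical.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Approximate die count: wafer area / die area, minus an edge-loss term."""
    d = wafer_diameter_mm
    whole_area_term = math.pi * (d / 2) ** 2 / die_area_mm2
    edge_loss_term = math.pi * d / math.sqrt(2 * die_area_mm2)
    return int(whole_area_term - edge_loss_term)


def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Poisson model: probability that a die of this area has zero defects."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


def good_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float,
                        defects_per_mm2: float) -> float:
    """Expected number of working dies per wafer under the two models above."""
    gross = gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2)
    return gross * poisson_yield(die_area_mm2, defects_per_mm2)


if __name__ == "__main__":
    WAFER_MM = 300.0         # hypothetical wafer diameter
    DEFECT_DENSITY = 0.001   # hypothetical defects per mm^2
    for area in (100.0, 110.0, 150.0):  # hypothetical die areas in mm^2
        gross = gross_dies_per_wafer(WAFER_MM, area)
        good = good_dies_per_wafer(WAFER_MM, area, DEFECT_DENSITY)
        print(f"die {area:6.1f} mm^2: gross {gross:4d}, expected good {good:7.1f}")
```

Under these assumptions a larger die loses twice: fewer candidates fit on the wafer, and each candidate is more likely to contain a defect. Real fabs use more elaborate yield models, so treat the numbers as illustrative only.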
However, for flash, the smaller the die and the features, the less reliable the part will be, because smaller features mean:
- Fewer electrons are stored in the cells
- The cells are closer together, so there is more cell-to-cell disturbance
- The less space there is between the high voltage amps inside the die and the cell arrays, the more disturbance there may be on some parts of the cell array.
So the fabs have to balance quality with yield because yield=cost to the fab and, thereby, price to you. So while a fab may sell two flash parts that are referred to as X nm parts, that just means that the smallest feature is X nm. In the eMLC part, it might be that the flash cells are X nm by 1.5X nm in size and in the MLC part, they are X nm by 1.45X nm. Or the eMLC part might have a 50X nm gap between the amps and the cell array and the MLC part only has a 20X nm gap.
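As a quick piece of arithmetic on the hypothetical cell sizes above, the following Python sketch compares the two cell footprints in units of the minimum feature size X. The helper name is made up for this example, and the calculation looks only at the cell itself, not at the amps, gaps and other circuitry that also take up die area.

```python
def cell_area_in_x2(width_in_x: float, length_in_x: float) -> float:
    """Cell footprint in units of X^2, where X is the minimum feature size."""
    return width_in_x * length_in_x


# Hypothetical cell dimensions from the text: X by 1.5X (eMLC), X by 1.45X (MLC).
emlc_cell = cell_area_in_x2(1.0, 1.5)
mlc_cell = cell_area_in_x2(1.0, 1.45)

# Fraction by which the MLC cell is smaller than the eMLC cell.
shrink = 1.0 - mlc_cell / emlc_cell

if __name__ == "__main__":
    print(f"MLC cell is {shrink:.1%} smaller than the eMLC cell")
```

A difference of a few percent per cell sounds small, but it is multiplied across every cell in the array, which is why the fab cares about it.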
There is a lot on a flash die besides just the cell array: http://www.toshiba.com/taec/news/press_releases/2009/images/32nm_3bit.jpg
Small changes to all those other areas of the die can have a big effect both on how good the part is and on how big the part is, and it only takes a small change in size to have a big change in how many parts fit on the wafer.
So sometimes the only way to get better flash is to make the dies bigger, and that makes them more expensive. But as we will see in the next installment, making the flash bigger is not the only way to make flash better.