
Garbage Collection & XtremIO – Fiction & Fiction: Part II

by VIOLIN SYSTEMS on December 9, 2013

In Part I of this series, I told you to take note of this slide and that we would revisit it to show just how misleading the XtremIO launch really was.

[Image: garbage_collection]

Now, before I get into what this says about XtremIO, let me first explore what it does not say about “Startup,” who, from context, appears to be Pure Storage. Pure Storage is my competitor and I am not going to tell you I think their product is even remotely as good as ours (I’m sure they would say the same thing), but that doesn't mean that I will let EMC make itself look better than Pure when I would argue that it is the other way around.

Let’s examine the numbers EMC presents and separate fact from fiction.

The User Writes Per Day figure of 45K appears to come from a 70/30 read/write mix at 150K IOPS.

Leaving aside that 150K mixed IOPS is the MAXIMUM mixed spec sheet performance of an XtremIO brick, using these numbers implies running the array at 100% load 24/7/365, and we all know that doesn’t happen. That alone probably adds a factor of at least 2-4X of additional endurance, which would take the “Startup” product to 4.6 to 9.2 years of life... and so really I could stop right here.
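To make the arithmetic explicit, here is a quick sketch of the slide’s write-rate math as I read it (the ~2.3-year starting point is simply back-calculated from the 4.6 to 9.2 year range above, and the 2-4X duty-cycle credit is my rough assumption, not a number from the slide):

```python
# Back-of-the-envelope check of the slide's write-rate assumptions.
total_iops = 150_000          # XtremIO brick spec-sheet maximum, mixed workload
write_fraction = 0.30         # 70/30 read/write mix
write_iops = total_iops * write_fraction
print(write_iops)             # 45,000 -> the "45K" user-write figure on the slide

# The slide assumes that peak rate is sustained 24/7/365. A realistic duty
# cycle adds endurance; 2-4X is my rough assumption for a typical array.
slide_endurance_years = 2.3   # implied baseline (4.6 / 2 from the range above)
for duty_cycle_credit in (2, 4):
    print(slide_endurance_years * duty_cycle_credit)   # 4.6 and 9.2 years
```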

But there is so much more to discuss. Why stop when it just started getting interesting?

Why do they only credit “Startup” with a write reduction from dedupe at the end of the slide rather than up front?

Hard to say. At first, I thought it was so you wouldn’t notice that they gave themselves credit for 5:1 dedupe on 100% of writes (not all blocks, but all writes, and I’ll explain the difference at the end for those who care) but only gave “Startup” credit for the 5:1 on 75% of all writes. That would have hidden the fact that they slipped in an extra factor of 2X. But they did the math as if it were 5:1 on 100% of the writes, so it looks like they just did it to make themselves look artificially better.
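For those keeping score, that extra factor of 2X falls straight out of the slide’s own 5:1 ratio; a quick sketch:

```python
# 5:1 dedupe credit applied two different ways.
dedupe_ratio = 5.0

# Credit on 100% of writes (how XtremIO was scored):
flash_writes_full = 1.0 / dedupe_ratio             # 0.20 of user writes reach flash

# Credit on only 75% of writes (how the slide labels "Startup"):
flash_writes_partial = 0.75 / dedupe_ratio + 0.25  # 0.40 of user writes reach flash

print(flash_writes_partial / flash_writes_full)    # 2.0 -> the slipped-in 2X
```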

Next, let’s look at the Data Protection Overhead of 2X.

Well, that’s just wrong. After criticizing “Startup” for using a logging system (a criticism I totally agree with) which does full stripe writes (which I don’t criticize them for), they then try to charge them a write overhead that would only be appropriate if they were running RAID-1, except that they aren’t! They are running 24+2 protection just like XtremIO, and *unlike* XtremIO, “Startup” does not do read-modify-writes. For XtremIO, the read-modify-write not only increases the write load but adds the risk of RAID inconsistency. So really, the Data Protection overhead for “Startup” should be 1.1X, which would take their endurance to 4.5 years.
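As a sanity check on that 1.1X figure, here is my arithmetic, assuming full-stripe writes over 24 data strips plus 2 parity strips:

```python
# Write overhead per user byte under the two protection schemes.
raid1_overhead = 2.0                      # mirroring: what the slide charged "Startup"
data_strips, parity_strips = 24, 2
full_stripe_overhead = (data_strips + parity_strips) / data_strips
print(round(full_stripe_overhead, 2))     # ~1.08 -> call it 1.1X for a full-stripe log design
print(round(raid1_overhead / full_stripe_overhead, 2))   # ~1.85X endurance the 2X charge took away
```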

Again, we could stop here, but we are on a roll, so let’s keep rolling.

Next, we come to Garbage Collection Write Overhead.

There are two ways we can interpret this: either as exactly what it says it is, “Garbage Collection Write Overhead,” in which case the XtremIO number is not N/A (as they now freely admit), or with an implied “System Level” in front of it, which would make the XtremIO number correct but would make a different number in the “Startup” column inaccurate.

You see, when an SSD vendor publishes the “Drive Writes Per Day” rating of an SSD, that rating counts USER writes, but the actual wear on the flash comes from the USER writes PLUS the drive’s internal Garbage Collection writes. One of the reasons that “Startup” uses that logging file system is to avoid random writes to the SSDs and to reduce or eliminate the garbage collection performed by the SSDs themselves. So if EMC wants to keep the N/A in its column, then it has to triple the Drive Writes Per Day rating in the “Startup” column, and that would take the endurance to ~7 years.
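In rough numbers (the tripling is the point of the paragraph above; the ~2.3-year baseline is again back-calculated from the slide, not quoted from it):

```python
# DWPD ratings count USER writes, but the wear budget behind them also covers
# the drive's own garbage-collection writes. If a log-structured array never
# triggers that internal GC, the drive can absorb roughly that much more user data.
slide_endurance_years = 2.3      # implied "Startup" baseline from the slide
internal_gc_credit = 3.0         # the factor implied by "triple the Drive Writes Per Day rating"
print(slide_endurance_years * internal_gc_credit)   # ~6.9 -> the "~7 years" above
```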

So why would I go and make my “Startup” competitor look good? Well, I didn't. I just made them not look artificially bad. But now let us take what we have learned from this exercise and make EMC look realistically bad.

cMLC vs eMLC

EMC just demonstrated that they don’t need the eMLC they are certainly charging you more for in order to meet their endurance needs. Are you planning on keeping your XtremIO array for 144 years? Are you even going to keep it for 7 years, which is the endurance life it would have with cMLC drives? Are you going to keep it for 14 years, which is the endurance it would have with cMLC drives as soon as they get around to the release that doubles the size of the SSDs they use?
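The scaling behind those numbers is straightforward: for a fixed user write rate, endurance in years is proportional to drive capacity times the rated write cycles. A small sketch using the figures above:

```python
# Endurance scales linearly with flash capacity and with rated write cycles,
# for a fixed user write rate.
def endurance_years(baseline_years, capacity_factor=1.0, cycle_rating_factor=1.0):
    return baseline_years * capacity_factor * cycle_rating_factor

cmlc_years = 7                                           # cMLC endurance figure above
print(endurance_years(cmlc_years, capacity_factor=2))    # 14 years once the SSD sizes double
print(round(144 / cmlc_years))                           # ~21x -- the cycle-rating premium implied
                                                         # by the 144-year eMLC figure
```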

So, if endurance is not the reason for using eMLC, then what is? Perhaps it is the advanced garbage collection in an enterprise MLC SSD. You know, that thing they said they didn’t need to do? Perhaps they need eMLC drives for their garbage collection prowess to keep up with the high, sustained write load they deliver. You know, 9,000 writes per second across the array, or a whopping 375 sustained writes per second per SSD, or 1.5MB/s per SSD. Yes, that’s MB not GB: 1.5MB/s per SSD.
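Those per-SSD numbers fall straight out of the figures already on the table, assuming 4KB writes spread across the 24 data SSDs of a 24+2 stripe (my assumption; it is the only way I can make the numbers line up):

```python
# Sustained back-end write load per SSD, from the slide's own numbers.
user_write_iops = 45_000                          # 30% of 150K IOPS
dedupe_ratio = 5.0
array_writes_per_sec = user_write_iops / dedupe_ratio   # 9,000 writes/s to flash
data_ssds = 24                                    # data drives in a 24+2 stripe (assumption)
writes_per_ssd = array_writes_per_sec / data_ssds       # 375 writes/s per SSD
block_size = 4 * 1024                             # 4KB writes (assumption)
print(writes_per_ssd * block_size / 1e6)          # ~1.5 MB/s per SSD
```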

Remember all that talk about how bad system-level garbage collection was because it would suck away the performance? Remember how proud they were of the HALF A TERABYTE of DRAM in the two controllers so they didn’t have to read or write metadata to the SSDs (another claim they have altered, which I will get to in yet another post)?

I mean, those sound like great advantages, if only they made XtremIO faster than “Startup.” But they don’t! Apparently they are using so little of their back-end I/O that one wonders why they would accept the expense and limited selection of eMLC SSDs instead of using far more cost-effective cMLC SSDs.

Perhaps it is because eMLC SSDs are always “consistent and predictable”; at least, that seems to be the favorite phrase they use. As we will see, the only thing “consistent and predictable” is their use of said phrase: "With XtremIO, performance is always consistent and predictable because garbage collection is handled in a very novel way, only possible with XtremIO’s unique architecture."

EMC and the "dirty" issue of garbage collection 

So let’s discuss the issue of garbage collection – and how XtremIO is the only all-flash array that requires no system-level garbage collection, yet maintains consistent and predictable performance:

  • "Over tens of thousands of hours of rigorous testing our SSDs rarely “hiccup” (and if it does happen, our dual-stage metadata engine and XDP handles it efficiently) providing an XtremIO user the industry’s most consistent performance over years of heavy use."

[As a side note: if you count the time by adding up the hours on each SSD tested, then 2 X-Bricks tested for a month comes to over 30,000 hours, or “over tens of thousands of hours.” Somehow a month of testing doesn’t sound very rigorous.]

  • “The answer is that modern SSDs are a necessary, but insufficient technology to enabling consistent and predictable performance.”
  • “And it makes XtremIO arrays the most consistent and predictable performers on the market."
  • Their competitive sales sheet for Violin says: “Violin’s performance in real-world scenarios with mixed reads and writes and various I/O sizes is unpredictable and inconsistent due to Violin’s flawed garbage collection” and “XtremIO’s performance is stable over the widest possible range of operating conditions.”
  • Their blog appears to back this up with the graph below and the claim that: "If you visualize how this looks when testing array performance, XtremIO has no wide swerving “S” curve like below, where IOPS suddenly drop and latency suddenly increases:"

[Image: blog_2_image002]

So is the performance of XtremIO “consistent” AND “predictable”?

Is it even “consistent” OR “predictable”? 

Is it “the only all-flash array that requires no system-level garbage collection yet maintains consistent and predictable performance”?

Does it provide “the industry’s most consistent performance”?

Does it never happen that “IOPS suddenly drop and latency suddenly increases”?

Is their mixed R/W performance “stable”?

Stay tuned for answers in Part III.