
Flash in the Data Center - Part 3 - What about PCIe Cards?

by VIOLIN SYSTEMS on September 30, 2011

 

Previously in this series: Part 1 (Garbage Collection) and Part 2 (Commodity SSDs)

What about PCIe cards?

Another option is to pack as much flash as possible onto a PCIe card that sits in a high-speed slot in a server.  Because of their much higher interface speeds, PCIe cards deliver much better performance than your typical commodity SSD, but they face their own unique issues. To communicate with the operating system (OS), PCIe cards require specialized software drivers.  With some cards these drivers are so heavyweight that their vendors no longer call them drivers at all, instead pitching them as a value-added software layer. That might be true, except that those cards gain their performance by stealing CPU cycles from the very cores running the business software you're trying to accelerate.  You may also be paying software license fees (per core), so now you have to work out how many cores are serving the application versus how many are needed just to keep the PCIe card running.  On top of that, you will need many more gigabytes of DRAM to hold the metadata for the PCIe cards.
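
To see how that plays out, here is a rough back-of-the-envelope sketch (in Python, purely for illustration). The core count, per-core license fee and driver overhead below are assumed numbers for the sake of the example, not measurements of any particular card:

    # Rough, illustrative math only: core count, license price and driver
    # overhead are assumptions, not measured figures for any specific product.
    cores = 16                      # cores in the host server (assumed)
    license_per_core = 3000         # per-core software license fee in $ (assumed)
    driver_overhead = 0.15          # fraction of CPU consumed by the card's driver (assumed)

    cores_lost_to_driver = cores * driver_overhead
    cores_left_for_app = cores - cores_lost_to_driver

    # You license all 16 cores, but only ~13.6 of them do application work.
    effective_license_cost = (cores * license_per_core) / cores_left_for_app
    print(f"Cores effectively serving the application: {cores_left_for_app:.1f} of {cores}")
    print(f"Effective license cost per application core: ${effective_license_cost:,.0f} "
          f"vs ${license_per_core:,.0f} nominal")

With those assumed numbers, every core you license effectively costs more because part of the machine is busy feeding the card rather than the application.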

Consider the following:

  • In most cases a single PCIe flash card costs 3 to 4 times as much as the server itself
  • Evaluate the CPU cycles a PCIe card with a heavy driver steals from your server
  • Now look at the full cost of the server and its software licenses
  • Then look at the cost of the PCIe card itself, in perspective

What was the value proposition again?

To be fair, card vendors have worked to shrink their driver footprints, and some have even begun to show smaller write cliffs than the commodity SSDs, but the write cliff is still there.

So do PCIe cards have a place in the data center when there are single points of failure everywhere?  Some of the higher-end cards have started touting spare flash chips that can be swapped in if a failing component is detected, and some even offer on-card RAID in the event of a spontaneous die failure. Such protections are of little use if the single controller managing the flash fails or crashes.

You can run multiple PCIe cards in a server, but you have to choose between performance and reliability, since multiple cards typically run in a RAID-0 configuration for performance. If you want actual RAID data protection, you have to use multiple cards (assuming you have enough PCIe slots) and be willing to pay roughly 50% extra (assuming three RAIDed cards).
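
A quick sketch of where that ~50% comes from; the card count and card price below are illustrative assumptions, not quoted figures:

    # Back-of-the-envelope arithmetic for protecting PCIe flash cards with parity RAID.
    # The card count and price are illustrative assumptions.
    cards = 3                     # PCIe flash cards in a parity RAID set (assumed)
    price_per_card = 10000        # $ per card (assumed)

    usable_cards = cards - 1      # one card's worth of capacity goes to parity
    raw_cost = cards * price_per_card
    cost_per_usable = raw_cost / usable_cards

    overhead = cost_per_usable / price_per_card - 1
    print(f"Cost per usable card of capacity: ${cost_per_usable:,.0f} "
          f"({overhead:.0%} more than an unprotected card)")

With three cards you buy three cards' worth of flash to get two cards' worth of protected capacity, hence the ~50% premium.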

Then there is the fact that a flash PCIe card is locked inside a server: you lose access to that data if the server crashes. So if you actually *need* that data, you don't have much choice but to mirror your servers, paying 100% overhead and taking the performance hit of mirroring. Even worse, you may be paying more than 100% overhead if the card offers an on-board RAID feature you don't need because you're already mirroring for availability.
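
The overhead stacks once you mirror whole servers on top of per-server card RAID; again, the figures below are assumptions used only to show how the multipliers combine:

    # How protection layers multiply: server mirroring on top of RAID across the cards.
    # All numbers are illustrative assumptions.
    servers = 2                    # mirrored pair of servers for availability
    cards_per_server = 3           # cards in a parity RAID set per server (assumed)
    usable_per_server = cards_per_server - 1

    raw_cards_bought = servers * cards_per_server   # 6 cards purchased in total
    usable_capacity = usable_per_server             # the mirror copy adds no usable capacity

    overhead = raw_cards_bought / usable_capacity - 1
    print(f"Bought {raw_cards_bought} cards' worth of flash for {usable_capacity} cards' worth "
          f"of usable capacity: {overhead:.0%} overhead")

Under those assumptions you pay for six cards' worth of flash to present two cards' worth of usable, protected capacity.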

Add to this the inability to service a PCIe card in a running system, and it seems clear that PCIe cards are acceptable as a memory extension technology.  Reliable, cost-effective data storage - probably not.  Properly deployed flash is a strategic data center resource that can't be locked inside a single server - it needs to be shared.

So the answer must be in the storage arrays themselves...  (Part 4)