Insights, Best Practices and Forward Thinking from the Customer Facing Team, Solution Architects and Leaders on Extreme Performance Applications, Infrastructure, Storage and the Real-World Impact Possible

Why is flash good for data warehousing?

by VIOLIN SYSTEMS on April 26, 2013

Flash is about latency and IOPs so why would it be good for Data Warehousing or Business Intelligence?

Excellent question.  Yes, the typical marketing and wow-factor stats around flash are based on latencies and IOPs (Input Output Per Second).  Data warehousing (DW) and Business Intelligence (BI) is normally a throughput game, so what gives?

First, let’s quickly define DW and BI, or at least expound on the associated tasks and issues. The challenges, in general, lie in two areas:

1.            Getting data into the system (ETL’s or importing data)

2.            Getting data out of the system (Reports)

Reports run faster when the reads are large blocks of sequential IO.  But, the engine can only read sequentially if the writes were sequential so there is the rub.  Extra work is required to create sequential writes (during the data loads) and keeping the reads sequential during the reports.

Let’s look at getting the data into the database.

Loading new data, in its simplest form, requires the following steps:

1.            Loading data into a staging table

2.            Making any necessary transformations

3.            Sorting the data (to achieve logical sequentiality)

4.            Loading the data into the final partition (to achieve physical sequentiality)

This last step (final partition load) is recommended to be with only one core due to the fact that many concurrent requests for new extents (database pages) will cause a random placement of the actual data. This is a limitation of spinning disk. With Violin’s memory-like, all-silicon architecture, the logical fragmentation of data is irrelevant to its retrieval speed.  Loading data is transaction log bound (versus the data page writes as logs are synchronous and data pages are asynchronously written) which is bound by the latency of the writes to storage.  So, latency is almost a 1-to-1 performance booster while the memory-like architecture allows for certain steps to be dropped.  It is very common for ETL’s to go from 4 hours to 15 minutes or 20 hours to 4 hours.  And, due to the irrelevance of logical fragmentation, you can update data without worrying about breaking the sequential-ness, you can load many chunks in parallel and you do not have to worry about partition/LUN/locality mapping issues that might affect speed.  It is the latency and memory-like architecture that help here, not IOPs.

Now let’s look at using the data for reporting workloads.  Again, it’s not about IOPs.  Throughput is still the key metric for DW performance.  And you might say that disk is great at throughput.  And you would be right.  If all of your data has been laid down in a sequential pattern and you only read it sequentially then you can get some great speeds with spinning media.  But, modern DW & BI systems usually serve many users.  These many users want many things at the same time which means that the IO access pattern is very likely to be random, not sequential.  Gone are the days of running a handful of reports by start of business and then the reporting system goes idle.  Reporting systems are 24x7, they service adhoc SQL, their queries run with parallel threads, they serve many users, and are hit by systems with many cores.  This does not bode well for allowing a disk to use one cylinder at a time.  Violin’s memory-like, random access, architecture serves any IO, anywhere in the system, in the exact same time.

Have you ever launched a report and observed the CPU at 90% idle while the report chugs for 20 minutes?  When the scheduler for each core wants more data for your report, it will launch an asynchronous IO to storage to retrieve the needed data and then put the task on the ‘waiting’ queue and go idle until the IO returns.  The only way to get the report to go faster is to return the data faster (latency) or add more threads.  Modern databases don’t allow for more parallelization (for each query) than the number of logical cores, so realistically the latency of the data reads is the only real way to make things go faster.  So, again, it’s not so much the IOPs that help data warehouses running on Violin all-flash storage arrays but the latency and memory-like architecture that allow for any number of databases, tables, users, queries, indexes, virtual machines - you name it - to run at the exact same stable, predictable, crazy–fast speed.  With one fell swoop, you’ve migrated your bottleneck back to your CPU, where it should be.  And, you’ll now be using all the money you’re spending on the licenses for all those cores.

Happy reporting.