Skip to main content
Dave Mason - Mastodon Dave Mason - LinkedIn Dave Mason - Codeberg Dave Mason - Counter Social

Columnstore For Real-Time Operational Analytics

SQL Server Columnstore - Dave Mason

Starting with SQL Server 2016, new columnstore indexing features allow something called real-time operational analytics. Microsoft's documentation describes it as "the ability to run both analytics and OLTP workloads on the same database tables at the same time". Additionally, it can also "eliminate the need for ETL and a data warehouse".

I tend to work mostly with OLTP environments. Many of them have questionable designs or serve reporting workloads. Not surprisingly, there are a lot of performance-sapping table scans and index scans. I've compensated for this somewhat by using row and page compression, which became available on all editions of SQL Server starting with SQL Server 2016 SP1. Could I get even better results with columnstore indexes? Lets look at one example.

Here are four individual query statements from a stored procedure used to get data for a dashboard. If you add up percentages for Estimated Cost (CPU + IO), Estimated CPU Cost, or Estimated IO Cost, you get a total of about 90% (give or take a few percent).

SQL Server Columnstore - Dave Mason

The first two statements include a nonclustered index scan and the second two both have a clustered index scan. Each of the four scans consumes a large portion of the CPU and/or IO. Two nonclustered columnstore indexes were added. Now lets take a look at the before and after images of the query plans.


Statement 1

Index scan and index seek operators are both replaced by columnstore index scans. Note the table insert operation--presumably, its overall cost would be consistent. Yet as a percentage, it goes up significantly, becoming more "expensive" by virtue of the other operations being more efficient and less "expensive".

SQL Server Rowstore - Dave Mason

SQL Server Columnstore - Dave Mason


Statement 2

As in the previous statement, index scan and index seek operators are both replaced by columnstore index scans. We also see the table insert operation increase in cost.

SQL Server Rowstore - Dave Mason

SQL Server Columnstore - Dave Mason


Statements 3 and 4

The final two statements are nearly identical, so I won't repeat them. It's a different table this time, and it has a clustered index seek and a clustered index scan. Both operations are replaced by columnstore index scans.

SQL Server Rowstore - Dave Mason

SQL Server Columnstore - Dave Mason


With the nonclustered columnstore indexes in place, the relative costs of the four individual query statements go down significantly. Summed up, Estimated Cost (CPU + IO) is down to 58%, Estimated CPU Cost is down to 66%, and Estimated IO Cost is down to 55%.

SQL Server Columnstore - Dave Mason


The execution stats look even better. Elapsed time for the stored procedure drops from 15 seconds to 5 seconds. Logical reads also dropped by 80%. And there's even more good news. By keeping a close eye on index usage stats, I noticed some of the other rowstore indexes on the two tables were no longer being used for index scans. Various other executions plans were now using the new nonclustered columnstore index for those scans instead. Many of those plans saw similar performance gains with reduced CPU and IO.

I am still somewhat new to columnstore indexes, and my sample size is admittedly small. But so far, real-time operational analytics appears to be the best of both worlds. Reporting and analytical queries that scan wide swaths of data are working well with columnstore indexes, while existing rowstore indexes provide traditionally efficient seeks. Are you using nonclustered columnstore indexes on OLTP tables too? Please leave a comment! I haven't found anyone else that's doing this yet. Happy indexing, everyone.

Comments

Post comment