Blog

Erik cries when SSMS lies

My SQL Server friendo Mr. Erik C. Darling was recently telling me about some work he did getting batch mode on paging queries. This sounded a bit odd to me because paging queries make me think of small seeks of rows against finely curated nonclustered rowstore indexes, so I asked him for a link to his blog post about it. He grumpily refused and told me to find the link myself, which I eventually did.

High End Machine Performance

Erik’s first attempt used OFFSET/FETCH and resulted in a row mode query:

SQL Server Query Plan

The clustered index scan makes this a bit of a sad paging query. In Erik’s defense, there’s a whole host of real world reasons as to why you wouldn’t be able to make the perfect nonclustered index for your paging query:

  • End users may choose to sort on many different columns and you can’t index them all
  • A key member of your Index Approval Committee is on vacation
  • You already have more indexes on the table than your number of fingers
  • You’re working with third party software which does not allow you to create custom indexes

Getting back to the query, it doesn’t look that offensive to me. The row mode sort is a parallel top N sort and the overall number of rows to return is low, so each thread can independently sort its rows and return 1000 locally sorted rows to the parent operator. This is about as good as it gets with parallel row mode sorting. This is a row mode only query so the operator times that you see are the sum of that operator’s work and its children. In terms of real work done by the query, the scan clocks in at 1.168 seconds and the sort clocks in at 0.84 seconds. The final accounting at the end by the parent Parallelism (Gather Streams) is misleading at best and an outright LIE at worst. There wasn’t 4 seconds of work done by this query. There was only 2 seconds. The red lines illustrate the problem perfectly and I won’t be elaborating further:

Erik’s second attempt uses ROW_NUMBER() and he achieves a plan with some batch mode operators using BMOR (batch mode on row store):

SQL Server Query Plan

The parallel batch mode sort works just fine here in that the single thread output property isn’t an issue. The parent operator is a batch mode window aggregate, but even if it wasn’t, the grandparent is a gather streams operator so the rows would end up on one thread anyway. Actual time statistics accounting works differently for batch mode operators: each batch mode operator only tracks its own work. In terms of real work done by the query, the scan clocks in at 1.1022 seconds and the sort clocks in at 0.892 seconds. This is quite similar to the first attempt. It could be argued that the batch mode sort is more efficient than the row mode top N sort, but I’d call it a wash considering the unpredictable rowstore to batch mode conversion overhead (which does seem to be small for this table).

Low End Machine Performance

I tested on my local machine with 64 GB of RAM which is less than Erik’s laptop. My clustered index scans took significantly longer than his, but as usual, there’s a lot to learn from low end machine performance. Let’s go back to the first reason as to why the table might not be indexed well for this particular query:

End users may choose to sort on many different columns and you can’t index them all

Microsoft presents a standard solution for this scenario: the humble nonclustered columnstore index. This will be great for my low end machine because I’ll be able to fit the new NCCI in memory. For those following along at home on their own low end machines, I created the index on every column except the Body column in a very carefree fashion:

CREATE NONCLUSTERED COLUMNSTORE INDEX ncci ON posts (
  Id
, AcceptedAnswerId
, AnswerCount
, ClosedDate
, CommentCount
, CommunityOwnedDate
, CreationDate
, FavoriteCount
, LastActivityDate
, LastEditDate
, LastEditorDisplayName
, LastEditorUserId
, OwnerUserId
, ParentId
, PostTypeId
, Score
, Tags
, Title
, ViewCount
) WITH (MAXDOP = 1)

I ran the OFFSET/FETCH query on my low end machine with MAXDOP 4 and it only took 430 CPU ms and 130 ms of elapsed time:

That’s a huge improvement compared to the 15-20 second runtime I was experiencing earlier. Interestingly, the second query (the ROW_NUMBER() approach) sticks with the parallel batch mode sort and performs significantly worse in comparison:

The key difference here is the batch mode Top N sort in the first query. Remember that the query compile process for BMOR is different than what you get when compiling in the presence of a glorious columnstore index. You can get the improved batch mode top N sort by also doing a fake join to an empty CCI table. Serious batch mode connoisseurs should be mindful of the compile differences as they seek to gain the greatest performance benefit possible from batch mode.

Final Thoughts

Friends don’t let friends be lied to by SSMS. Thanks for reading!

Lost In Costs: Part 3

Lost In Costs: Part 3


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Lost In Costs: Part 2

Lost In Costs: Part 2


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Lost In Costs: Part 1

Lost In Costs: Part 1


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About Scalar UDFs and Read Committed Snapshot Isolation In SQL Server

A Little About Scalar UDFs and Read Committed Snapshot Isolation In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

SQL Server Performance Office Hours Episode 36

SQL Server Performance Office Hours Episode 36



Questions:

* Back in the days of yore, when you worked for somebody else, did you ever become despondent with your work (or SQL). If so, how did you re-motivate yourself?

* Why have I never heard you suggest ALTER DATABASE CURRENT SET READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT?

* For a table-valued function, which would *generally* be the better return type: `RETURNS @tbldata TABLE(ID INT IDENTITY(1,1), col VARCHAR(MAX))` or `RETURNS @tbldata TABLE(ID INT NOT NULL IDENTITY(1,1), col VARCHAR(MAX), PRIMARY KEY CLUSTERED (ID) WITH (OPTIMIZE_FOR_SEQUENTIAL_KEY=ON))`

* if you had to do work in another database what would it be?

* what dou you think heaven purgatory and hell are?

To ask your questions, head over here.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About Serializable Escalation In SQL Server

A Little About Serializable Escalation In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Why Partitioning Is Not A Performance Feature In SQL Server

Why Partitioning Is Not A Performance Feature In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Lot About Multi-Statement Table Valued Function In SQL Server

A Lot About Multi-Statement Table Valued Function In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

A Little About Optimizer Nudging In SQL Server

A Little About Optimizer Nudging In SQL Server


Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.