CHECKDB Bug With SQL Server 2017 And ColumnStore Indexes

CHECKDB Bug With SQL Server 2017 And ColumnStore Indexes


Thanks for watching!

Video Summary

In this video, I delve into an interesting issue that still seems to be present in SQL Server up to version 2017 CU14, specifically when running DBCC CheckDB with the “physical only” option on databases containing columnstore indexes. While it’s not a catastrophic bug, encountering these errors can certainly be perplexing and time-consuming. I meticulously document my findings by rebuilding both the table and index, ensuring that all other potential issues are ruled out. The video walks you through how running DBCC CheckDB with physical only reveals numerous red text errors, which are confined to just one table within the database. Despite these error messages, a simple SELECT statement on the problematic table executes successfully without any issues, leading me to conclude that this might indeed be a bug rather than actual data corruption. I strongly recommend running a full DBCC CheckDB and DBCC CheckTable in addition to the physical only option for thorough verification.

Full Transcript

Howdy folks, Erik Darling here with Erik Darling Data, still, apparently that’s still a thing, trudging along all these many months later. And I’m recording this video basically to document what I think is still a bug in SQL Server up to SQL Server 2017 CU14. No, it’s not the most like, awful incriminating, like, like, like results destroying bug in the world. But it can be pretty confusing when you run into it. Because it happens when you run DBCC CheckDB with physical only on a database that has a columnstore index in it. Now, I’ve got to do some due diligence here. And I have rebuilt the table and I have rebuilt the index. And the version of SQL Server that I am running on is indeed 2014 CU14. This has been patched for a while now. This isn’t a fresh patching at all. This computer has been rebooted about a bajillion times since then. So there’s nothing like just waiting in the rings, like, just like, maybe he didn’t do something right. I’ve done most things right here. Now, when I run a regular CheckDB, when I run a full CheckDB on this database, right, the whole deal, not just physical only, this completes successfully. You can see down here that it ran for about, oh, gosh, there we go. About two minutes and 43 seconds. So that’s, that’s fine. That’s good. And when I run CheckTable on this, on this table, where the, where the problem happens, this also runs fine. So DBCC CheckTable on users underscore CX also completes successfully and doesn’t throw any errors. When I run DBCC CheckDB with physical only, I get a whole lot of red text. Yeah. And you can see that when we zoom in a little bit in here, you can see that all of the errors that it found are only in this one table. All right. And if we scroll through all the errors that it found are only in this one table, right?

All of the error, all the object IDs, that’s all this one seven, whatever, you can read it later, ends in 51. And the full message is looks like this. Offroad data is referenced by blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, blah, not seen in the scan. Now, this matches exactly the text of a known issue that was supposedly fixed in cumulative update 11. We scroll in here, see you 11 2017. and if we read the error message that it references, this looks a whole heck of a lot like the one that we just saw in Management Studio. So this is still happening, and I mean it’s not the worst thing in the world. Again, just if you ever run into this, it’s a thing. Now if I take that number, this object ID right here, 17 yada yada, and I go look at what it is, it is indeed that table, users underscore cx, which is a copy of the users table in the Stack Overflow database that I have put a clustered columnstore index on, and we can see that the only index on this table right now is indeed that clustered columnstore index. I don’t have any other indexes sitting around on here. That is the whole kit and caboodle. Now what’s really funny, and what I think kind of proves that there’s no actual corruption in here, is that when I select star from that table, the query executes successfully. This runs for a while too, returning all those rows. 2.4 million rows, about 33 seconds, and so this runs for that whole time, and the query plan you can see reads, it scans that clustered columnstore index, so it’s not like sneaky reading data from anywhere else.

It’s all coming from right there. So all that to say this, if you are hitting error messages from running dbcc checkdb with physical only, and you’ve got columnstore indexes, don’t worry. I don’t think you actually have corruption. You might. You might. I don’t, but I think that that’s the bug that I’m hitting, and you might too. So to be extra safe, if you do find that physical only throws error messages on your columnstore indexes, run a full checkdb. Don’t just run with physical only.

Run dbcc checktable, run dbcc checkdb, check alloc, whatever file group, whatever else you might do. And maybe you’ll figure out if you do or don’t have corruption in that way. But for now, if you hit it, and you do your due diligence, and you check those other things, and you still find that you don’t have corruption, it’s only with physical only, well, join the club, pal. Yeah, it’s a bug. It’s a bug.

Fix your bug. Fix your bug. Anyway, again, I am Erik Darling with Erik Darling Data. I hope that… Is this how few people are using columnstore? It takes me to run checkdb with physical only to find a bug? That hurts. That hurts my feelings. I’m sure it hurts columnstore’s feelings, too. Anyway, goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 25% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.