Tell It To The Judge
A common dilemma is when you have two date columns, and you need to judge the gap between them for something.
For instance, say you have a table of orders and you want to figure out how long on average it takes an ordered item to ship, or a shipped item to be delivered.
You’re not left with many good ways to write the query to take advantage of indexes.
Let’s have a look-see.
Iron Mask
We’re gonna skip the “aw shucks, this stinks without an index” part.
CREATE INDEX gloom ON dbo.Posts(CreationDate, LastActivityDate); SELECT COUNT(*) AS records FROM dbo.Posts AS p WHERE DATEDIFF(YEAR, p.CreationDate, p.LastActivityDate) > 9;
Because it still stinks with an index. Check it out.
Though we have a predicate, and an index on both columns, we don’t have anything to seek to.
Why? Because our predicate isn’t on anything that the index is keeping track of.
Indexes don’t care how many years, months, days, hours, minutes, seconds, milliseconds, or microseconds difference there are between these two columns.
That’d be a really cool kind of index to have for sure, but insert a shrug that fills your screen the way dark matter fills the universe here.
All we got is workarounds.
Another For Instance
We can use a computed column:
ALTER TABLE dbo.Posts ADD despair AS DATEDIFF(YEAR, CreationDate, LastActivityDate); CREATE INDEX sadness ON dbo.Posts(despair);
The result is something we can seek to.
Which is probably the type of plan that you’d prefer.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
For the sake of completeness: you could add a “WHERE sadness > 9” to your CREATE INDEX statement, if you only query for those rows (and nearly never for <= 9). I guess, that only maybe 5 % of the orders will take more than 9 hours to ship, so this would reduce the size of the index by 95 %.
Important: if you filter for a @variable in your query, a filtered index can only be used, if you specify an OPTION(RECOMPILE) too (which could have some drawbacks, if you run this VERY often (every few seconds) and not only a few times per day).
For the sake of completeness, you should test these things out ?
Sorry, I forgot this, but you could filter for WHERE DATEDIFF(YEAR, CreationDate, LastActivityDate) > 9.
With a little bit luck (would need to test it but have no fitting PC at the moment) it would use the index, even if we filter for despair > 9, otherwise we had to modify our query.
Really? When will this product finally get out of beta phase? Is it too hard do add a row to a b-tree when the value satisfies the index condition, or remove it accordingly?
I’m sure the problem isn’t with adding or removing “a row” — but hey, lemme know when your database is ready. It sounds like you’ve got it all figured out.
Hi Erik,
always enjoy reading your posts! And thanks a lot for the all the time and effort you take to enlighten your colleagues around the world!
Just a quick Question: since when is it no longer required to specify a computed column as ‘persisted’ for allowing it to be indexed?
Hi! Thanks — happy to be reaching people! Funny question: it’s never really been necessary.
Computed columns don’t have to be persisted to index them, or have statistics on them.