Dates Aren’t Strings

Seriously.


Video Summary

In this video, I dive into a common mistake that many developers make when working with dates in SQL Server: treating dates as if they were strings or other data types. I share my experience from years of client work where I’ve seen this issue repeatedly and emphasize the importance of keeping dates as dates to avoid performance pitfalls. To illustrate the point, I demonstrate how converting a date column to a string can severely impact query performance by forcing full table scans instead of allowing efficient index seeks. The video also highlights why using functions like `CONVERT` or casting dates to other types is not only unnecessary but can lead to suboptimal execution plans and wasted resources.

Full Transcript

Oh boy! Erik Darling here with Erik Darling Data. And I realize, well, I realize several things. Sometimes I realize several things a day. It’s a bad habit that I’ve gotten myself into. Too many realizations is overwhelming. But my YouTube channel has been a little bit quiet lately. I have been very busy with client work, which is, you know, the blessing and the curse of consulting. When you’re not that busy, you have lots of time to record videos and blog and work on training and presentations. But you’re not getting paid. When you’re working with clients and getting paid, you don’t have a lot of time to do other things. So, one must balance. And one must balance frugally. Or else one… I don’t know. Actually, I don’t know because I haven’t unbalanced that badly yet. But there are a few videos that I’ve been meaning to record and I’m going to hopefully get through all of them today because I have nothing else to do. Another realization, this is the first time in a year. I’ve been recorded history that I’ve been recorded history that I’ve appeared on camera wearing anything other than a black t-shirt. I am currently wearing an adidas sweatshirt. There you go. It’s about as good as it gets for seeing my downstairs mistakes. And that has absolutely nothing to do with anything. So, that’s fun. Anyway, I wanted to record this video because in all my vast, vast history of client work, I have seen people do this so many damn times that I cannot even begin to count. And I guess if I were to make one point up front in this video, it’s that dates are not strings. Dates are not floats. Dates are not integers. Dates are not… I don’t know. What’s another data type? Let’s just… Binary. Let’s just say dates are dates. Dates is dates. Stop messing with them. Stop casting them to other things.

And converting them to other things. And writing scalar valued functions to change the way they look so you can compare them to other dates and make those other dates look the same via various cast converts and functions. It is a waste of time and resources. So, to prove this to you a little bit and to hopefully show you why you shouldn’t do this because it is a terrible idea. I have created a super helpful index on the votes table which is the largest table in the stack overflow 2013 database. So, the first thing that I want to show you is that when I run this query, these both return the same thing. If I convert…

get date to a string and do some fuzzy date math on it, I will get 2019-12-19 which is coincidentally yesterday. Ha ha ha ha ha ha ha! If I run convert date and then do the same thing, I will get 2019-12-19. The thing is, with SQL Server, when it comes to dates and date times and even date time too, the optimizer has some tricks up its sleeves.

It can do these things called get range through convert and get range through mismatched types which apply specifically to searching for the date portion of date time values. Well, not totally specifically, but that’s mostly where you see it. So, when I run those two selects with the converts in there, I do get the same thing back, but this is a string and this is a date.

You understand the difference? This date… I mean this string represents the date, but it’s not a date to SQL Server anymore. It is a string now. This is a date just without the time portion that get date returns. To prove to you why one is indeed worse than the other, let’s run this first query where I’m going to select account from votes and I’m going to say where we convert the creation date column to a VARCAR10 and then compare that to where we convert this to other stuff, whatever.

It’s all the same. If I run this query and we wait, wait, wait, wait, wait, wait, wait, wait, we get zero rows back. And we look at the execution plan. Well, this took about five seconds to run. We had to scan the entire index. And when we scan that index, we applied this whole convert predicate thing.

We used our helpful nonclustered index. We did, but we had to read every single row in the table. We zoom in a little bit here. The number of rows are read is all just about 53 million. We do a big aggregation. We gather streams and we do another aggregation.

And finally we return zero rows and SQL Server complains to us about there being type conversions and implicit conversions and all sorts of other nonsense. So this is not an ideal situation. If we change this query a little bit to say convert creation date to a date and see where that equals casting get date as a date. And just about the same way, when we run this, it finishes instantly with a result of zero.

We’re able to seek into the index here and we’re able to rule out all those rows immediately. We can immediately figure out when we seek to this and to this. So it’s not a perfect seek. It’s an okay seek. It’s what I think Paul White called the dynamic seek.

But if we look at this, we can see that SQL Server did indeed seek and rule out rows immediately. We see that we didn’t read much of anything from this query. To show you just how bad this query is, let’s equalize things a little bit.

I call parallelism SQL servers great equalizer. This plan runs serially. If you recall running this plan, this plan ran in parallel. This plan was only even only took five seconds because SQL Server chose to engage parallelism and use multiple cores to process it.

Right. So five seconds going parallel. How long will this thing take if we force it to be competitive with our other query? If we say, hey, you know what? Our fast query ran at max.1 and finished instantly.

How long are you going to take if you run at max.1? How long SQL Server? How long? Just how long will this go on? 13 seconds.

You can look at the execution plan and we see nearly the same thing where the scan of this nonclustered index takes all 13 seconds. And we spend that whole 13 seconds applying this ridiculous string predicate converting an otherwise perfectly fine date time column to a string back to another string, comparing it to another string. It is absolutely ridiculous.

And I see this constantly when working with people. So the next time you’re working with dates, remember, they’re dates. They’re not strings, not floats, not ins, not money, not whatever else.

Leave them is to stop messing with them. Stop mangling with them. Let’s cut it out. It’s the year 2019.

I can prove that by by selecting get date. And we have had the date data type in SQL Server databases now for I don’t know how long. So there’s not that there ever was a great need to, but there is even less of a need to convert date times to strings to just get the date portion of them.

So anyway, I’m still to this very day, Erik Darling with Erik Darling data. Thanks for watching and I will see you and hopefully the next video. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.