Update: Azure SQL DB And SQL Server 2022 Will Have These
Spinning Out Of Control
It’s sorta kinda pretty crazy when every major database platform has something implemented, and SQL Server doesn’t.
Geez, even MySQL.
But a fairly common need in databases is to find the max value from two columns.
Maybe even across two tables.
Track V
For one table, it’s fairly straight forward.
SELECT MAX(x.CombinedDate) AS greatest FROM dbo.Users AS u CROSS APPLY( VALUES( u.CreationDate ), ( u.LastAccessDate )) AS x( CombinedDate );
We’re using our old friend cross apply with a values clause to create on “virtual” column from two date columns.
As far as indexing goes, I couldn’t find any performance difference between these two. They both take about 1 second.
CREATE INDEX smoochies ON dbo.Users(CreationDate, LastAccessDate); CREATE INDEX woochies ON dbo.Users(LastAccessDate, CreationDate);
Indexing strategy will likely rely on other local factors, like any where clause filtering.
Monolith
A similar pattern will work across two tables:
SELECT MAX(x.Score) FROM dbo.Posts AS p JOIN dbo.Comments AS c ON p.Id = c.PostId CROSS APPLY( VALUES( p.Score ), ( c.Score )) AS x ( Score );
Though this is the worst possible way to write the query. It runs for around 10 seconds.
The indexes I have for this query look like so:
CREATE INDEX thicc ON dbo.Posts(Id, Score); CREATE INDEX milky ON dbo.Comments(PostId, Score);
Reversing the key column order helps — the query runs in about 3 seconds, but I need to force index usage.
Of course, this is still the second worst way to write this query.
The best way I’ve found to express this query looks like so:
SELECT MAX(x.Score) FROM ( SELECT MAX(p.Score) AS Score FROM dbo.Posts AS p ) AS p CROSS JOIN ( SELECT MAX(c.Score) AS Score FROM dbo.Comments AS c ) AS c CROSS APPLY( VALUES( p.Score ), ( c.Score )) AS x( Score );
The cross join here isn’t harmful because we only produce two rows.
And it finishes before we have time to move the mouse.
Likewise, the faster pattern for a single table looks like this:
SELECT MAX(x.Dates) FROM ( SELECT MAX(u.CreationDate) CreationDate FROM dbo.Users AS u ) AS uc CROSS JOIN ( SELECT MAX(u.LastAccessDate) LastAccessDate FROM dbo.Users AS u ) AS ul CROSS APPLY (VALUES (uc.CreationDate), (ul.LastAccessDate)) AS x (Dates);
Because we’re able to index for each MAX
CREATE INDEX smoochies ON dbo.Users(CreationDate); CREATE INDEX woochies ON dbo.Users(LastAccessDate);
Of course, not every query can be written like this, or indexed for perfectly, but it’s gruel for thought if you need specific queries like this to be as fast as possible.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
Why don’t you use just UNION ?
Because it’s not always productive to add another aggregation into the mix.
Thanks!
i mean this one
SELECT MAX(x.Score)
FROM
(
SELECT MAX(p.Score) AS Score
FROM dbo.Posts AS p
) AS p
CROSS JOIN
(
SELECT MAX(c.Score) AS Score
FROM dbo.Comments AS c
) AS c
CROSS APPLY( VALUES( p.Score ), ( c.Score )) AS x( Score );
select max(score) from (
SELECT Score FROM dbo.Posts union all
SELECT Score FROM dbo.Comments ) as x
UNION would be a bad choice, since it does an implicit DISTINCT (-> slower). You could use UNION ALL (which you should always prefer, except you really have dublettes and need to remove them).
Furthermore UNION (even with ALL) is usually slower than a FROM (VALUES (x), (y), (z)), particularly, when you have more than three fields (VALUE() works with up to 1000, imaging the crazy execution plan with 1000 UNIONs)
a little bit other, but would work too:
select (SELECT MIN(tmp.val) FROM (VALUES (t.val1), (t.val2), (t.val3)) AS tmp(val)) as MinVal,
t.*
from dbo.tbl as t
Particularly, when you select many columns this may become faster, since it has not to douple or x-fold all the rows with all columns by the cross apply.
Interesting, I’ll give that a shot!
Please note that we implemented GREATEST/LEAST in Azure SQL DB and future versions of SQL Server:
https://docs.microsoft.com/en-us/sql/t-sql/functions/logical-functions-greatest-transact-sql?view=sql-server-ver15
Yeah! I saw the announcement, I just haven’t had a chance to update the post yet.
How are you finding the native solutions perform?
Thanks!