Live SQL Server Q&A!

ICYMI


Last week’s thrilling, stunning, flawless episode of whatever-you-wanna-call-it.

Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

SQL Server Scalar UDF Inlining And Security Functions

You’ve Got No Security


Despite being a bouncer for many years, I have no interest at all in security.

Users, logins, roles, grant, deny. Not for me. I’ve seen those posters, and they’re terrifying.

Gimme 3000 lines of dynamic SQL any day.

This is a slightly different take on yesterday’s post, covering another common problem I see in queries.

Someone writes a function to figure out if a user is trusted, or has the right permissions, and sticks it in a predicate: it could be a join or a where clause.

High Finance


Stack Overflow isn’t exactly a big four accounting firm, but for some reason big four accounting firms don’t make their databases public under Creative Commons licensing.

So uh. Here we are.

And here’s our query.

DECLARE @UserId INT = 22656, --2788872, 22656
        @SQL NVARCHAR(MAX) = N'';

SET @SQL = @SQL + N'
SELECT    p.Id,
          p.AcceptedAnswerId,
          p.AnswerCount,
          p.CommentCount,
          p.CreationDate,
          p.FavoriteCount,
          p.LastActivityDate,
          p.OwnerUserId,
          p.Score,
          p.ViewCount,
          v.BountyAmount,
          c.Score
FROM      dbo.Posts AS p
LEFT JOIN dbo.Votes AS v
    ON p.Id = v.PostId
    AND dbo.isTrusted(@iUserId) = 1
LEFT JOIN dbo.Comments AS c
    ON p.Id = c.PostId
WHERE     p.PostTypeId = 5;
';

EXEC sys.sp_executesql @SQL,
                       N'@iUserId INT',
                       @iUserId = @UserId;

There’s a function in that join to the Votes table. This is what it looks like.

CREATE OR ALTER FUNCTION dbo.isTrusted ( @UserId INT )
RETURNS BIT
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
    BEGIN
        DECLARE @Bitty BIT;

        SELECT @Bitty = CASE WHEN u.Reputation >= 10000
                             THEN 1
                             ELSE 0
                        END
        FROM   dbo.Users AS u
        WHERE  u.Id = @UserId;

        RETURN @Bitty;

    END;
GO

Bankrupt


There’s not a lot of importance in the indexes, query plans, or reads.

What’s great about this is that you don’t need to do a lot of analysis — we can look purely at runtimes.

It also doesn’t matter if we run the query for a trusted (22656) or untrusted (2788872) user.

In compat level 140, the runtimes look like this:

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 140;

 SQL Server Execution Times:
   CPU time = 7219 ms,  elapsed time = 9925 ms.

 SQL Server Execution Times:
   CPU time = 7234 ms,  elapsed time = 9903 ms.

In compat level 150, the runtimes look like this:

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 150;

 SQL Server Execution Times:
   CPU time = 2734 ms,  elapsed time = 781 ms.

 SQL Server Execution Times:
   CPU time = 188 ms,  elapsed time = 142 ms.

In both runs, the trusted user is first, and the untrusted user is second.

Sure, the trusted user query ran half a second longer, but that’s because it actually had to produce data in the join.

One important thing to note is that the query was able to take advantage of parallelism when it should have (CPU time is higher than elapsed time).

In older versions (or even lower compat levels), scalar-valued functions would inhibit parallelism. Now, when they’re inlined, they don’t.
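If you want to verify that a function qualifies for inlining before relying on it, SQL Server 2019 exposes that in sys.sql_modules. A minimal check against the function from this post:

```sql
-- is_inlineable reports whether the UDF qualifies for scalar UDF inlining;
-- inline_type reports whether inlining is currently enabled for it.
SELECT OBJECT_NAME(m.object_id) AS function_name,
       m.is_inlineable,
       m.inline_type
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.isTrusted');
```

If is_inlineable comes back 0, check the documented restrictions; plenty of constructs (GETDATE in some versions, table variables, etc.) disqualify a function.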

Thanks for reading!


Common Parameter Sniffing In Entity Framework Queries For SQL Server

Still Not A Developer


I’m going to use a funny example to show you something weird that I see often in EF queries.

I’m not going to use EF to do it, because I have no idea how to. Please use your vast imagination.

In this case, I’m going to figure out if a user is trusted, and only if they are will I show them certain information.

Here goes!

Trust Bust


The first part of the query establishes if the user is trusted or not.

I think this is silly because no one should ever trust users.

DECLARE @UserId INT = 22656, --2788872
        @PostId INT = 11227809,
        @IsTrusted BIT = 0,
        @SQL NVARCHAR(MAX) = N'';

SELECT @IsTrusted = CASE WHEN u.Reputation >= 10000
                         THEN 1
                         ELSE 0
                    END
FROM   dbo.Users AS u
WHERE  u.Id = @UserId;

The second part will query and join a few tables, but one of the joins (to the Votes table) will only run if a user is trusted.

SET @SQL = @SQL + N'
SELECT p.Title, p.Score,
       c.Text, c.Score,
       v.*
FROM dbo.Posts AS p
LEFT JOIN dbo.Comments AS c
    ON p.Id = c.PostId
LEFT JOIN dbo.Votes AS v
    ON p.Id = v.PostId
    AND 1 = @iIsTrusted
WHERE p.Id = @iPostId
AND   p.PostTypeId = 1;
';

EXEC sys.sp_executesql @SQL,
                       N'@iIsTrusted BIT, @iPostId INT',
                       @iIsTrusted = @IsTrusted,
                       @iPostId = @PostId;

See where 1 = @iIsTrusted? That determines if the join runs at all.

Needless to say, adding an entire join in to the query might slow things down if we’re not prepared.

First I’m going to run it for user 2788872, who isn’t trusted.

This query finishes rather quickly (2 seconds), and has an interesting operator in it.

[Query plan screenshot: Henanigans, S.]
[Query plan tooltip screenshot: Pump the brakes]

The filter has a startup expression in it, which means it’s sort of a gatekeeper, here. If the parameter is 0, we don’t touch Votes.

If it’s 1… Boy, do we touch Votes. This is another case of where cached plans can lie to us.

Rep Up


If we run this for user 22656 (Jon Skeet) afterwards, we will definitely need to touch the Votes table.

I grabbed the Live Query Plan to show you just how little progress it makes over 5 minutes.

[Live query plan screenshot: Dirge]

The cached plan will look identical. And looking at the plan, it’ll be hard to believe there’s any way it could run >5 minutes.

[Query plan screenshot: CONFESS]

If we clear the cache and run this for 22656 first, the plan runs relatively quickly, and looks a little different.

[Query plan screenshot: Bag of Ice]

Running it for an untrusted user has a similar runtime. It’s not great, but it’s the better of the two.

Fixing It?


It’s difficult to control EF queries with much granularity.

You could branch the application code to run two different queries based on whether a user is trusted.

In a perfect world, you’d never even consider that join at all, and avoid having to worry about it.
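The same branching idea can be sketched on the database side, using the variables from the example above. This is just a sketch: only the trusted branch ever mentions Votes, so each branch gets its own cached plan instead of sharing one.

```sql
-- Build different dynamic SQL depending on the flag, instead of
-- gating the join with 1 = @iIsTrusted inside a single statement.
-- Note: the two branches return different columns, so the caller
-- has to be prepared for that.
IF @IsTrusted = 1
BEGIN
    SET @SQL = N'
SELECT p.Title, p.Score, c.Text, c.Score, v.*
FROM dbo.Posts AS p
LEFT JOIN dbo.Comments AS c ON p.Id = c.PostId
LEFT JOIN dbo.Votes AS v ON p.Id = v.PostId
WHERE p.Id = @iPostId AND p.PostTypeId = 1;';
END
ELSE
BEGIN
    SET @SQL = N'
SELECT p.Title, p.Score, c.Text, c.Score
FROM dbo.Posts AS p
LEFT JOIN dbo.Comments AS c ON p.Id = c.PostId
WHERE p.Id = @iPostId AND p.PostTypeId = 1;';
END;

EXEC sys.sp_executesql @SQL,
                       N'@iPostId INT',
                       @iPostId = @PostId;
```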

On the plus side (at least in this case), the good plan for trusted users runs in the same time as the good plan for untrusted users, even though they’re different.

If you’re feeling extra confident, you can try adding an OPTIMIZE FOR hint to your code, or implementing a plan guide.
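With the dynamic SQL above, an OPTIMIZE FOR hint could be tacked onto the statement like this. It's a sketch, and it's a trade-off: optimizing for the trusted value means every execution compiles a plan that's prepared to touch Votes, which the untrusted runs then skip at the startup filter.

```sql
SET @SQL = @SQL + N'
SELECT p.Title, p.Score,
       c.Text, c.Score,
       v.*
FROM dbo.Posts AS p
LEFT JOIN dbo.Comments AS c
    ON p.Id = c.PostId
LEFT JOIN dbo.Votes AS v
    ON p.Id = v.PostId
    AND 1 = @iIsTrusted
WHERE p.Id = @iPostId
AND   p.PostTypeId = 1
OPTION (OPTIMIZE FOR (@iIsTrusted = 1)); -- always compile the "join runs" plan
';
```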

Thanks for reading!


Just Using sp_executesql Doesn’t Make Dynamic SQL Safe From SQL Injection

Safe Belt


A lot of people I’ve talked to about dynamic SQL have been under the misguided impression that just using sp_executesql will fix safety issues with SQL injection.

In reality, it’s only half the battle. The other half is learning how to act sober.

The gripes I hear about fully fixing dynamic SQL are:

  • The syntax is hard to remember (setting up and calling parameters)
  • It might lead to parameter sniffing issues

I can sympathize with both. Trading one problem for another problem generally isn’t something people get excited about.

Trading all the money in your company bank account to ransom your database probably isn’t something you’d get excited about either.

That’s not a very good lead on your rezoomay.

Holic


Here’s a trivial example:

CREATE TABLE dbo.DropMe(id INT);

DECLARE @DatabaseName sysname = N'';
SET @DatabaseName = N'S%'';DROP TABLE dbo.DropMe;--';

DECLARE @sql NVARCHAR(MAX) = N'
SELECT *
FROM sys.databases AS d
WHERE d.name LIKE ''%' + @DatabaseName + '%'';
';

PRINT @sql;
EXEC sys.sp_executesql @sql;

This not only returns a list of database names that contain S on my instance; the printed SQL statement shows that the injected DROP TABLE statement gets executed, too.

SELECT *
FROM sys.databases AS d
WHERE d.name LIKE '%S%';DROP TABLE dbo.DropMe;--%';

Blue Flowers


The only way to not have that happen is to do this, and this is where people start complaining about remembering syntax:

CREATE TABLE dbo.DropMe(id INT);

DECLARE @DatabaseName sysname = N'';
SET @DatabaseName = N'S%'';DROP TABLE dbo.DropMe;--';

DECLARE @sql NVARCHAR(MAX) = N'
SELECT *
FROM sys.databases AS d
WHERE d.name LIKE N''%'' + @iDatabaseName + N''%'';
';

PRINT @sql;
EXEC sys.sp_executesql @sql,
                       N'@iDatabaseName sysname',
                       @iDatabaseName = @DatabaseName;

What prints out is this:

SELECT *
FROM sys.databases AS d
WHERE d.name LIKE N'%' + @iDatabaseName + N'%';

There’s also no search result returned, because the parameter is treated purely as data, and no database name contains 'S%'';DROP TABLE dbo.DropMe;--'.

But I get why people think this is annoying, because it is quirky at first.

  • If the string that holds your statement or your parameter definitions isn’t NVARCHAR, or isn’t prefixed with N, you’ll get an error.
  • If you put your dynamic SQL variables on the wrong side of the equal sign, you’ll get an error.
  • And yes, if you’ve got skewed data, you’ll be more open to parameter sniffing.

The syntax stuff just takes a little getting used to, and performance stuff is often easier to fix than lost, stolen, or vandalized data.
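For the syntax, a minimal template helps. The three pieces are always the same: the statement, the parameter definition string, and the parameter values (the table name here is just an arbitrary example):

```sql
-- Minimal sp_executesql pattern: statement, parameter definitions, values.
-- Both the statement and the definition string must be NVARCHAR (N prefix).
DECLARE @stmt NVARCHAR(MAX) = N'
SELECT COUNT(*) AS tables_found
FROM sys.tables AS t
WHERE t.name = @iTableName;';

EXEC sys.sp_executesql @stmt,
                       N'@iTableName sysname',
                       @iTableName = N'DropMe';
```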

Even if you’re real comfy with your backups, you’re still at risk of someone stealing confidential data.

Data Is A Liability


It’s really important that you review the personal data you collect to make sure it’s totally necessary.

It’s also really important for you to regularly archive data that you don’t actively need in your database.

For everything else, taking precautions like fixing unsafe dynamic SQL is just part of mitigating your data liabilities.

Thanks for reading!


SQL Server Stuff I Still Haven’t Learned

The More Things Change


There’s often a lot of shouting about this brand new thing that you simply must learn, lest ye perish in the unholy flames of obsoletion.

Having worked with SQL Server for a while now, I’ve heard it about a great many things.

I’m going to be honest: Those great many things have never called to me.

There’s no need to list them all out. It’d take me longer than I’d like, anyway.

But I’ll tell you something funny: I’ve never opened SSIS, SSAS, or SSRS.

Don’t even know how to. I’m quite happy other people have found their passion with them.

It’s just not me.

Catching Flies


The very specific thing that calls to me is performance tuning, by way of understanding the query optimizer.

That’s quite enough to keep me busy. There’s a ton to learn, and the deeper you get, the more you find.

I’m not going to tell you to learn it, or that if you don’t learn it you’ll be playing bucket drums in a train station.

Either it calls to you, or it doesn’t.

If a new thing arrives that changes that space, you can be damn sure I’ll be all over it.

Even if it’s a dud, I have to know why it’s a dud.

I’m looking right at you, Hekaton.

Threat Of A Good Time


I know, I know. Performance tuning will someday be a thing of the past.

The database will be quantumly self-tuning, self-healing, and whatever other things a database does by itself alone in the dark.

(I’ll give you a hint: it’s not index maintenance.)

I’m comfortable with that eventuality, even if I don’t think the people making those claims are totally in touch with reality about the timeline.

It will likely depend on the lengths to which software will be allowed to make fundamental changes to a database.

I’d be happy if the optimizer would explore UNION/UNION ALL optimizations for OR predicates more often, but hey.

We Out Here


In about five years of consulting — I don’t have an exact count — I’ve looked at probably a thousand servers.

I’m totally willing to concede that I didn’t talk to the right people to make a big judgement call here, but I don’t see a lot of people out there using the things that I’ve been told I must learn.

Granted, I see a lot of people with performance problems, because that’s what I’ve chosen to specialize in.

I might not be seeing people with the problems that other things solve: I fully acknowledge my own myopia here.

With the exception of Availability Groups (I maybe see them in the 5% range), all those Next Big Things™ don’t seem to pop up at all.

They haven’t changed what I do, or more importantly what I love to do one iota.

Thanks for reading!


The Waiting Life Of A SQL Server Query

Smoking Ain’t Allowed In School


While working on my new training, part of what I want to show people is how to match what happens in a query plan to the type of wait stats they’ll see on their servers.

Take this query plan for example:

[Query plan screenshot: Nose Boop™]

There are THINGS GOING ON HERE! Scans, parallelism, sorting, spilling.

What do the waits look like for this query?

[Wait stats screenshot]

Spy vs Spy


It’s fairly easy to focus in on what a session is doing using sys.dm_exec_session_wait_stats, though there are some caveats about when the numbers reset.
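A sketch of that kind of check; the session id here is hypothetical, so substitute the spid of the tab running your query:

```sql
-- Waits accumulated by one session since it connected
-- (or since the session's stats were reset).
SELECT ws.wait_type,
       ws.waiting_tasks_count,
       ws.wait_time_ms,
       ws.signal_wait_time_ms
FROM sys.dm_exec_session_wait_stats AS ws
WHERE ws.session_id = 53 -- hypothetical session id
ORDER BY ws.wait_time_ms DESC;
```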

On my laptop, I can just open a new SSMS tab without worrying about connection pooling or any weird app stuff.

When I do, these are the waits I see while the query runs.

Every single one of those waits is accounted for by something that happened in the query plan (or its attributes).

Breakdown


Our pages weren’t in memory, so we read them from disk (PAGEIOLATCH_SH), we had to allocate buffer pool memory (MEMORY_ALLOCATION_EXT), we used parallelism (both CX waits), we needed memory to sort data (RESERVED_MEMORY_ALLOCATION_EXT), we didn’t get enough so we spilled (IO_COMPLETION), and we returned data to SSMS (ASYNC_NETWORK_IO).

LATCH_EX can be a lot of things. In a parallel plan, it can be associated with exchange buffers and the parallel page supplier.

The goal here is twofold: to be able to look at a server’s wait stats and form a mental image of the types of queries you’re going to see, and to be able to tie query plan operators and attributes back to wait stats.

Thanks for reading!

SQL Server Joins With OR Clauses = ?

A Long Time Ago


I had to write some hand-off training about query tuning when I was starting a new job.

As part of the training, I had to explain why writing “complicated logic” could lead to poor plan choices.

So I did what anyone would do: I found a picture of a pirate, named him Captain Or, and told the story of how he got Oared to death for giving confusing ORders.

This is something that I unfortunately still see people doing quite a bit, and then throwing their hands up as queries run forever.

I’m going to show you a simple example of when this can go wrong, and also beg and plead for the optimizer team to do something about it.

Big Bully


“Write the query in the simplest way possible”, they said.

So we did, and we got this.

SELECT u.Id, MAX(p.Score)
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
    OR u.Id = p.LastEditorUserId
WHERE  p.PostTypeId IN (1, 2)
GROUP BY u.Id;

Note the OR in the join condition — we can match on either of those columns.

Here’s the index we created to make this SUPERFAST.

CREATE NONCLUSTERED INDEX 36chambers
    ON dbo.Posts ( OwnerUserId, LastEditorUserId, PostTypeId )
    INCLUDE ( Score );

If we’re good DBAs, still doing as we’re told, we’ll read the query plan from right to left.

The first section we’re greeted with is this:

[Query plan screenshot: No Cardio]

We spend a full minute organizing and ordering data. If you want to poke around, the plan XML is here.

The columns in the Compute Scalars are OwnerUserId and LastEditorUserId.

Next in the plan is this fresh hell:

[Query plan screenshot: Squozed]

24 seconds seeking into the Users table and joining that to the results of the Constant Scans, etc.

What’s a little confusing here is that the scan on the Posts table occurs on the outer side of Nested Loops.

It’s also responsible for feeding rows through the Constant Scans. That’s their data source.

Overall, this query takes 1 minute and 36 seconds to run.

My gripe with it is that it’s possible to rewrite this query in an obvious way to fix the problem.

Magick


Using a second join to Posts clears things up quite a bit.

SELECT u.Id, MAX(p.Score)
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
    AND p.PostTypeId IN (1, 2)
JOIN   dbo.Posts AS p2
    ON u.Id = p2.LastEditorUserId
    AND p2.PostTypeId IN (1, 2)
GROUP BY u.Id;

I know, it probably sounds counterintuitive to touch a table twice.

Someone will scream that we’re doing more reads.

Someone else will faint at all the extra code we wrote.

But when we run this query, it finishes in 10 seconds.

[Query plan screenshot: That’s better!]

This plan does something a bit different. It joins the nonclustered index we have on Posts to itself.

[Query plan screenshot: Seminar]

The optimizer has a rule that makes this possible, called Index Intersection.

Extra Magick


A more accurate description of what I’d want the optimizer to consider here would be the plan we get when we rewrite the query like this.

SELECT u.Id, MAX(p.Score)
FROM   dbo.Users AS u
CROSS APPLY
    (
        SELECT p.Score
        FROM   dbo.Posts AS p
        WHERE u.Id = p.OwnerUserId
        AND p.PostTypeId IN (1, 2)
        
        UNION ALL 
        
        SELECT p2.Score
        FROM  dbo.Posts AS p2
        WHERE u.Id = p2.LastEditorUserId
        AND p2.PostTypeId IN (1, 2)  
    ) AS p
GROUP BY u.Id;

[Query plan screenshot: Mush.]

This query runs a bit faster than the second one (around 7 seconds), and the plan is a little different.

Rather than a Hash Join between the index on the Posts table, we have a Concatenation operator.

The rest of the plan looks like this:

[Query plan screenshot: Yabadabadoo!]

The optimizer has a rule that can produce this plan, too, called Index Union.

Problemagick


The thing is, these rules seem to be favored more with WHERE clauses than with JOINs.

CREATE INDEX ix_fangoria
    ON dbo.Posts(ClosedDate);

SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.ClosedDate IS NULL
OR p.ClosedDate >= '20170101'
AND 1 = (SELECT 1);

[Query plan screenshot: Index Union]

CREATE INDEX ix_somethingsomething 
    ON dbo.Posts(PostTypeId);

CREATE INDEX ix_wangchung 
    ON dbo.Posts(AcceptedAnswerId);

SELECT COUNT_BIG(*) AS records
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
AND p.AcceptedAnswerId = 0
AND 1 = (SELECT 1);

[Query plan screenshot: Index Intersection]

Knackered


It is possible to get these kinds of plans with joins, but not without join hints and a couple of indexes.

CREATE INDEX aussie 
    ON dbo.Posts (OwnerUserId, PostTypeId, Score);

CREATE INDEX kiwi 
    ON dbo.Posts (LastEditorUserId, PostTypeId, Score);

SELECT u.Id, MAX(p.Score)
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    WITH (FORCESEEK)
    ON u.Id = p.OwnerUserId
    OR u.Id = p.LastEditorUserId
WHERE  p.PostTypeId IN (1, 2)
GROUP BY u.Id;

There’s more background from, of course, Paul White, here and here.

Even with Paul White ~*~Magick~*~, the hinted query runs for ~16 seconds.

If you remember, the Index Intersection plan ran for around 10 seconds, and the Index Union plan ran for around 7 seconds.

[Query plan screenshot: 2slow]

This plan uses Index Union:

[Query plan screenshot: Spillzo]

Thanks for reading!


SQL Server T-SQL Protip: watch those TOPs without ORDER BY

In the documentation for TOP, the following is listed as a best practice:

In a SELECT statement, always use an ORDER BY clause with the TOP clause. This is the only way to predictably indicate which rows are affected by TOP.
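A minimal illustration of the point first. Without an ORDER BY, SQL Server is free to return any five rows it likes:

```sql
-- Which 5 rows you get here is not guaranteed; it can change
-- between executions, plans, and versions.
SELECT TOP (5) d.name
FROM sys.databases AS d;

-- With an ORDER BY, the result is predictable.
SELECT TOP (5) d.name
FROM sys.databases AS d
ORDER BY d.name;
```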

Let’s work through a real world example.

The good


One of the great things about the “Top Resource Consuming Queries” query store SSMS report is that it is always able to render the query plan, even for very complex queries. I’m not aware of a pure T-SQL solution that can avoid requiring the end user to save xml to files in all cases. The report nearly always takes a long time to run, so it’s easy to capture the T-SQL that powers the grid details version:

DECLARE @results_row_count INT = 100,
@interval_start_time DATETIMEOFFSET = '2019-05-24 15:30:00 +00:00',
@interval_end_time DATETIMEOFFSET = '2019-05-24 18:00:00 +00:00';
 
SELECT TOP (@results_row_count)
    p.query_id query_id,
    q.object_id object_id,
    ISNULL(OBJECT_NAME(q.object_id),'') object_name,
    qt.query_sql_text query_sql_text,
    ROUND(CONVERT(float, SUM(rs.avg_duration*rs.count_executions))*0.001,2) total_duration,
    SUM(rs.count_executions) count_executions,
    COUNT(distinct p.plan_id) num_plans
FROM sys.query_store_runtime_stats rs
    JOIN sys.query_store_plan p ON p.plan_id = rs.plan_id
    JOIN sys.query_store_query q ON q.query_id = p.query_id
    JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
WHERE NOT (rs.first_execution_time > @interval_end_time OR rs.last_execution_time < @interval_start_time)
GROUP BY p.query_id, qt.query_sql_text, q.object_id
HAVING COUNT(distinct p.plan_id) >= 1
ORDER BY total_duration DESC;

Note the presence of the ORDER BY. I get exactly the results that I was expecting:

The bad


If I ask for extra details (who doesn’t want more details?), a significantly more complex query is generated:

-- grid format query with additional details
DECLARE @results_row_count INT = 100,
@interval_start_time DATETIMEOFFSET = '2019-05-24 15:30:00 +00:00',
@interval_end_time DATETIMEOFFSET = '2019-05-24 18:00:00 +00:00';
 
With wait_stats AS
(
SELECT
    ws.plan_id plan_id,
    ws.execution_type,
    ROUND(CONVERT(float, SUM(ws.total_query_wait_time_ms)/SUM(ws.total_query_wait_time_ms/ws.avg_query_wait_time_ms))*1,2) avg_query_wait_time,
    ROUND(CONVERT(float, SQRT( SUM(ws.stdev_query_wait_time_ms*ws.stdev_query_wait_time_ms*(ws.total_query_wait_time_ms/ws.avg_query_wait_time_ms))/SUM(ws.total_query_wait_time_ms/ws.avg_query_wait_time_ms)))*1,2) stdev_query_wait_time,
    CAST(ROUND(SUM(ws.total_query_wait_time_ms/ws.avg_query_wait_time_ms),0) AS BIGINT) count_executions,
    MAX(itvl.end_time) last_execution_time,
    MIN(itvl.start_time) first_execution_time
FROM sys.query_store_wait_stats ws
    JOIN sys.query_store_runtime_stats_interval itvl ON itvl.runtime_stats_interval_id = ws.runtime_stats_interval_id
WHERE NOT (itvl.start_time > @interval_end_time OR itvl.end_time < @interval_start_time)
GROUP BY ws.plan_id, ws.runtime_stats_interval_id, ws.execution_type ),
top_wait_stats AS
(
SELECT TOP (@results_row_count)
    p.query_id query_id,
    q.object_id object_id,
    ISNULL(OBJECT_NAME(q.object_id),'') object_name,
    qt.query_sql_text query_sql_text,
    ROUND(CONVERT(float, SUM(ws.avg_query_wait_time*ws.count_executions))*1,2) total_query_wait_time,
    SUM(ws.count_executions) count_executions,
    COUNT(distinct p.plan_id) num_plans
FROM wait_stats ws
    JOIN sys.query_store_plan p ON p.plan_id = ws.plan_id
    JOIN sys.query_store_query q ON q.query_id = p.query_id
    JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
WHERE NOT (ws.first_execution_time > @interval_end_time OR ws.last_execution_time < @interval_start_time)
GROUP BY p.query_id, qt.query_sql_text, q.object_id
),
top_other_stats AS
(
SELECT TOP (@results_row_count)
    p.query_id query_id,
    q.object_id object_id,
    ISNULL(OBJECT_NAME(q.object_id),'') object_name,
    qt.query_sql_text query_sql_text,
    ROUND(CONVERT(float, SUM(rs.avg_duration*rs.count_executions))*0.001,2) total_duration,
    ROUND(CONVERT(float, SUM(rs.avg_cpu_time*rs.count_executions))*0.001,2) total_cpu_time,
    ROUND(CONVERT(float, SUM(rs.avg_logical_io_reads*rs.count_executions))*8,2) total_logical_io_reads,
    ROUND(CONVERT(float, SUM(rs.avg_logical_io_writes*rs.count_executions))*8,2) total_logical_io_writes,
    ROUND(CONVERT(float, SUM(rs.avg_physical_io_reads*rs.count_executions))*8,2) total_physical_io_reads,
    ROUND(CONVERT(float, SUM(rs.avg_clr_time*rs.count_executions))*0.001,2) total_clr_time,
    ROUND(CONVERT(float, SUM(rs.avg_dop*rs.count_executions))*1,0) total_dop,
    ROUND(CONVERT(float, SUM(rs.avg_query_max_used_memory*rs.count_executions))*8,2) total_query_max_used_memory,
    ROUND(CONVERT(float, SUM(rs.avg_rowcount*rs.count_executions))*1,0) total_rowcount,
    ROUND(CONVERT(float, SUM(rs.avg_log_bytes_used*rs.count_executions))*0.0009765625,2) total_log_bytes_used,
    ROUND(CONVERT(float, SUM(rs.avg_tempdb_space_used*rs.count_executions))*8,2) total_tempdb_space_used,
    SUM(rs.count_executions) count_executions,
    COUNT(distinct p.plan_id) num_plans
FROM sys.query_store_runtime_stats rs
    JOIN sys.query_store_plan p ON p.plan_id = rs.plan_id
    JOIN sys.query_store_query q ON q.query_id = p.query_id
    JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
WHERE NOT (rs.first_execution_time > @interval_end_time OR rs.last_execution_time < @interval_start_time)
GROUP BY p.query_id, qt.query_sql_text, q.object_id
)
SELECT TOP (@results_row_count)
    A.query_id query_id,
    A.object_id object_id,
    A.object_name object_name,
    A.query_sql_text query_sql_text,
    A.total_duration total_duration,
    A.total_cpu_time total_cpu_time,
    A.total_logical_io_reads total_logical_io_reads,
    A.total_logical_io_writes total_logical_io_writes,
    A.total_physical_io_reads total_physical_io_reads,
    A.total_clr_time total_clr_time,
    A.total_dop total_dop,
    A.total_query_max_used_memory total_query_max_used_memory,
    A.total_rowcount total_rowcount,
    A.total_log_bytes_used total_log_bytes_used,
    A.total_tempdb_space_used total_tempdb_space_used,
    ISNULL(B.total_query_wait_time,0) total_query_wait_time,
    A.count_executions count_executions,
    A.num_plans num_plans
FROM top_other_stats A LEFT JOIN top_wait_stats B on A.query_id = B.query_id and A.query_sql_text = B.query_sql_text and A.object_id = B.object_id
WHERE A.num_plans >= 1
ORDER BY total_duration DESC;

Now we have not 1, not 2, but THREE TOP operators! But only one of them has an ORDER BY. The results are completely different, and are pretty much useless:

The ugly


This has nothing to do with TOP as far as I know, but I included it just for fun:

Final thoughts


All of you developers out there should watch your TOPs and make sure you’re using ORDER BY as needed. Otherwise, you might end up with annoyed end users writing blog posts about your code.

Thanks for reading!


Independent Consulting: Six Months And Running

Knock Knock


There’s nothing technical in this post.

It’s just sort of a general accounting of what six months of independent consulting has looked like for me.

Some things I’ve learned, things I’m happy with, things I’m unhappy with. I dunno.

We’ll see where this goes as I’m writing it. Don’t count on it being terribly insightful.

There are countless people who are owed great amounts of gratitude for the advice, encouragement, referrals, and all the other Spackle needed to get things off the ground.

Words in a blog post are insufficient. I’m plotting my thank yous carefully.

Caliente


I’ve had 23 clients in 6 months, not counting work that Mike Walsh has outsourced to me (insert largest possible thank you here).

The projects have varied quite a bit in length and price, so don’t go trying to reverse engineer my income.

I’m not as comfortable as some other people in talking about that, though I’m entirely willing to tell you I made $300 in January.

The only thing I’ll say is that I made just about my base salary last year at the beginning of May.

I’m very happy with that, but the thing I want to share is this: lump sums are really misleading.

I learned this from Brent.

An equally large thank you should be here, too, for probably the best education one could ask for in how to run a consulting business.

Bill Factor 100


Whenever money comes in, I pull 30% out to pay taxes with.

I stick it in a savings account, so maybe it’ll earn whatever pittance interest rates are these days.

Just to use a round number: If I charge someone $1000, $300 leaves immediately.

When you’re setting prices on your time, factoring in that 30% is really important.

The other things you need to consider are, obviously, your expenses.

I have rent, I have to fully pay for my own insurance now, and I have the ever-present work-harder motivation that is a wife and two kids.

Plus the usual phone, internet, utilities, and other recurring expenses.

To put $1000 in perspective: if I charge someone $1000, I still owe my insurance company $800.

To put it further in perspective, I’d still owe my landlord $2000.

And I owe Uncle Sam™ that $300. Whew.

If you ever wonder why consultants seem expensive, it’s because we have to factor a lot of stuff in.

Every price tag has that stuff (plus the constant fear of a slow month) built in to it.

Happy Happy


So what am I happy with? I have a thing that’s mine. I can do whatever I want with it.

I’ve found it fun building my own “brand”, even if I’m not terribly good at building a brand.

I’m happy with the work I’ve been doing, too. With Brent, it was very much hands off analysis.

I’ve been getting back into actually tuning stuff.

And you know what? It’s rewarding to know that the advice you give people actually works.

It’s not all fun and games, but what is?

At least I choose what I say yes or no to, so I only have myself to blame.

I’ve gotten to travel a bit more, though the locales haven’t exactly been exotic.

Downers


Stuff I’m unhappy with is largely cosmetic.

I hate the way my website looks. It’s dark, and it doesn’t sell anything very well.

All of the choices I made at the very beginning were very much hair-on-fire, oh-crap-I-gotta-make-this-work-quick.

Believe it or not, layoffs don’t come with a lot of advance notice.

Anyway, I feel like it shows when I look at my website. And I just don’t have time to do anything about it right now.

I’m also not great at marketing.

I mean, I’m pretty good at saying funny things about what I’m doing, but I have no idea how to get it in front of more faces, or get more people to say “yes”.

That psychology escapes me.

I’ve read books. I’ve seen doctors. I’ve talked to people. I just don’t think I have that gene.

Time Management


The one thing I’m really beating myself up about is training.

Writing it from scratch is hard.

And long.

Long and hard.

Like a morning without drinking.

It’s coming along, but all the “keep the lights on” work seems to eat away at the time I want to spend building it.

I’M A BIT OF A PERFECTIONIST (ha ha ha)

No but really. I have a lot to write down.

I’m plotting stuff that can take you from beginner to wherever I’m at currently. Hopefully that line keeps moving.

Futuristic


Alright, yeah. This thing.

My hope is that I can focus on getting training written and produced, even if it’s slower than I’d like.

I realized it would be daft to work on a set of my own data analysis scripts, so I’ll be working on the First Responder Kit as time allows.

When I’m comfortable with the amount of money in the bank, I’ll probably get the website a facelift.

Maybe I’ll magically get good at marketing.

Thanks for reading!
