A SQL Server Parameterization Puzzle With TOP: Part 1

The Gift That Keeps On Giving


I love when a demo written for one purpose turns into an even better demo for another purpose.

While working with a client recently, they ran into a performance issue when trying to promote plan reuse by parameterizing the user-input number for TOP.

In part 1, I’m going to show you what happened and why, and in part 2 I’ll discuss some workarounds.

Regresso Chicken Face Soup


When executed with a literal value in the top, this query runs for around 10 seconds.

I’m not saying that’s great, but it’s a good enough starting place.

SELECT TOP (38)
        u.DisplayName,
        b.Name
FROM dbo.Users u
CROSS APPLY 
(
    SELECT TOP (1) 
            b.Name
    FROM dbo.Badges AS b
    WHERE b.UserId = u.Id
    ORDER BY b.Date DESC
) AS b
WHERE u.Reputation >= 10000
ORDER BY u.Reputation DESC;
SQL Server Query Plan
glamping

If we take that same query, put it in a procedure, and run it with an identical value in the TOP, things will turn out not-so-well.

CREATE OR ALTER PROCEDURE dbo.SniffedTop (@Top INT)
AS
BEGIN
SET NOCOUNT, XACT_ABORT ON;

    SELECT TOP (@Top)
            u.DisplayName,
            b.Name
    FROM dbo.Users u
    CROSS APPLY 
    (
        SELECT TOP (1) 
                b.Name
        FROM dbo.Badges AS b
        WHERE b.UserId = u.Id
        ORDER BY b.Date DESC
    ) AS b
    WHERE u.Reputation >= 10000
    ORDER BY u.Reputation DESC;

END
GO 

EXEC dbo.SniffedTop @Top = 38;

The query runs for a significantly longer amount of time.

SQL Server Query Plan
half-day

What Happened?


Unofficially, when TOP uses a constant and the constant is “small” (under 101), the exchange packets are allowed to send earlier than normal, as long as the exchange is below the TOP operator. They’re allowed to send as soon as they have a row, rather than waiting for them to fill up completely.

This can only happen with constants (or…!), and the behavior is true going back to 2005. It may change in the future, so if you’re reading this at some far off date, please don’t be too harsh on me in the comments.

When you parameterize TOP, it’s considered unsafe to send the exchange buffers early. After all, you could stick anything in there, up through the BIGINT max. In cases where you’ve got a BIG TOP, sending, say, 9,223,372,036,854,775,807 rows one at a time would be significantly ickier than sending over a smaller number of full exchange buffers.

If you’re surprised to hear that parallel exchange buffers can send at different times, you’re not alone. I was also surprised.

SQL Server: Full Of Surprises. Horrible surprises.

In the second query, where exchange buffers are sent when full, we spend a lot longer waiting for them to fill up. This isn’t exposed anywhere in the plan, and you’d need either a debugger or this blog post to figure it out.

Yep.

Yep. Yep. Yep. Yep. Yep. Yep.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

A T-SQL Query To Get The Text Between Two Delimiters In SQL Server

I swear It’s Not String Splitting


I have to do this a fair amount, and I always go look at this little cheat sheet that I have.

Then it occurred to me that it might be worth sharing the details here, in case anyone else runs into the same need.

The way I learned to do it is with SUBSTRING and CHARINDEX, which is a pretty common method.

Why CHARINDEX? Because it accepts an optional 3rd parameter that PATINDEX doesn’t, where you can give it a starting position to search. That comes in really handy! Let’s look at how.

The first thing we need for our test case is the starting point, which I’ve arbitrarily chosen as a colon in error messages.

SELECT
    m.*,
    parsed_string = 
        SUBSTRING
        (
            m.text, /*First argument*/ 
            CHARINDEX(':', m.text), /*Second argument*/ 
            LEN(m.text)/*Third argument*/
        )
FROM sys.messages AS m
WHERE m.language_id = 1033
AND   m.text LIKE N'%:%:%';

But the results aren’t exactly what we want! We can still see all the colons.

colonics

What we really need is to cut out the semi-colon, which means we need to move one character over.

Keep in mind that this will change based on the length of the string you’re searching for. If you were looking for ‘Erik’, you’d need to move over four positions.

Sometimes it’s easier to remember that by calling LEN on the search string.

SELECT 
    m.*,
    SUBSTRING
    (
        m.text,
        CHARINDEX(':', m.text) + LEN(':'),
        LEN(m.text)
    ) AS parsed_string
FROM sys.messages AS m
WHERE m.language_id = 1033
AND   m.text LIKE N'%:%:%';

That’ll get us to where we need to be for the first position! Now we need to get the text up to the second colon, which is where things get a little more complicated.

Right now, we’re just getting everything through the end of the error message, using LEN as the 3rd argument to SUBTSTRING.

To start with, let’s establish how we can use the third argument in CHARINDEX.

SELECT 
    m.*,
    SUBSTRING
    (
        m.text, 
        CHARINDEX
        (
            ':', m.text
        ) + LEN(':'),
        CHARINDEX
        (
            ':', 
            m.text, /*!*/
            CHARINDEX
            (
                ':', 
                m.text
            ) + LEN(':')/*!*/
        )
    ) AS parsed_string
FROM sys.messages AS m
WHERE m.language_id = 1033
AND   m.text LIKE N'%:%:%';

The start of the third argument is going to look nearly identical to the first one, except we’re going to start our search in the string after the first colon.

The code in between the exclamation points is the same as the second argument to SUBSTRING.

That’s because it gives us the correct starting position to start looking for the second colon from.

But, you know, this still doesn’t get us what we need, exactly. We need to chop some characters off.

How many?

1, 2, skip a few

I’ll save you some delicate CPU brain cycles: we need to subtract the length of the search string, and then subtract the number of positions in that the first search hit was.

SELECT 
    m.*,
    SUBSTRING
    (
        m.text, /*First argument*/
        CHARINDEX /*Begin Second argument*/
        (
            ':', 
            m.text
        ) + LEN(':'), /*End Second argument*/
        CHARINDEX /*Begin Third argument*/
        (
            ':', /*CHARINDEX of the first : after...*/
            m.text, 
            CHARINDEX /*The CHARINDEX of the first : in the string...*/
            (
                ':', 
                m.text
            ) + LEN(':')
        )
        - LEN(':') /*Minus 1, effectively*/
        - CHARINDEX(':', m.text) /*Minus the CHARINDEX of the first : in the string*/
        /*End Third argument*/
    ) AS parsed_string
FROM sys.messages AS m
WHERE m.language_id = 1033
AND   m.text LIKE N'%:%:%';

Which finally gives us what we’re asking for:

all smiles

Now, this may not make you totally happy. After all, there are still leading and trailing spaces on each line.

If you want to get rid of those, you can either adjust your LEN functions, or you can call TRIM, LTRIM/RTRIM on the final result. I don’t care, really.

They’re your strings. They shouldn’t even be in the database.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Geek Talk #5 on Query Tuning by Erik Darling

Live Fish


I recently had the pleasure of appearing on Geek Talks! You can view the episode here:

And also on the SQL Server Geeks website.

Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Reasons Your Query Is Slow Right Now: Changes To SQL Server Settings

You And What Change Management Board?


I know, it sounds foolish. Trivial, even. Don’t let people make changes without permission.

But the number of times I’ve been reviewing server hardware and settings with people, and they’ve started head scratching about how something ended up the way it did is fairly astounding.

There’s almost no way to audit every since change, either.

Fun, right? SQL is fun. That’s why everyone acts like it’s terrible. We wanna keep the fun to ourselves.

Sticker Shock


There are a lot of things that can change under the covers.

One fairly sneaky thing is if someone makes sp_configure changes without running configure, that kick in after a reboot.

Unexpected failovers, patching cycles that require reboots, and all sorts of other bump-in-the-night incidents can lead to this.

On top of that, there are just so many settings that can cause erratic performance, it’d be disingenuous of me to list them all here, even though it’d probably be great for SEO.

Instead, have a scroll through this list. See what I mean?

You’re gonna have a hard time remembering to keep track of what all your settings are now, and if they’re different in the middle of a performance storm. I don’t usually tell people to run sp_Blitz when that’s going on, but it can be a good sanity check, too.

Not all performance problems are caused by the robots. Some are caused by the humans, too.

For a general check, I’ll run it like this:

sp_Blitz @CheckServerInfo = 1;

You might be able to get to the root cause of things pretty quickly.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Reasons Your Query Is Slow Right Now: Unchecked SQL Server Maintenance

Chocolate Attack


I get it. You’re a good DBA.

No, you’re a great DBA.

You have alerts for when jobs fail, when they run long, and they’re all emailed to the ENTIRE COMPANY in beautifully formatted CSS.

Have a cigar.

But we all go on vacation sometime.

Overtime


One thing that can make a workload feel artificially sluggish is maintenance.

  • Index stuff: Lotsa work, sometimes blocking
  • Backup stuff: Native fulls of VLDBs, with compression
  • DBCC CHECKDB: Oh, you forgot to run this because you were busy rebuilding indexes

Darn. Gimme that cigar back.

Right now, I’ve got a DBCC CHECK, and a query workload, that both run for one minute and 20 seconds when they run alone.

Quick Queries
Doublefast DBCC CHECKDB

Overlap


The problem becomes when I run them both at the same time. The query workload runs for about 25 seconds longer, and CHECKDB runs for two and a half minutes.

That’s like, almost twice as long.

Ugly Queries
Dusty DBCC CHECKDB

Now, it’s okay if you don’t have all those alerts set up. I guess.

Because you can use sp_BlitzFirst to see what’s going on, and get warned:

EXEC sp_BlitzFirst @Seconds = 10, @ExpertMode = 1;
Out of time

If you can end maintenance tasks and have performance go back to normal-bad, congratulations! You’ve solved today’s problem.

Have a cigar.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Reasons Your Query Is Slow Right Now: Blocking, Blocking, Blocking

Lockzilla


Blocking sucks. SQL Server should have used an optimistic isolation level by default.

Sure, writers would still block each other, but think of all the time people could have saved not explaining NOLOCK hints.

  • Yes, they still take locks
  • Yes, it’s the same as READ UNCOMMITTED
  • Yes, sometimes they make queries faster without blocking

Wait, what?

Moving Right Along


There are some interesting things to say about blocking, I suppose, aside from that it sucks.

For instance, it can look really weird when there’s parallelism. And read queries can block write queries.

There, I’ve said interesting things about blocking. It still sucks.

But how do you diagnose it?

First, close Activity Monitor.

Doctor, Doctor


My favorite tool for diagnosing blocking is sp_WhoIsActive, and I’m going to show you three different ways to look at it.

The safest way

EXEC sp_WhoIsActive @get_task_info = 2, @get_additional_info = 1;

This isn’t the way most people do it, but it is the least impactful.

You get back the normal set of results:

sp_WhoIsActive
peachy

Cool, you can see queries, lock waits, and blocking sessions. But a short scroll to the right also brings you to this additional_info clickable column:

sp_WhoIsActive
oily water

It’s not so helpful for the query doing the blocking, but it’ll tell you what the queries being blocked are stuck on.

The usual way

EXEC sp_WhoIsActive @get_locks = 1;

This is the way I normally demo looking for blocking with it, because it is more useful to see what the blocking query is doing.

But I’ve also had it be slow. Really, really slow.

That can happen when there is JUST SO MUCH BLOCKING that it takes a long time to enumerate all of it.

But you’ll get back this lovely sort of detail:

sp_WhoIsActive
say it loud

Page locks. 102. Great.

The smart way

EXEC sp_WhoIsActive @find_block_leaders = 1, @sort_order = '[blocked_session_count] DESC';

It’s not always obvious by runtime which session is causing blocking, so you can use this command to find blocking chains, and order output by who’s doing the most blocking.

sp_WhoIsActive
troubleshot

Fixins


When you need to troubleshoot live blocking, these are the ways I usually check in on things with sp_WhoIsActive. It’s hard to beat.

Of course, long term, a monitoring tool, or logging the blocked process report to a table can help you when you’re not there hitting F5 live and in-person.

We’ve peeled back a couple scenarios where oddball things can slow a server down. Tomorrow we’ll look at a new one!

What’ll it be?

Your guess is not as good as mine.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Reasons Your Query Is Slow Right Now: Your SQL Server Is Overloaded

Oh Baby Another Series


Don’t you love lists? I love lists. They’re so nice and tidy, like a RegEx. Just the other night I was thinking how nice it’d be to write a RegeEx to detect RegEx patterns.

I didn’t sleep.

Anyway, SQL Server. Things. Things running slowly. What’s up with that?

In these posts, I’ll walk through some common scenarios, and how to look for them.

It might even be in the EXACT SAME VERY ORDER I LOOK FOR THEM!

Put your money away.

Reason 1: The Server Is Overloaded


If you go to open a new tab in SSMS and you hear a creaking sound, it might be time to revisit some of your hardware choices.

But that’s a problem for future you. Your problem is right now.

  • How to check it: sp_PressureDetector
  • What to look for: Stuck queries, Resource Usage, Wait Stats

By default, it’ll look at both CPU and Memory counters. If you don’t know what’s going on, just hit F5.

EXEC dbo.sp_PressureDetector;

If you’re hitting memory limits, things will look like this:


SQL Server Query Results
are you ready?

Some queries will have requested memory, but it won’t have been granted.

SQL Server Query Results
payday

Waiting queries will be stuck in a queue, waiting on RESOURCE_SEMAPHORE.

SQL Server Query Results
we’re having a good time

A significant amount of memory will have been granted out, and available memory will be on the lower side. You’ll also probably see the waiter_count column

If you’re hitting CPU limits, things will look like this:


SQL Server Query Results
negative creep

Available threads might be a negative, or very low number. Requests may be waiting on threads, and the number will pretty closely match the number of rows that are…

SQL Server Query Results
bum

Waiting in the pool.

SQL Server Query Results
she works hard for the money

This’ll also show you queries that are running, and which ones are using the most threads.

Can You Fix It Quickly?


Maybe, maybe not. If you’re allowed to kill off queries, you might be able to right the ship now. If not, you’re stuck waiting for queries to finish and give back their resources.

Longer term, you have a long checklist to go through, including asking tough questions about hardware, settings memory and parallelism settings, and your queries and indexes. This script will give you a lot of information about what’s going on. It’s up to you to figure out why.

If you need help with this sort of thing, drop me a line.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Why Microsoft Should Get Rid Of SQL Server’s Plan Cache And Use Query Store Instead

Used To Love Her


I used to think the plan cache was so cool.

  • You can find queries that aren’t good there
  • Plans are full of details (and XML)
  • Supporting DMVs give you extra insights about resource usage

But most of the time now, I’m totally frustrated with it.

It clears out a lot, plans aren’t there for some queries, and the plans that are there can be very misleading.

Can you really tell someone what their worst performing queries are when everything in there is from the last 15 minutes?

No.

Why I Like Query Store Better


Plans end up there for more definite amounts of time, even if you stick a recompile hint on them.

Plus, you get a whole lot more information about what went on with all the different query metrics. As far as a source of “what happened and when”, it’s much more reliable. It’s almost like having a monitoring tool in SQL Server.

Do you trust your plan cache to tell you what was slow more than a couple days ago?

One really big bonus is that it’s portable, and it’s per-database. You can backup or clone a database so someone else can easily look at it somewhere else.

On top of that, it’s more usable: there are built in reports, a graphical interface, settings you can tweak, and plans you can force with a couple clicks.

You could solve a lot of frustrations by moving things here.

Microsoft Has Work To Do, Though


There have been a number of issues, and unanswered user requests for Query Store over the years, and not all of the associated DMVs have full parity.

Querying the data also needs to be made less prone to performance issues. And, living in the future, it’s a much more sensible location for the last known actual plan.

A nice to have would be predicate normalization for identical and morally equivalent queries, with totals and averages for resource usage, so you could group things together. Especially for morally equivalent queries, this would allow plan forcing where the queries get different ids.

Some monitoring tools have this in their interface, and the number of times I’ve been able to track down outliers within a group of executions to solve a problem is, well, at least memorable to me.

I wish more people had monitoring tools.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

Ways To Participate In The Data Community Instead Of Speaking

It’s Hard Out There For A Pup


When you’re first getting started speaking, you might find it difficult to get picked, or find events to speak at where the schedule works for you.

Event organizers, or people who vote to select sessions might not recognize your name yet. If you’re presenting about a topic in a crowded field, it can be difficult to get noticed unless you gin up awesome session titles and abstracts.

Some groups meet during the day when you might have work obligations you can’t step away from, and some might happen at night where you have family obligations you can’t step away from.

Or, like, you might still be working.

Who knows, but I hope not.

If you want to participate in the community in ways that can get you some name recognition, or fit in your schedule (or both!), here are some ways to do it.

Answer Me


Here are some good Q&A sites:

Answering questions on these is great experience, because there’s such a wide variety of topics. Heck, even just reading Q&A can be valuable.

There are some downsides, though: Many questions are poorly asked. They’re not bad questions, but there’s either so little information or so many local factors at play, it can be frustrating.

You gotta learn early to walk away from those.

Painted Word


If you’re the strong, silent type, you can take your presentation and turn it into a series of blogs. Not everyone enjoys interacting publicly like that, and that’s a very valid way to feel.

I know it sounds crazy — why would someone come to your session if the material is available online, at their convenience?

Because watching someone do something brings life to it. Do you like reading your favorite movie script as much as you like watching your favorite movie?

I don’t.

The other nice thing is that you can write really detailed blog posts, which takes some of the strain off having every ounce of minutiae in your session. You can point people to your posts once you’ve covered the most important parts of your topic.

There’s also a lot of value in writing your material out in long form. You’ll notice all sorts of extra things you want to explore, and you have the space to do it.

Be Kind, Rewind


If you have the setup to do it, because for some weird reason you had to start working from home full time, it can be fun to record your session.

It forces you to deliver the whole thing, say it out loud, and figure out how you want to say things.

I’m gonna be honest with you: the way you write things down isn’t gonna be the same way you say them out loud.

If you’re happy with it, put it on YouTube.

Why These Things?


Remember up at the top of the post where I talked about name recognition? All of these things can help establish that.

And look, I’m not saying you have to do any of these things. You’re under no obligation. But if you want to get into speaking, these are all activities that can help you get started, and make you better at it.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

The Cheapskate’s Guide To Accessing Data In SQL Server

Locality


Every action has some latency attached to it. Depending on how far you have to go, it could be a lot or a little.

And of course, it also depends on some situations you might run into along the way.

That’s one reason why batch mode can introduce such a performance improvement: CPU instructions are run on batches of rows at a time local to the CPU, rather than a single row at a time. Less fetching is generally a good thing. Remember all those things they told you about cursors and RBAR?

For years, I’ve been sending people over to this post by Jeff Atwood, which does a pretty good job of describing those things.

But you know, I need something a little more specific to SQL Server, and using a slightly different metric: We’re gonna assign $Query Bucks$ to those latencies.

After all, time is money.

Gold Standard


Using a similar formula to Jeff’s, let’s look at how expensive it gets once you cross from memory to disk.

mucho dinero

See the end there, when you jump from nanoseconds to microseconds? At those prices, you start to understand why people like me tell you to solve problems with more memory instead of faster disks. Those numbers are for local storage, of course, and main memory is still leaving Road Runner clouds around it.

If you’re on a SAN — and I don’t mean that SAN disks are slower; they’re not — you have something else to think about.

What I mean is the component in between that can be a real problem: The N in SAN. The Network.

just a little bit

If you add latency in just the milliseconds, costs pretty quickly jump up into numbers that’d make athletes blush. And if you’ve ever seen those 15 second I/O warnings in the error log…

This is where a lot of people under-provision themselves into nightmare. 10Gb Ethernet connections can move data fairly quickly, at around 1.2 GB/s. Which is great for data that’s easily accounted for in 100s of MB. It’s less great for much bigger data, and it’s worse when there’s a lot of other traffic on the same network.

Sensitivity


Competition for these resources, which is really common for database workloads that often have multiple queries all reading and writing data simultaneously, can take what would be an otherwise fine SAN and make it look like a tarpit.

You have to be really careful about these things, and how you choose to address them when you’re dealing with SQL Server.

Standard Edition is in particularly rough shape, with the buffer pool being a laughable 128GB. In order to keep things tidy, your indexes really need to be spot on, so you don’t have unnecessary things ending up there.

The more critical a workload is, the more you have riding on getting things right, which often means getting these numbers as low as possible.

Hardware that’s meant to help businesses consolidate isn’t always set up (or designed) to put performance first. Once you start attaching prices to those decisions that show how much time they can cost your workloads is a good way to start making better decisions.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.