SQL Server Cursors Are Often Misunderstood

Posted on January 8, 2020 (updated May 14, 2022) by Erik Darling

Honk Honk

People often trash cursors even when they’re used for perfectly fine reasons.

I understand that this reaction may be because they’ve seen cursors misused in the past, or sometimes because they heard someone popular say it.

In either case, everything has a time and place, and there are many times when cursors aren’t the performance drain people make them out to be.

Reasonable Uses

Places where cursors don’t freak me out:

Maintenance scripts (backup, checkdb, etc.)
Building dynamic strings
Batching modifications
Passing per-thing parameters to a stored procedure
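
As a rough sketch of the maintenance-script case, a cursor that walks the user databases and builds one DBCC CHECKDB command per database might look something like this. The cursor and variable names, and the database_id > 4 filter for skipping system databases, are assumptions for illustration. The command is only printed here, not executed.

```sql
DECLARE
    @database_name sysname,
    @sql nvarchar(max);

/* LOCAL keeps the cursor scoped to this batch; FAST_FORWARD makes it forward-only and read-only */
DECLARE database_cursor CURSOR
    LOCAL FAST_FORWARD
FOR
SELECT
    d.name
FROM sys.databases AS d
WHERE d.state_desc = N'ONLINE'
AND   d.database_id > 4; /* assumption: ids 1-4 are the system databases */

OPEN database_cursor;

FETCH NEXT
FROM database_cursor
INTO @database_name;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'DBCC CHECKDB(' + QUOTENAME(@database_name) + N') WITH NO_INFOMSGS;';

    PRINT @sql; /* swap for EXECUTE sys.sp_executesql @sql; to actually run it */

    FETCH NEXT
    FROM database_cursor
    INTO @database_name;
END;

CLOSE database_cursor;
DEALLOCATE database_cursor;
```

There’s no set-based alternative to running one command per database, which is why a cursor is a reasonable fit for this kind of loop.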

It might shock you to find cursors in well-respected pieces of code, like sp_WhoIsActive. But if you crack open the procedure and search for “cursor”, you’ll find six of them that do different things. Do you still hate cursors?

What if I showed you Paul White his-very-self suggesting people use them?

Not to mention other luminaries and nobodies who have found reason to call upon the cursed cursors.

So What Then?

Should you start out most code by writing a cursor? Absolutely not.

Should you convert every cursor to a while loop? Ehhhhh.

Should you understand when you should or shouldn’t use a cursor? Absolutely.

Some people have had pretty good careers talking about knee-jerk reactions, and I think seeing a cursor declared elicits many knee-jerk reactions.

Read the code. Understand the requirements.

I tune queries all day long. The number of times someone has said THIS CURSOR IS A REAL BIG PROBLEM and been right is pretty small.

Often, there was a tweak to the cursor options, or a tweak to the query the cursor was calling (or the indexes available to it) that made things run in a more immediate fashion. I want to tune queries, not wrestle with logic that no one understands. Old code is full of that.

The number of times I’ve seen someone tell me they made something faster with totally broken logic and incorrect results is pretty high.

Thanks for reading!

Going Further

If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.

How Table Variables Prevent SQL Server From Using A Parallel Query Plan

Posted on January 7, 2020 (updated May 16, 2022) by Erik Darling

Well, huh

Using Column Store Indexes To Improve Unpredictable User Search Queries

Posted on January 6, 2020 (updated May 16, 2022) by Erik Darling

And Cough

Where Bitmaps Dare In SQL Server Query Plans

Posted on January 3, 2020 (updated May 14, 2022) by Erik Darling

I AIN’T NO

Eager Index Spools From Nonclustered Indexes In SQL Server

Posted on January 2, 2020 (updated May 14, 2022) by Erik Darling

Dangarang

You’re In No Shape To Be Reading

Posted on January 1, 2020 (updated May 16, 2022) by Erik Darling

Vitamin Advil

Dates Aren’t Strings

Posted on December 31, 2019 (updated May 14, 2022) by Erik Darling

Seriously.

Computed Column Follies In SQL Server

Posted on December 30, 2019 (updated May 14, 2022) by Erik Darling

Darnit

While helping a client out with a performance problem recently, I ran into something kind of funny when creating a computed column.

They were experiencing performance problems because of a join involving a substring.

Weird, right? Like, if I tried to show you this in a presentation, you’d chase me out of the room.

But since they were nice enough to hire me, I went about fixing the problem.

Computer Magic

The “obvious” fix (and I apologize if this isn’t obvious to you, dear reader) was to add a computed column to work around the issue.

Adding a computed column gives the expression that you’re generating on the fly and trying to join on a home in the table, which means it can be indexed. That matters because manipulating column data while you’re joining or filtering on it is generally a bad idea, even if you can sometimes get away with it.

But here’s something that messed me up, a uh… seasoned database professional.

The query was doing something like this (not exactly, but it’s good enough to get us moving):

SELECT COUNT(*) AS records
FROM dbo.Users AS u
WHERE SUBSTRING(u.DisplayName, 1, LEN(u.DisplayName) - 4)
 = SUBSTRING(u.DisplayName, 1, LEN(u.DisplayName) - 4);

Matching strings from the beginning to the end minus four characters.

I wanted to look smart, so I did this:

ALTER TABLE dbo.Users 
    ADD DisplayNameComputed
	    AS SUBSTRING(DisplayName, 1, LEN(DisplayName) - 4);

I didn’t persist it right away, both because that can lock the table and take longer, and because I knew I was going to index it.

The problem is that when I tried to index it:

CREATE INDEX dummy 
    ON dbo.Users(DisplayNameComputed);

I got this error:

Msg 537, Level 16, State 3, Line 21
Invalid length parameter passed to the LEFT or SUBSTRING function.

And when I tried to select data from the table, the same error.

In the real query, there was a predicate that filtered out values with too few characters, but it was impossible to apply that filter to the index.

There are also other restrictions on filtered index WHERE clauses: you can’t use LIKE ‘____%’, or LEN(col) > 4, etc.

Case Of Mace

Having done a lot of string splitting in my life, I should have been more defensive in my initial computed column definition.

What I ended up using was this:

ALTER TABLE dbo.Users 
    ADD DisplayNameComputed
	    AS SUBSTRING(DisplayName, 1, LEN(DisplayName) 
		- CASE WHEN LEN(DisplayName) < 4 THEN LEN(DisplayName) ELSE 4 END);

A bit more verbose, but it allowed me to create my computed column, select from the table, and create my index.
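
To see the difference without touching a real table, here’s a small sketch using a hypothetical temp table (#Users and its two sample rows are made up for illustration). The commented-out query is the one that would hit Msg 537 on the short name, since LEN minus 4 goes negative. The second query uses the same CASE expression as the computed column above.

```sql
/* Hypothetical scratch table standing in for dbo.Users */
CREATE TABLE #Users
(
    DisplayName nvarchar(40) NOT NULL
);

INSERT #Users (DisplayName)
VALUES (N'Erik Darling'), (N'Bob');

/*
This version fails with Msg 537 for N'Bob', because LEN(N'Bob') - 4 is negative:

SELECT SUBSTRING(u.DisplayName, 1, LEN(u.DisplayName) - 4)
FROM #Users AS u;
*/

/* Never subtract more characters than the string has, so the length can't drop below zero */
SELECT
    u.DisplayName,
    SUBSTRING
    (
        u.DisplayName,
        1,
        LEN(u.DisplayName)
          - CASE
                WHEN LEN(u.DisplayName) < 4
                THEN LEN(u.DisplayName)
                ELSE 4
            END
    ) AS DisplayNameTrimmed
FROM #Users AS u;

DROP TABLE #Users;
```

Names shorter than four characters come back as an empty string rather than an error, which may or may not be what you want for a join.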

AND THEY ALL LIVED HAPPILY EVER AFTER

Just kidding, there was still a lot of work to do.

Thanks for reading!

Implicit Conversion Is A SARGability Problem, That’s Why It’s A Performance Problem In SQL Server

Posted on December 27, 2019 (updated May 16, 2022) by Erik Darling

Concerns

If you compare the problems that non-SARGable queries cause with the problems that bad implicit conversions cause, it’s an identical list.

Increased CPU
Inefficient use of indexes
Poor cardinality estimation
Maybe a bad memory grant based on that
Some “row by row” event

Though we often bucket the problems separately, they’re really the same thing.

That’s because, under the covers, something similar happens.

If you replace “CONVERT_IMPLICIT” with any other function, like ISNULL, COALESCE, DATEADD, DATEDIFF, etc. you may see the same performance degradation.
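
Here’s a minimal sketch of the same idea with DATEDIFF standing in for CONVERT_IMPLICIT. The temp table, its index name, and the sample rows are assumptions made up for illustration. Both queries ask for rows created within the last 30 calendar days, but only the second one leaves the column alone so that a seek on the index is possible.

```sql
/* Hypothetical scratch table with an index on the column being searched */
CREATE TABLE #Users
(
    Id integer NOT NULL PRIMARY KEY,
    CreationDate datetime NOT NULL,
    INDEX ix_CreationDate NONCLUSTERED (CreationDate)
);

INSERT #Users (Id, CreationDate)
VALUES
    (1, DATEADD(DAY, -5, GETDATE())),
    (2, DATEADD(DAY, -45, GETDATE())),
    (3, DATEADD(DAY, -400, GETDATE()));

/* Function wrapped around the column: the expression generally has to be evaluated for every row */
SELECT
    COUNT_BIG(*) AS records
FROM #Users AS u
WHERE DATEDIFF(DAY, u.CreationDate, GETDATE()) <= 30;

/* Same logic with the date math moved to the other side, leaving the column bare */
SELECT
    COUNT_BIG(*) AS records
FROM #Users AS u
WHERE u.CreationDate >= DATEADD(DAY, -30, CONVERT(date, GETDATE()));

DROP TABLE #Users;
```

With three rows you won’t see a performance difference, but comparing the two query plans should show where the predicate ends up in each.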

Probably not the most thought provoking thing you’ve ever heard, but if you understand why one is bad and not the other, this may help you.

Thanks for reading!

Column Selectivity And SQL Server Index Design Patterns

Posted on December 26, 2019 (updated May 14, 2022) by Erik Darling

The Memes On The Blogs Fall Mainly On The Aughs

Unless your column is unique, and defined as unique, and people are searching it with equality predicates (and I don’t mean column = column, I mean column = value), it might not be a great first column in your index. Many unique columns I see are identity columns that don’t necessarily define a relationship or usable search values. They’re cool for keeping the clustered index sane, but no one’s looking at the values in them.

The problem with the advice that you should “always put the most selective column first” is that not many columns are uniformly selective. For some ranges, they may be selective, for other ranges, they may not be.

Let’s look at some examples.

Usery

Let’s look at some tables in the Stack Overflow data dump. I realize this isn’t a perfect data set, but it has a lot of things in common with data sets I see out in the world.

The site has gotten more popular over time, so year over year dates become less selective
The site has definite groups of “power users” and “one and done” users
Certain site activities are more common than others: votes cast, types of posts made
Certain user attributes, like badges, are more common than others

All of these patterns are generally observable in real world data, too. Growth is a near constant, and with growth come lumpy patterns.

[Image: SQL Server Management Studio query results, captioned “Not like the others”]

Looking at the significant differences in counts here, the top vote type (an upvote) has 37 million entries. The next most popular one has 3.7 million.

Is either of those selective? No.

But when you get down to the bottom, you reach some selectivity.
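
If you want to check this kind of skew yourself, a simple grouped count is usually enough. This sketch assumes a copy of the Stack Overflow database with a dbo.Votes table and a VoteTypeId column. It is not necessarily the query behind the numbers quoted above.

```sql
/* Count rows per vote type, most common first, to see how lumpy the distribution is */
SELECT
    v.VoteTypeId,
    COUNT_BIG(*) AS records
FROM dbo.Votes AS v
GROUP BY v.VoteTypeId
ORDER BY records DESC;
```

The same pattern works for any column you’re thinking about leading an index with: swap in the column and look at how far apart the top and bottom of the list are.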

The dates the votes were cast become less selective over time, too.

Within User Reputations, things become skewed towards the bottom end.

Though the site gets more users overall, Reputation is still largely skewed towards power users.

What Does This Mean For You?

Don’t assume that just because you search for something with an equality that it’s the most selective predicate.
Don’t assume that any search will always be selective (unless the column is unique).
Don’t assume that the most selective predicate should always be the first column in an index; there are other query operations that should be considered as well.

I can’t count the number of times that someone has told me something like “this query is fast, except when someone searches for X”, or “this query is fast, except when they ask for a year of data”, and the solution has been creating alternate indexes with key columns in a different order, or flipping current index key columns around.
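
As a sketch of what “flipping key columns around” can look like, here are two indexes on the same pair of columns in opposite orders. The index names are hypothetical, and the example assumes the Stack Overflow dbo.Votes table with VoteTypeId and CreationDate columns. Which one helps depends on the searches you actually run.

```sql
/* May suit searches for a rare vote type, optionally narrowed by date */
CREATE INDEX ix_Votes_VoteTypeId_CreationDate
    ON dbo.Votes (VoteTypeId, CreationDate);

/* May suit searches for a narrow date range, optionally narrowed by vote type */
CREATE INDEX ix_Votes_CreationDate_VoteTypeId
    ON dbo.Votes (CreationDate, VoteTypeId);
```

You probably don’t want both forever; test with the values people complain about, not just the ones that are already fast.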

Thanks for reading!