Okay, Look
Before I go on about how adding an index can fix your blocking and deadlocking problems, let me warn you about how indexes can cause you problems.
Without fail, whenever I suggest adding indexes to clients, someone gets real worried about write speeds.
Like, the server can be burning down, spurting blood everywhere, and cursing our mothers in ancient Aramaic, but no, let’s stop and worry about write speeds going up by a few milliseconds.
You have to add quite a few indexes (20), and be modifying quite a few rows (millions) before the number of indexes really starts to hurt “write speeds”.
I haven’t seen a server whose biggest problem was write speeds (outside of Managed Instance and SQLDB) since spinning disks were measured in RPMs. The real problem I see many servers face from having “too many” indexes is increased locking.
The more indexes you add, the more you have to lock when modifications happen, even when you’re changing the same number of rows as you were before. You also increase your changes for lock escalation attempts.
Having a handsome young professional with reasonable rates (me) review your indexing is a good idea.
But you can end up with locking and deadlocking problems when you err in the opposite direction of “too many” indexes, especially if your modification queries don’t have good supporting indexes to help them find the data they wanna change.
It Started With A Scan
I never spent a summer at camp Scans-Are-Bad, but scans can tell us something important about modification queries.
In a very general sense, if the operators in your modification queries are acquiring data from the tables they need to modify via a scan, they’ll start by locking pages. If they start by seeking to rows in the table they need to modify, they’ll start by locking rows.
Once lock escalation thresholds are hit, they may attempt to lock the whole table. If the optimizer thinks your query needs to do a lot of work, it may use a parallel plan, which may increase the likelihood of lock escalation attempts.
Let’s say we have this query:
UPDATE p SET p.Score += 1000 FROM dbo.Posts AS p JOIN dbo.Users AS u ON u.Id = p.OwnerUserId WHERE u.Reputation >= 800000 AND p.Score < 1000;
And this is the query plan for it:
We would care very much about the Posts table being scanned to acquire data, because the storage engine doesn’t have a very granular way to identify rows it’s going to modify.
We would care so very much, that we might add an index like this:
CREATE INDEX p ON dbo.Posts (OwnerUserId, Score) WITH (MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
To get an execution plan that looks like this:
Now, this isn’t awesome, because we’re updating the Score column, and we need the Score column in our index to find the rows we care about, but if the query goes from taking 5 seconds to taking a couple hundred milliseconds, we’re going in the right direction.
There is reasonably sound advice to avoid indexing columns that change a lot, but if you need to find rows to modify in columns that change a lot, you may need to follow the also quite reasonably sound advice to make your queries faster so there’s less chance for them to become entangled.
The longer modification queries run for, the longer they have to interfere with other queries.
It Started With A Tran(saction)
I often see transactions (ab)used by developers. Here’s a lazy example:
BEGIN TRAN; UPDATE b SET b.UserId = 138 FROM dbo.Badges AS b WHERE b.Date >= '2010-12-25' AND b.Date < '2010-12-26'; /*Go do some other stuff for a while, I hear ACID is good.*/ COMMIT; /*No error handling or anything, just screw it all.*/
The query plan for this update will look simple, harmless, never did nothin’ to nobody. But while we’re doing all that ~other stuff~ it’s holding onto locks.
This isn’t always the end of the world.
Right now, the only index is the clustered primary key. Some queries will be able to finish immediately, as long as they’re located in they’re in the clustered primary key prior to the rows that are being locked.
For example, this query can finish immediately because the date value for its row is a touch before the pages we’re updating.
SELECT b.* FROM dbo.Badges AS b WHERE b.Id = 1305713;
But this query, and any queries that search for an Id value on a locked page, will be blocked.
SELECT b.* FROM dbo.Badges AS b WHERE b.Id = 1306701
That Id value is just outside the range of dates we’re modifying, but because the storage engine is locking pages and not rows, it has to protect those pages with locks.
With this index, both queries would be able to finish immediately, because the storage engine would know precisely which rows to go after, and a more granular locking arrangement (rows instead of pages) would be available.
CREATE INDEX woah_mama ON dbo.Badges (Date) WITH (MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
In this example, we’re not updating the Date column, so we don’t have to worry about the sky falling on write speeds, or write speeds crying wolf, or a thousand other cautionary tales about overreaction.
NOLOCK, And Other Drugs
Let’s say we’re idiots unaware of how bad uncommitted reads are, and we decide to use NOLOCK hints to avoid our select queries being blocked.
SELECT b.* FROM dbo.Badges AS b WITH(NOLOCK) WHERE b.Id = 1306700;
While that transaction is open, and we’re still doing all those other highly ACIDic things to our database, this query will return the following results:
We can see the in-flight, uncommitted, change from the update.
Maybe we return this data to an end user, who goes and makes a really important, life-altering decision based on it.
And then maybe something in that transaction fails for some reason, and everything needs to roll back.
That really important, life-altering decision is now totally invalidated, and the end user’s life doom-spirals into sure oblivion.
And it’s all your fault.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.