Announcing Free SQL Server Performance Monitoring That Doesn’t Suck

Announcing Free SQL Server Performance Monitoring


For years, enterprise SQL Server monitoring tools have been charging you an arm and a leg for stagnating products made by people who have never done an ounce of SQL Server performance tuning or DBA work. They phone home with your data, lock you into annual contracts, and still can’t tell you why your server is slow without you doing half the work yourself.

That is, if you survive the installation process. Active Directory. Domain accounts. Firewalls. Security policies. Additional database servers. Annoying.

Or just send everything to the cloud, somewhere.

I got real tired of dealing with that, so I built a free, open-source SQL Server performance monitoring tool that isn’t any of those things, and doesn’t suck.

You can find it on GitHub.

SQL Server Performance Monitor — Dashboard landing page with server health cardsFree SQL Server Performance Monitoring
NOC-out

Here’s what it does, and why you might want to check it out.

The Short Version


SQL Server Performance Monitor is a free tool that collects and visualizes performance data from your SQL Servers.

It comes in two editions:

  • Full Edition — installs a PerformanceMonitor database on each server with T-SQL collectors running via SQL Agent. A separate WPF Dashboard connects to view everything in that database, and that database only. It can’t see anything beyond that.
  • Lite Edition — a single desktop app that monitors SQL Servers remotely. Nothing gets installed on the target server. Data gets pulled into a local embedded DuckDB, with automatic Parquet archival.

At no point does your monitoring data go anywhere outside of your server or your computer, and the connection method is the same one SSMS uses.

Both editions include real-time alerts (system tray & email), pretty charts and graphs, and a built-in MCP server for LLM-powered analysis with tools like Claude.

Supported platforms:

  • SQL Server 2016, 2017, 2019, 2022, 2025
  • Azure SQL Managed Instance
  • AWS RDS for SQL Server
  • Azure SQL Database (Lite only)

Download on GitHub

Who Is This For?


Not everyone needs enterprise-scale monitoring. If any of these sound familiar, this tool is for you:

  • Solo DBAs and small teams: You’re responsible for a handful of SQL Servers. You need to know what’s happening, but you don’t need a platform that costs more than the servers it monitors and only tells you when PLE is low and how many KB of free space your disks have.
  • Consultants and contractors: Drop it on a client’s server, collect real data, diagnose their problems without having to spend time on data collection. The Lite Edition doesn’t even require installation on the target, just connect and go.
  • Developers who care: Your app is slow and you suspect the database (as everyone does). Point this at your server, run your workload, and see exactly which queries are hurting your server.

Two Editions, One Goal


Full Edition (Server-Installed)

Full Dashboard with wait stats, CPU trends, and blocking chainsFree SQL Server Performance Monitoring
Actual Colors May Vary

The Full Edition installs a PerformanceMonitor database directly on each SQL Server you want to monitor. T-SQL collector stored procedures run via SQL Agent, gathering everything from wait stats and query performance to blocking chains and deadlock graphs and more.

What you get:

  • 30+ specialized T-SQL collectors running every 5 minutes (you can control the schedule, and which ones run)
  • Data stored in SQL Server tables you control (query them directly in SSMS or whatever)
  • Community tools installed automatically (sp_WhoIsActive, sp_BlitzLock, sp_HealthParser)
  • NOC-style landing page with green/yellow/red health cards per server
  • Automatic data retention (30 days default, configurable per table)
  • Built-in MCP server with read-only tools for AI analysis (disabled by default)

Best for: Production servers where you want continuous 24/7 collection with maximum data fidelity and access.

Requirements: sa on the target server for database install, SQL Server Agent running. Collectors only need VIEW SERVER STATE or VIEW PERFORMANCE STATE.

Lite Edition (Standalone Desktop App)

Lite Edition main window with sidebar and query performance chartsFree SQL Server Performance Monitoring
Lite Brite

The Lite Edition is a single WPF application that monitors SQL Servers remotely. It queries DMVs directly over the network and stores data locally in embedded DuckDB with automatic Parquet archival. Nothing is installed on the target server.

What you get:

  • 20+ collectors with configurable schedules
  • Local DuckDB storage with ZSTD-compressed Parquet archives (~10x compression)
  • Multi-server monitoring from one seat
  • Azure SQL Database support (the only edition that supports it)
  • Single executable — extract, run, connect
  • Built-in MCP server with 31 read-only tools for AI analysis

Best for: Quick triage, Azure SQL Database, locked-down servers where you can’t install anything, consultants, firefighting.

Requirements: VIEW SERVER STATE on the target server (VIEW DATABASE STATE for Query Store), or just VIEW PERFORMANCE STATE where supported.

Side-by-Side Comparison


Feature Full Edition Lite Edition
Installation on SQL Server Required (database + jobs) Not required
Collectors 30+ (via SQL Agent) 20+ (in-app)
Data Storage SQL Server (on target) DuckDB + Parquet (local)
Azure SQL Database Not supported Supported
Alerts (tray + email) Blocking, deadlocks, CPU Blocking, deadlocks, CPU
Agent Job Monitoring Duration vs avg/p95 Duration vs avg/p95
MCP Server (AI) Yes Yes
Community Tools sp_WhoIsActive, sp_BlitzLock Not needed
Execution Plans Collected, downloadable Collected, downloadable
Portability Server-bound Single executable

Alerts That Actually Tell You Something


Alert email notification showing blocking details with query textFree SQL Server Performance Monitoring
You oughta know.

Both editions include a real-time alert engine that runs independently of which tab is visible, even when the app is minimized to the system tray.

What gets monitored:

  • Blocking: fires when the longest blocked session exceeds your threshold
  • Deadlocks: fires when new deadlocks are detected
  • High CPU: fires when total CPU exceeds your threshold
  • Connection changes: fires when a monitored server goes offline or comes back online

How you get notified:

  • System tray balloons with a 5-minute per-metric cooldown
  • SMTP email with styled HTML, query text, blocking chains, and deadlock graph XML attached for offline analysis
  • Resolved alerts: automatic “Cleared” notification when the condition ends

All thresholds are configurable. You can acknowledge and silence alerts per-server and per-tab with a right-click.

MCP Server: AI-Powered Analysis


Claude analyzing SQL Server performance data via the built-in MCP serverFree SQL Server Performance Monitoring
Thanks for the summaries

Both editions include a built-in Model Context Protocol (MCP) server. Enable it in Settings, register it with Claude Code or Cursor (but definitely not Co-Pilot).

The MCP server only works off of a set of read only tools, and can only look at your collected performance data.

They can’t see anything in any other databases, or execute custom SQL commands.

You can ask questions like:

  • “What are the top wait types on sql2022?”
  • “Are there any blocking or deadlock issues?”
  • “Show me CPU utilization for the last 4 hours”
  • “What are the most expensive queries by CPU?”
  • “Give me a full health check across all servers”

They cover wait stats, query performance, CPU, memory, blocking, deadlocks, I/O latency, TempDB, perfmon counters, and more.

Setup is one command:

claude mcp add --transport http --scope user sql-monitor http://localhost:5151/

Your Data, Your Tables


With the Full Edition, everything lands in SQL Server tables you own. No APIs. No export wizards. Open SSMS and write a query.

-- What were my top waits yesterday?
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    avg_wait_ms
FROM PerformanceMonitor.collect.wait_stats
WHERE collection_time >= DATEADD(DAY, -1, GETDATE())
ORDER BY 
    wait_time_ms DESC;

Reporting views are included for common questions, but you’re not limited to them. Join the collection tables however you want.

Build Power BI reports?

Export to Excel?

Something with Fabric?

I don’t care.

It’s just tables.

What Gets Collected


The Full Edition runs 30+ collectors via SQL Agent. The Lite Edition runs 20 collectors directly in the app. Here’s what they cover:

Category What’s Collected
Query Performance Query stats from plan cache, procedure stats, Query Store metrics, active query snapshots, waiting tasks
Wait & Resource Stats Wait statistics (delta-based), latch statistics, spinlock statistics, CPU utilization, perfmon counters
Memory & I/O Memory grants, memory clerks, buffer pool pressure, file I/O latency, TempDB usage
Blocking & Events Blocked process reports, deadlock graphs, ring buffer events, default trace analysis, system health events

All collectors compute deltas automatically — you see actual work done between snapshots, not cumulative totals that reset on service restart.

Security: We Care A Lot


You’re giving a monitoring tool access to your SQL Servers (I hope, anyway). I did my best to make sure we’re not doing or allowing anything stupid.

What it does:

  • Stores passwords in Windows Credential Manager (DPAPI encryption), never in config files or plain text. Same as SSMS.
  • SQL connections default to mandatory TLS with certificate validation
  • Every SQL query uses parameterized queries, zero string concatenation
  • All monitoring queries run at READ UNCOMMITTED to avoid blocking your production workload
  • Every line of code is open source on GitHub for you to audit, make fun of, steal, fall in love with, etc.

What it doesn’t do:

  • No telemetry: zero analytics, tracking, or usage reporting
  • No phone-home: the app never contacts any external server
  • No cloud dependency: everything stays on your machines and your SQL Servers
  • No auto-updates: you choose when to update, and if you want to check for updates
  • No hidden network calls: the only outbound traffic is SMTP email alerts you configure yourself

Getting Started


Full Edition

  1. Download the installer from GitHub Releases
  2. Run the GUI or CLI Installer and enter your server connection details
  3. Click Install, it creates the PerformanceMonitor database and SQL Agent jobs
  4. Repeat for each server you want to monitor
  5. Launch the Dashboard, add your servers, and data appears immediately

Lite Edition

  1. Download and extract the Lite Edition ZIP from GitHub Releases
  2. Run PerformanceMonitorLite.exe
  3. Click “Add Server” and enter your connection details
  4. That’s it. Data collection starts within 1-5 minutes.

Why I Built This


I spend a lot of time looking at SQL Server performance problems. Clients call me when things are on fire, and the first thing I need is data. What’s waiting, what’s blocking, what’s eating CPU, which queries are the worst offenders, and a bunch of other contextual information to go along with it.

If they don’t have a monitoring tool, we’re at the mercy of aggregated metrics, restarts, and maybe just not looking at the right thing at exactly the right time. It’s also really hard to show improvements over time.

Most monitoring tools are either expensive enterprise platforms, or one-off scripts that don’t persist anything (and certainly don’t give you any pretty graphs to look at).

I wanted something in between: a real monitoring tool that collects meaningful data, shows it clearly, and doesn’t require a procurement process to install.

So I built one. And now you can use it too.

Support & Professional Services


This project is free and open source. Use it freely. No strings attached.

If you find it valuable, here are two optional ways to support continued development:

  • Sponsor on GitHub — fund new features, ongoing maintenance, and SQL Server version support
  • Consulting & Training — need help analyzing the data this tool collects? Want expert assistance fixing the issues it uncovers? That’s what I do.

Download SQL Server Performance Monitor

MIT License | .NET 8 | Source on GitHub

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: A Little About Embeddings

Get AI-Ready With Erik: A Little About Embeddings


Video Summary

In this video, I dive into the world of embeddings and their importance in bridging the gap between what computers can do and what humans need them to do. I explain why traditional string comparisons fall short when it comes to understanding the meaning behind text, especially for tasks like optimizing SQL queries or improving database performance. By converting text into numerical vectors, embeddings allow us to compare texts based on their semantic similarity rather than just surface-level differences, making it possible for computers to understand and process natural language more effectively.

Full Transcript

Erik Darling here. Back in my AI groove. Today we’re going to talk a little bit about embeddings. Yesterday we talked a little bit about dimensions and how embeddings make up dimensions and stuff like that. So today we’re going to talk about why we have embeddings. What is an embedding and why do we need it? So that’s going to be our goal today is to just go over that a little bit. Now, the thing is that computers are great. at comparing things like numbers, right? Which is what embeddings are made up of all those little floaty things that we looked at. But computers are very, very bad at understanding meaning, right? And unless we get into all sorts of terrible, like, like wild card percent, like searching in our, our string columns, which we would like to not do, then like there is really no good way to compare text to see how similar that text is, right? We all we can do is to compare text to see how similar that text is, right? We all we can do is to compare text to see how similar that text is, right? So what we can do is say is this word in this text, or are these words, like is this pattern in this text, even regex can’t, can’t tell you what something means. Unless we don’t want that problem. So if we were to take the phrases, how do I optimize a SQL query? And what’s the best way to speed up my database queries, if we were to take those strings, the computer would not be able to really compare those strings in a meaningful way for like what their what their intent is, what they mean to us, right? All the computer can do is compare the little bits and bobs in each string and say, hey, do they match? Are they are they are they equal? So to a computer doing string comparison? You know, those are completely different strings, right? Those strings are not equal to each other. But to a human, they’re kind of asking the same question. Embeddings are to try and bridge that gap. Now, you might hate AI, right? You like I mean, I’m, you know, pretty, gosh, diggity darn sick of hearing about AI everything and having everyone shove an AI thing in their, their AI in their in their application or in their product and saying, Oh, look, I changed this page from loading to thinking and now I’ve got AI. But like, if like, as a DBA, you should be into this or like, even as a developer, you should be into this because like, you can avoid doing all that, like really painful, like wild card string search stuff in a lot of places.

So like, if we were to take these two strings, right? Again, how do I optimize a SQL database query? And what’s the best way to speed up my database queries? The computer can’t figure that out, right? Like it just says that these things are different, right? These are not the same string dummy, right? But they mean the same thing, right? They have the same general meaning. Even if you were to do like, you know, some awful like search, you know, with a bunch of wild cards in there, which SQL is awful at, or even if you get into like full text search, these strings are not at all equivalent, right? There’s no match, right?

It’s terrible. So what embeddings do is turn text into numbers. So how do I optimize a SQL query? Well, I mean, you know, just to jump ahead a little, if your embedding model generates 1024 dimensions, then you’re embedding, then that string will become 1024 numbers, right? It’ll be that sort of square, at least in SQL Server and how it shows you these things as a little square bracketed list of various floats and bibs and bobs. And what’s the best way to speed up database queries would also become 1024 numbers. And those 1024 numbers, because the way that the embedding models are all trained, and like the billions and billions of texts that they’ve seen, these two things generate numbers that are pretty close to each other, they’re not going to be exact.

But again, we’re not trying to find exact here. It’s not like if you were searching for an error code, like I’m hitting error 0x, you know, 80085. And you want an exact match for that error, because if you get like error 80086, it’s going to be different, right? It’s not the same error. So like, it’s not an exact match we’re looking for. We’re looking for things that are similar to each other.

So if we were to compare, like SQL Server, like, like the embedding model would take these two strings, turn them into numbers, and then those numbers would become how we figure out how similar they are. And again, the lower number you get using the cosine thing with vector distance, the better off you are. So this is a very, very similar one, because this number is getting pretty close to zero, right?

Like, so this would be a pretty good indication that these strings are similar. If like, you know, if this, if this five are way over here, or like, like, you know, we were like, you know, had like a one point something over here, that would not be very similar, because cosine goes from zero to two. Right? So we would not have a very similar match there.

Like, if we took these two strings, like, let’s say we had these two questions. We had, who is the governor of Campania during the Herculonius period? And how do I get pizza off my eyeglasses?

Completely different questions, right? Not even closer, like, like, we were in different worlds. So these would generate completely different embeddings. And these completely different embeddings would generate a cosine that is, you know, that is nearly two, that is nearly, that is almost as high as we can get, as distant as we can get with two, with two sort of text embedding vectors, right?

It’s very, very far apart in the world of cosines. But the whole thing is that each dimension is there to capture some aspect of meaning. What does, what does, what does this string of words mean?

What are we indicating here? What is our intent? Right? And at least nobody I know knows exactly what each dimension represents. That number, like, got me on that one, right?

Search me, baby. But I’m sure there’s, I’m sure someone out there is very, who is very, very smart and very, very good at math can tell you what each, they could probably read dimensions the way, you know, some people can read, you know. Like, like, assembly code.

Like, okay, good for you. You know, some people can read, like, binary. Like, oh, like, that’s, that’s a 17. And you’re like, okay.

Okay, sure it is. But I’m sure someone, someone out there who’s very smart could, you know, explain to you what each dimension represents. But all of these different AI models, right? All of these models learn patterns from billions and billions of text examples, right?

It could be, you know, copyrighted internet material. It could be books they stole. It could be your website, right?

It could be your blog posts. It could be, I don’t know, like, anything, right? Like, all these text examples, these models get trained on them. You know, it’s sort of like, like, OCR stuff where, you know, the, like, different programs will get trained on different sets of images so that they can recognize something from those images. Like, you know, birds or bowling balls or, you know, Adidas t-shirts.

So it’s sort of like the same thing, but like, but just with text. So it’s sort of like, like GPS coordinates for meaning, right? So like New York and Boston, they’re pretty close, right?

They’re nearby each other. New York and Tokyo, they’re, they’re pretty far from each other. I think that’s like almost opposite, exact opposite sides of the planet. So, you know, for like, if you were to take the, the, the, like SQL optimization and query tuning, you know, those are pretty close by, those are close by concepts, right?

They’re very, they’re very near. They are near neighbors to each other. But SQL optimization and like best pizza are pretty far.

Even though the best SQL optimization and the best pizza are in New York, right? That’s, that’s, that’s me. I’m not the best pizza, but.

Ah, whatever. Anyway. Ah, that was off. That was a little off the cuff. Oh, I apologize. I apologize. Ah, but those things are far apart, right? Like, you know, like if you, if you, if you’re trying to find like, you know, like query tuners in New York and you get a list of pizza places, well, you’re probably not going to be too happy with that search.

Anyway, it’s a little bit about embeddings. Um, hope you enjoyed yourselves. I hope you learned something.

And, uh, I will see you over in tomorrow’s video where, uh, we will talk about more stuff from the course. And again, this course is all currently, uh, on sale. You get a hundred bucks off with that coupon code.

This link will be down in the video description. Uh, but if you don’t feel like waiting from clicking through on things and you feel like there might be some malicious, uh, UTM codes or something in here, you can just go to training.erikdarling.com. So, grab that Get I Ready course and just use the coupon code AIREADY to get that hundred bucks off.

Anyway, thank you for watching. I’ll see you over in tomorrow’s video where we will talk about some other vector-y things. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: A Little About Vectors and Distances

Get AI-Ready With Erik: A Little About Vectors and Distances


Video Summary

In this video, I dive into the exciting new vector data type and vector distance function introduced in Microsoft SQL Server 2025 and Azure SQL Database. Starting off with a brief introduction to the vector data type, I explain its syntax and default behavior, including how it handles dimensions and the importance of specifying the correct number of dimensions when working with vectors. I also highlight that while this is just a taste of what’s covered in my comprehensive course, purchasing it remains essential for those looking to fully leverage these new features. The video then delves into practical examples using SQL Server Management Studio 2025 and demonstrates how vector distance functions like cosine, Euclidean, and dot work, emphasizing their importance in similarity searches and text analysis tasks.

Full Transcript

Erik Darling here. Darling Data. And we’re going to spend some time, since I just released a course about all the new AI hubbub in Microsoft SQL Server 2025, Azure SQL Database and Managed Instance, I figured we should do some videos to kind of give you a little idea of what the content looks like in order to hopefully spurn you into purchasing it. There’s a coupon code up yonder that will get you $100 off the course. I’ll put this link in the video description for as many as I remember. But the coupon code is just AI ready. So if you go over to my training site at training.erikdarling.com, you can put in that coupon code without having this whole full link in front of you in order to get that course content. Anyway, let’s see. Let’s start off with just a little tiny itty bitty introduction to the vector data type and a little bit about the vector distance function. There’s of course a lot more material in the full course, but if I did the whole thing for free here, there’d be no point in purchasing it, wouldn’t there? Unless you just wanted to say thanks, which the very few of you want to do. Anyway, the new vector data type, by default, well, I mean, it looks like this. It’s like Victor with an E. You say, I want this vector data type. You tell SQL Server how many dimensions will be in that data type. And of course, dimensions are these things over here. Each dimension is a number. They’re all separated by commas, right? So 1.0, 2.0, 3.0, and they have to be within these little square brackets. By default, they will all be float 32. We’ll talk more about what that means. There is of course a float 16 that is in preview. Float 32 is the general availability data type that is fully supported currently.

And so that’s generally what we get. You generally won’t have three dimensions in a vector. It’s just for a little bit of simplicity here. But this is a little bit of, oh, this is kind of what it looks like. Let me use my fabulous new SQL Server Management Studio 2025 content zoom feature. And this is what you get back when you look at a vector in SSMS. It’s actually kind of a funny XML clickable column. And notice that these numbers look a little bit different from the numbers that we put in. These get converted to big crazy floats. But this is what the vector data type looks like.

Sort of. Well, I mean, it’s not what it looks like in storage. It’s what it looks like presented to you as a person. Sort of like on Star Trek when aliens are like so big and weird and scary that they have to wear devices that present themselves as like human looking to the ship members and the crew and stuff. It’s sort of like that. If you saw what vectors look like, if you saw their internal representation, your mind would collapse on itself.

All right. It’s like hearing the voice of God or something. One thing to be aware of, and this is something that’s going to scare some people when they maybe start working with these, is that vector data types are rather inflexible in many ways. You have to know exactly how many dimensions. And again, a dimension is each number inside here. You are going to have for your vector data type and you must use precisely that many.

For example, if we do this and we declare a vector with one dimension, we cannot assign three dimensions to it and just have it silently truncate. We’ll get an error. Nor can we sort of under pack a dimension and say like, you know, a lot of people will be like, I’m going to just use Envarcar max for this state field and stuff like, you know, M A N Y R I C A T X and stuff in it.

You can’t do that with vectors, right? So if we say I want a vector one oh two four and we try to just put three in it, SQL service like no can do. One oh two, four and three do not match. Sorry, pal. The demand, the number of dimensions that you use for your vector data, vector data types are going to be determined by whatever embedding model you choose to use to generate your embeddings.

That’s a big, scary, crazy sentence. But don’t worry. We’ll talk more about that is, you know, sort of another videos this month. But also I talk way more about that in the course material.

So if you are just so eager to know more about that, you can you can get way ahead and buy the course now. But different models. Right. So like, you know, like every time, you know, like Anthropic or ChatGPT or, you know, whatever Google is doing, say this is our newest, most powerful model yet. That like, you know, like, you know, they’re talking about like things like that.

Right. It’s like the model that you’re using that would generate the numbers that tell that, like assign meaning to the text data that you have. Right. Those models generate stuff and like in those numbers, those models generate a certain number of dimensions. There are some newer ones that are dynamic, but like that’s not going to help us here.

But like most like most like you choose, like the model that you choose. Right. Like is going to be dependent on a lot of stuff. You might not even be the one choosing it.

You just have to know how many dimensions that model creates when it generates embeddings. But once you pick a model, right, or once a model is chosen for you, all your vectors need to match that dimension. So this is called length and you cannot compare different ones.

So the like probably one of the more common ways that you’re going to be using vector data types in SQL Server, at least today, because the vector search function, which will again, we’ll get to later, is still in preview. It is not a generally available feature. So the most common way that most people in production are going to be doing things is by using the vector distance function.

That’s this thing right here where this is how this is how SQL Server will tell you how similar things are usually using the cosine calculation here. But if we have, let’s say up here, we have a vector three and a vector four, we cannot compare those. Right. Like SQL Server just says vector dimensions three and four do not match.

Right. So we can’t compare a vector with three dimensions to a vector with four dimensions or any other differing number of inventions or dimensions. Doesn’t matter. But the vector distance function has three metrics available.

We have cosine, Euclidean and dot. Cosine is what you’re going to be using most of the time because that’s a similarity search. Euclidean is almost like GPS coordinates.

It’s it’s a totally different thing. And dot is dot is just weird. I don’t I don’t even like talking about dot because it’s just that bizarre. But if we have these two vectors up here and we’re just going to again keeping things simple.

With just three dimensions and then we have one dot O two dot O three dot O and one dot O two dot O three dot five. What we what we get when we run these things are slight are just different ways to measure similarity. Right. So.

It’s different, but for similarity search. Right. For like searching text for like, you know, things that are considered close to each other. Cosine is going to be what you use mostly.

So let’s run this and compare. Let’s see how similar these two vectors are and the cosine distance for this one. Like generally lower number is better.

Right. So the lower a number is the quote, like the better off it is for the cosine distance. This is measuring like the similarity of like like all of the dimensions that we have up there. The Euclidean distance is 0.5.

And the only reason this is 0.5 is because this is 0.5 higher. So when I said this is like GPS coordinates or something, it really is. And this isn’t going to be really good for like text similarity search stuff.

And this is not what you want to use here. So just because that that dimension is like 0.5 somethings longer than the other dimension, it is 0. The difference is 0.5.

The dot product distance. This is negative 15.5. But the calculation for this is just like like 1 times 1 plus 2 times 2 plus 3 dot 3 times 3.5. Right. So we get 15 and a half back.

So that’s that’s really all that is. Anyway, when the like I said, the closer the the numbers that you get back are or rather the lower the numbers are for cosine is going to be the closer to 0. The the the the more similar the vectors are or the more similar the dimensions are.

So if we have these two strings here that match entirely. Right. So that’s now it’s 1.02 dot 3 dot 5. And we look what we get back here.

Now cosine is exactly 0. Euclidean is exactly 0. And the the dot product distance is negative 17.25. But again, this is just like, you know, like one.

One times one plus two times two plus 3.5 times 3.5. So like this number is just getting lower for whatever reason. I mean, we made the numbers bigger.

So the negative number got bigger, which is a weird thing to think about. But like I said, the lower numbers are the more similar they are. I use cosine throughout like the entire course because I use a Stack Overflow database.

And the sort of goal of using that and which we’ll talk about more in a later video is that like it’s probably more most closely matches what a lot of you will end up doing with it. Where, you know, we’re trying to find we’re using question titles to find similar questions. We’re using, you know, question titles and comparing them to answer bodies to find like maybe like, you know, good answers for a similar question title, stuff like that.

So like the cosine is really what you want to be using. But look what happens here when we change the lower string to use all negative numbers. Right. So negative one, negative two, negative three dot five.

Now, all of a sudden we get back very different numbers. Now, the cosine distance here is almost two, which is like the far end of this. No, it’s like for a cosine distance.

The numbers that you’ll get back are between zero and two. So like there’s not a lot of forgiveness in there. So like the closer you are to zero for cosine distance, the better.

Euclidean distance is now 7.8. And really, it’s just because we’re measuring like this thing is going positive and the other thing is going negative. So they’re just like, like again, GPS coordinates.

These are two different points on a map like these just got further apart on like whatever grid map weird tesseract these vectors exist in. And the dot product distance is now a positive 15.5 telling us that these are very, very different dots. Like I said, we don’t talk about dot product anyway.

Thank you for watching. I hope you enjoyed yourselves and I’ll see you in tomorrow’s video where we will talk about some, some more fun vectory stuff in SQL Server. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 47

SQL Server Performance Office Hours Episode 47


I have a query that is timing out after 30 seconds and is mostly waiting on pagelatch_sh. It’s a clustered index seek for a small number of rows on a narrow table. The query doesn’t show up as part of a blocking chain. Any suggestions for next step in finding the problem? I don’t think it is last page insert contention because there are few inserts to this table
Did you ever use performance monitor for troubleshooing and, if so, what metrics did you use ?
I never want to use DBCC SHRINKFILE on a data file again. Do I lose anything by totally abandoning it?
Could you please send me the link of the SQL books
On a client call recently, you were pretty specific about avoiding usage of the OBJECT_ID() function. I noticed you used it in a demo recently, fine enough (it’s just a demo). Any other such functions we should avoid using in production?

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five fascinating questions from my YouTube community, covering topics like performance troubleshooting and SQL Server best practices. I share insights on how to approach a query timing out due to page latch SH waits, discuss the limitations of using Performance Monitor for SQL Server diagnostics, and explain why avoiding certain functions like OBJECT_ID in production code can prevent concurrency issues. Additionally, I provide an update on my consulting and training offerings for 2026, including new pricing options and subscription models that cater to a broader range of needs. Whether you’re looking to improve your SQL skills or just want to support the channel, there’s something here for everyone. Stay tuned as we kick off January with more exciting content focused on AI in SQL Server!

Full Transcript

Erik Darling here with Darling Data. Back from my big five days off of pumping out YouTube content. And somewhere between rusted and rested. Somewhere in there. But ready to go. Ready to get back into the swing of things. And of course, by the time you see this, it’ll be Monday, which means it is time for office hours, which means it is time for me to answer five. Terribly interesting questions from you, my beloved watchers out there. So we’re gonna get into that. Before we do, down in the video description, there are all sorts of helpful links. If it is 2026 and you have a brand new budget and you are just looking for ways to spend it, I have rehauled and updated my consulting and training pages. The consulting page has some new offerings on it. Slightly lower priced entry points for people who maybe need some help, but don’t need giant buckets of hours. And I’ve also redone the training stuff. So if you want, you can, you can, now the everything bundle is everything, everything. It includes the normal everything performance bundle, learn T-SQL and performance engineering.

And there are new ways to purchase that as well. There are yearly and monthly subscriptions if you are more keen on that sort of arrangement for purchasing training. So all that is overhauled and brand new for 2026. If you would like to, if the content that you see here just moves you in such a way that you would like to give me four bucks a month, buy me half a New York coffee, you can still become a channel member. That has not been, that has not been overhauled yet. Who knows what 2026 will bring. And of course, that is also where you go to ask me office hours questions.

So if you have a burning SQL Server question that you want to see answered live here, well, semi live here, I’m alive, I promise. Here on YouTube, you can do that. And of course, if you, if you think that I would be a valuable presence in someone else’s life, please do like subscribe and tell a friend so that I continue to reach the people I am intended to reach. I will be leaving the house. Well, a couple months from now, apparently data to Nashville, March 6th and 7th tickets are flying out the door for that. So hurry up and get your seats while you can. I have not been informed of the winged status of the data data Saturday Chicago pre con, but I assume that it is that is also quite winged. That will be March 13th and 14th and newly added to this list.

And why I had to remove the little flying database image over here because it was getting in the way of things is a SQL day Poland in Rocklaw, which I hope I am saying sort of right. And that will be May 11th to 13th in Poland. So you’ve got that going for us.

And welcome to, of course, the first video of 2026. We are all hopefully happy and not hungover. I count six databases and 12 arms and 12 legs.

So we are in good shape for 2026. Everyone’s got all their appendages, eyeballs, other other necessities, apparently. Plenty of well, they’ve all got drinks, too.

Looks like this is a good party. So is there, it looks like there’s a little pickle person in this one or something. I’m not sure what that is.

I had not noticed that before. Hmm. I wonder, I wonder what’s going on there. Anyway. Can’t, can’t win them all. Let’s answer some questions here.

Do, do, do, do, do. Ba, ba, ba, ba. Let’s see. The first one. What do we have here? This is all the big one, huh? I have a query that is timing out after 30 seconds and is mostly waiting on page latch SH.

Okay. Let me stop you right there. What does mostly mean? Does mostly mean 29 seconds?

Does mostly mean 29 milliseconds? What is mostly? It’s a clustered index seat for a small number of rows on a narrow table.

The query doesn’t show up as part of a blocking chain. Well, I don’t know. I don’t know how you’re judging that, but okay. Any suggestions for next step in finding the problem?

I don’t think it’s… Why would it be… Is it a select query or an insert query? Why can’t… Oh, I beg you, please give me the information that I need.

Please. I beg. So, look, the next step in this is always one of two things. You either get the actual execution plan, or you monitor the server for this query, and you see what is happening in near or around when this query is executing.

Those are the only two next steps. Or rather, those are the two most logical next steps. Because unless you are able to figure out what’s going on here, either via executing the query and seeing where things are actually getting stuck in the actual execution plan, or if…

I mean, because that… Which could totally not do anything for you. You could totally not see what’s interesting there. You could hit F5, it could finish instantly, and you’re done. Then you need to monitor things. You could use extended events or profiler or something else to watch live action on that table and figure out what is happening when that query is timing out.

I find it highly unlikely that page latch sh is your problem there. So that’s what you can do. I cannot tell you why with the set of information you’ve given me.

Next up. Did you ever use performance monitor for troubleshooting? And if so, what metrics did you use?

So no, not really. I tried a lot because, you know, like when I first got into like SQL performance stuff, there were a lot of articles and, you know, things about using Perfmon and exporting it to Excel and blah, blah, blah.

I just… I could never… I could never like wrap my… I could never get it to like give me the answers that I needed. So I couldn’t really tell you which metrics to use.

I can tell you that there is a nice fellow in the community who hasn’t blogged in quite a while, but used to have a really neat sort of… Not a traditional like query and index tuning thing, but like performance monitoring and like, you know, a lot of like hardware stuff.

The blog was SQL Sasquatch. And it’s a guy named Lonnie N… A bunch of underscores.

And he talks about Perfmon quite a bit in his blog posts and would list out the counters that he would use for various investigations. Yeah, like I’ve used the like sys Perfmon counters view in SQL Server to like look at some stuff, but it’s…

I’ve never like… I’ve never found a ton of answers there. Like every once in a while there will be something interesting in there, but mostly it’s just like… Yeah, look, you’re doing…

Actively doing things. Good for you. Like… So no, it’s never really been my thing. And then, you know, like with Perfmon too, you know, a lot of SQL Server monitoring tools sort of came around and gave you like a better picture of what was happening on SQL Server than you would get from those insane charts and graphs that Performance Monitor would spit out with all the different sort of like value resolutions and lines and stuff.

Like one thing that always tripped me up with that was like different lines had like different… Like sort of like…

I forget what the exact word is, but like resolutions to them. So like one of them would be like 1.000 something and it would always be up at the top and the other ones would be like 0.0001 something and always be like down at the bottom.

And you’re like, how do I make… How do I make sense of what’s happening here? There’s all sorts of like complicated stuff that just always made it a little bit too hard to use for me. Okay.

Okay. I never want to use dbcc shrink file on a data file again. Okay. Do I lose anything by totally abandoning it? On a client call recently, you were pretty specific about avoiding usage of the object ID function.

I noticed you use it in a demo recently. Fine enough. It’s just a demo. Any other such functions we should avoid using in production? Well, so like when I tell people to not use those functions, the reason why is because they do…

A lot of them… I don’t have the full list of ones, but a lot of them do not obey the same locking semantics that you might find by just accessing various system views to get that…

To resolve certain object names and stuff instead. Aaron Bertrand has a post in his Breaking Bad Habits series over on sqlblog.org, where he talks about, you know, like queries that use these, that get blocked where hitting sys.object, sys.scheme, sys.table, stuff like that, don’t get blocked.

So the main reason why I tell people not to use that in sort of like production code is because if the… Like you might… Your queries that run those might get blocked in places that queries that just use the underlying system tables and views wouldn’t get blocked.

So they can contribute to sort of like concurrency issues, blocking chains. I don’t know about deadlocks necessarily, but they could certainly get blocked in places where you would not find blocking from hitting the underlying sys views that would allow you to resolve those things in different ways.

You know, of course, the object ID functions are… You know, they’re quite simple for… You know, the type of procedures that I typically write for troubleshooting stuff.

But even I’ve tried to get away from using those in a lot of places there. And I just, you know, try to use the underlying system views instead so that I don’t get blocked up trying to resolve various object names to see where they are.

I’m not perfect at it, of course. You know, there’s lots of old code that I would have to go back and revisit and rewrite queries to sort of change the way that they are figuring out if various objects exist or not.

But, you know, one thing at a time. There’s only so many hours in the day and usually that stuff gets fixed as I hit a problem with something.

And if I’m not hitting a problem with something like pretty regularly, then I’m probably not going to jump on changing the code because, you know, it’s a lot of effort for not much.

Anyway, could you please send me the link of the SQL books? So I don’t know exactly what you mean here.

There is, of course, erikdarling.com slash books, which has a list of SQL books that I recommend. I think you might mean that one.

And then I guess that has to be it. So sure, I’ll put that link down in the video description with the questions and stuff. It’ll be in there.

So anyway, that was an interesting start to office hours of 2026. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video. We’re going to spend January… I’m going to be doing little snippets of my new course, Get AI Ready with Erik, all about, you know, handling vector stuff with SQL Server 2025 and Azure and all where Microsoft has sprinkled AI into the database.

And by sprinkling in AI, I mostly mean there’s a vector data type and there’s a whole lot of stuff that you have to think about when you start using it. And hopefully, I’ve thought about most of it for you and I can just tell you what to do.

So that’s going to be what we’re going to focus on for the majority of this month. But I also have some bit obscene podcasts lined up with Joe and Sean and maybe some special guest appearances.

So it’ll be a busy January because we got to stay busy. And I will see you over in tomorrow’s video where we will talk about some vector stuff.

All right. Thank you for watching. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Column Store, Delta Store, and More

Column Store, Delta Store, and More


Video Summary

In this video, I delve into some fascinating aspects of columnstore indexes and the Delta store in SQL Server, showcasing how data is inserted and managed within these structures. I walk through an interesting demo where I create a table, insert rows multiple times to observe changes in the Delta store and columnstore index, and discuss the tuple mover—a background process that compresses row groups but can sometimes disrupt my carefully planned demonstrations. Despite this little hiccup, the video offers valuable insights into how SQL Server handles data insertion and compression, providing a deeper understanding of these crucial components for database performance tuning.

Full Transcript

Erik Darling here with Darling Data, celebrating another beautiful Friday in the life of a SQL Server consultant. Lord knows why. In this video, we’re going to talk about some stuff with the columnstore, the Delta store, and a little bit more. This video doesn’t have one sort of overarching point to it. It’s sort of a few different spectacles, all in one lovely demo. So, we’re going to initiate ourselves, we’re going to avail ourselves of some interesting stuff. But before we do, I need to harass you about some other things, some financial priorities. If you need help with SQL Server, I am, like I mentioned, a SQL Server consultant. I specialize in SQL Server performance tuning, and you can hire me to do that for you. I will make your SQL Server faster in exchange for money. There’s a link down in the video description that offers you one of many ways to hire me to do that. There are also links down in the video description where you can buy my training, which is always reasonably priced. You can also choose to support this channel with money, as low as $4 a month, you know, pre-tax in my pocket, in order to keep me doing this sort of thing.

You can also do some other stuff, like ask me office hours questions. I do enjoy an office hours question, or five, every Monday. And of course, if you enjoy this content, you should probably like, subscribe, and tell the giant, massive friends who hang eagerly on your every word about this channel. So, I will be out and about in the world, oot and a boot for the Canadians. Data to Nashville, March 6th and 7th. And data Saturday, Chicago, March 13th and 14th. I will have advanced T-SQL pre-cons at both of those.

So, please do, please do show up. Because if you don’t, I will cry and drink and it will just be a real meltdown nightmare disaster on stage. And you don’t, you know, I mean, maybe you do want that. Depending on, depending on how much you’ll, you may enjoy that actually. I don’t know. But anyway, let’s go, let’s go talk about this columnstore stuff. So, what I’m going to do is, I’m going to do, well, of course, I’m going to drop a table if it exists. And I’m going to create a table. And then I’m going to load some initial data into a columnstore table.

And then, just for fun, what I’m going to do is, I’m going to reload those same rows into the columnstore four times. Right? So, we’ll say go three here. And every time we run this, so go is fun because it’ll start an execution loop when you put a number after it. And so, every time we insert rows into this table, we’re inserting a larger number of rows. Right? So, it’s like another million and then two million and then four million go into this table.

Now, something kind of interesting happens on this insert. And if we did this insert differently, and I’ll describe what differently means in a moment. But when we do this insert, SQL Server chooses to take data from a columnstore, spool it into a row mode eager spool, and then insert it into the columnstore. Right? So, this is, again, row mode. So, SQL Server unbatches all those rows to insert them into the table.

If we put that data into a temp table or even a temp table with a clustered columnstore index on it, SQL Server would not unbatch it. But because SQL Server row modes all that stuff, we end up with a bunch of sort of Delta store data. Right? We can see we have this clustered index.

So, the reason why this is funny is because if you go to the documentation for the view over here that we’re looking at, Sys.dmdb index physical stats, it’ll say, where is it? Down here at the bottom, maybe. Oh, no. It was up higher. This thing is lying to me.

Limitations. Does not return data for clustered columnstore indexes. Right? And, well, I mean, that’s true. There’s no data about the columnstore index in here, but there is data about the Delta store, which is a clustered index that sort of brain leaches onto the clustered columnstore and shows us uncompressed stuff.

Now, SQL Server does have a background process called the tuple mover. That’s when we ran that that time. There were fewer rows in there. So, SQL Server does have a background process called the tuple mover, which will, in the background, go find open row groups and attempt to, sorry, close row groups and attempt to compress them down.

So, when we look at the table now, we see we have some of these compressed. We have one of these that’s open. We have three of these that are still closed. Now, that’s all a side effect of, again, the row mode eager spool between the table and the insert.

So, now, if we run an alter index and we say compress all row groups, well, this will run for a little bit. And then, if we look back at this, now we’ll have these two tombstones in here. So, SQL Server is, like, sort of, like, compressed a bunch of other stuff, but left these two tombstones around.

Right? So, if we go and look at, like, what’s in here now, we go look at the physical stats for this. Now, we just have this one thing remaining. Maybe the tuple mover kicked in and did something. I don’t know. Maybe the tuple mover is ruining my demo.

Yeah, it is. See? Now, those two tombstone row groups are gone. Normally, I would have to run reorganize again a second time to force the tombstone sort of cleanup process to run. But this time, the tuple mover beat me to it.

I just timed this demo terribly because every other time I had lots of time to talk about this stuff before the tuple mover did anything. So, now this reorganize won’t do anything. But I do think that there is one interesting element to this in that if we look back at the physical stats for the columnstore thing, we still get one row back from here.

And there are a couple columns in that view that do reference columnstore data. Apparently, this is non-clustered columnstore stuff because, like, this just says not valid for the clustered columnstore one. So, I guess non-clustered columnstore, it does show more.

But what this is showing us now is an empty delta store. All right? Because we have no pages and we have no records in there. All right?

So, this is the empty delta store. SQL Server doesn’t remove the delta store entirely. It just leaves this empty thing hanging about just in case we need to put more data in there. If I ran, like, a bunch of inserts again to this table, then I would end up with probably rows in the delta store as long as SQL Server continued to choose that eager row mode eager table spool in between the insert and the select. So, as long as we still have this thing unbatching rows, then we end up with a delta store.

Even though we are inserting a number of rows that qualify for compression. So, that’s fun. Anyway, I found this all terribly interesting and also terribly annoying.

And despite the tuple mover kind of ruining one of the highlights here on needing to do the double reorg, I’m still okay with it because, you know, at least we got to see the tuple mover work in action, right? We got to see that it actually does something.

It doesn’t just sit there idly waiting for us to do things. It wakes up about every five minutes or so and says, Oh, look at all these row groups. They’re closed.

I got to compress them. And it does that. You can turn that off, but we’re not going to talk about that. Anyway, it’s Friday. I’m tired. I don’t want to do this anymore.

So, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you back Monday with our usual office hours thing that we do on Monday. Because that’s what we do on Monday.

All right. Have a great weekend.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Merry Christmas from Santa and Definitely Not Erik

Merry Christmas from Santa and Definitely Not Erik



Would you believe Erik Santa forgot to turn off this coupon?

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

More LOBs, More Locks (in SQL Server)

More LOBs, More Locks (in SQL Server)


Video Summary

In this video, I delve into the intriguing world of large object (LOB) data and its impact on locking in SQL Server. Erik Darling from Darling Data shares his insights by demonstrating how incorporating LOB columns like text or ntext can significantly increase locking problems, particularly through examples involving exclusive and shared locks. By exploring a specific update query that targets a single row with a LOB column, I illustrate the complexities of lock management and highlight the importance of proper indexing to mitigate these issues. This discussion is not just theoretical; it’s practical advice for anyone dealing with large data types in their SQL Server databases.

Full Transcript

Erik Darling here with Darling Data, feeling the spirit of the season upon me. And what I want to give you is the gift of lob data. And tell you a little bit about how it can increase locking problems in SQL Server. Shocking, I know. Down in the video description, you will find all sorts of helpful resources to give me money. If you want to hire me as a consultant to make your SQL Server faster, you can do that. If you want to purchase my training material, let’s say you want to learn all the things that I know, and you want to go forth into the world and tune SQL Server on your own, you can do that. For a very reasonable price. You can also support this YouTube channel, Ask Me Office Hours Questions, and click on the requisite buttons to like and subscribe and share with friends all of this wonderful, magnificent, free content. content about SQL Server. Isn’t that just grand? The spirit of the seasons. The spirit of the season is upon us, my friends. Once the spirit of the season leaves us, and the new year begins, and the earth, at least in this hemisphere, begins to thaw, I will be leaving the house once again, like the spring flower that I am. I will be at Datatune Nashville, March 6th and 7th.

And I will be at Data Saturday Chicago. So much data. So many datas. So little time. March 13th and 14th. I will have advanced T-SQL pre-cons at both of those. And I would suggest that you come to those if you want to learn some crazy T-SQL stuff. Because that’s what I do. That’s what I do. That’s what I’m into. That’s my fetish. Crazy stuff. Anyway, let’s talk about lobs and locking. Now, over here, we have an update query. And this update query will seek to update a single row in the post table, where the ID, which is the clustered primary key of the post table, equals 1169.

All right. And we’re going to use my table valued function, what’s up locks, which is available at my GitHub repo. It’s at code.erikdarling.com. I actually had to do a big rewrite of this before this because nothing was wrong with it normally. But for this demo, performance was not good. But we can save that for another day. So let’s go update that single row. And let’s look at how involving lob data in a query makes for some interesting locks.

Now, if we look down in the results of what’s up locks, we will see that request session ID 57, that is this SPID, right? That is us here. It has some intent exclusive locks. But more importantly, it has this exclusive lock at the key level. And it has just one of them, right? So there’s just one total row lock here, right?

So what we’re going to do now is come over to this window and look at when some different queries get blocked up here. So if I select the top 100 rows from the post table, where ID is greater than 900, this will actually start returning rows. All right. But this will get stuck at a certain point. Like if we scroll way down in the results, we have returned up to row, we have returned 90 rows and we have stopped at 1149.

But now we are blocked. And if we come look at the locks that my query is attempting to take, we will see that this one shared lock is being blocked, right? This is the one that’s waiting, right? So we do not, we are not able to get this lock. So this query was able to read up to a certain point until it hit this one lock and now it is stuck, right? Because we have not committed that other transaction.

So this one gets stuck there, right? Let’s say that now we want to order by score, right? So now we are ordering by the score column. Things will get blocked in a far different way, right? This one does not even start returning results. And part of the reason why this one does not start returning results and part of the reason why I had to tune WhatsApp locks pretty heavily this morning and still add a bunch of query hints to it so that the optimizer stopped doing, gosh, dumb things all over the place.

Sorts, stream aggregates, merge joins. It was a bad scene, man. This thing was taking like 40 seconds to run. I got it down to about five seconds, which is like, like glory, hallelujah. It’s a Christmas miracle. How dare you? But look what happens now. This is the important part, right?

It’s like we have this, this thing down here that’s waiting, right? But look at all of the lob locks that have been taken, right? All these shared locks. We now have 417,000 Hobbit lock counts, heap or B tree locks.

And that, that results in, well, I mean, 4,000, 4, 417,000. Wait, is that 417,721? Oh, oh my. That’s a seven digit number. 4.1 million page locks. And this is because SQL Server is now has to follow all of the sort of the lob pages for the body column. All right. And we can, we can prove that’s true because if we, if we change the query a bit, right?

If we cancel this thing, cause that’s going nowhere and we say, select the top 100 and we, we leave the body column up. We’re just going to select the store, the score column. Now, if we run this, this gets blocked like in the, in the same way, but we don’t take that for those 4.1 million locks in here, right?

There’s no 4.1 million number in that one. So, so just selecting that lob data will contribute to additional like locking stuff in shared locks needing to happen. Now that would also happen if we just had body in the where clause. If we said where body is like, you know, just a single character, just find me any body, any body will do.

This thing will go back to taking like five seconds again. And we end up, well, this time we end up with, let’s see, 1720804. So still a seven digit number of 1.7 million heap or B tree lock counts. So just involving lob columns anywhere in your queries can lead to increase locking requirements and increased locks and other unpleasantness, probably blocking and some other stuff.

We could of course, uh, ameliorate. I think that’s, I think that’s the appropriate word for what would happen here. We could solve the problem. I, I ought to just speak plainly sometimes. Uh, if we had an index, uh, like this, that we are, we could return the data.

We would have far fewer problems going through things. Um, so like we would, like we would have the, like be able to sort the data by score and then find whatever IDs we care about. Everything would be fine. We wouldn’t even necessarily need to have ID as a second key column here because it is the clustered primary key.

And it would end up there anyway, if we just had a single key column on score. But then someone, someone might come along and fiddle with our index and add other columns in here and screw us all up. So sometimes it’s best to be explicit about, uh, the, the order of columns when we have even a clustered primary key involved.

And we want that to be where it is for a reason. Then we could maybe add other columns over here. Like, I don’t know, accepted answer ID without messing up the order of the initial keys that we, we want in there.

Anyway, strings were a mistake. Big strings were an even bigger mistake. And as much as possible, you should avoid max data types in the database.

There are a bevy and variety of reasons for, uh, for avoiding them. Uh, and I’ve talked about many of them, but this is, this is yet another one. Anyway, thank you for watching.

I hope you enjoyed yourselves. I hope you learned something and I will see you maybe tomorrow’s video. I don’t know. I might, I might, I might take some time off for Christmas, New Year’s and stuff. We’ll see how it goes.

No promises. But if, if there, if there is a video tomorrow, I will see you in it. All right. So, thank you for watching. Thank you for watching. We’ll see you next time.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Query Store Hint and Plan Guide Annoyances in SQL Server

Query Store Hint and Plan Guide Annoyances in SQL Server


Video Summary

In this video, I delve into some of the frustrations and annoyances associated with query store hints and plan guides in SQL Server. I explore how these tools can sometimes hinder rather than help, particularly when trying to override certain behaviors or improve performance. For instance, I demonstrate the quirks of using a `NO_WAIT` hint in a transactional context and highlight why Query Store’s inability to support table hints is such a significant limitation. Additionally, I discuss the cumbersome nature of plan guides, especially their requirement for maintaining semantic affecting hints that might be detrimental to query performance. These issues underscore the need for more robust and flexible tools within SQL Server to better meet the diverse needs of database administrators and developers.

Full Transcript

Erik Darling here with Darling Data. And today we’re going to talk about some of my annoyances with query store hints and plan guides. If you have a problem with that, you can leave. Before you leave, before I talk about that, there are some links down in the video description. If you want to hire me as a consultant to make your SQL servers faster, in exchange for money, you can do that thing. You can also buy my training if you want to try to do that on your own, learning from me, because I’m good at it. You can also support this YouTube channel because, I don’t know, I mean, I guess the camera’s already paid off, but, you know, these ring lights, electricity ain’t getting any cheaper, right? Something like that. If you want to ask me office hours questions, I do that every Monday. I answer five of them, as long as there are five to answer. If there are five to answer. If there are not, I sit in a corner and cry and wonder why no one likes me. But, anyway, I’ll be leaving the house starting in March of 2026, specifically going to Nashville and Chicago, one weekend after the other. Data Tune Nashville and Data Saturday, Chicago. Good times ahead. I will be teaching advanced T-SQL pre-cons at both of those.

So, buy those tickets now, because they don’t last forever, kids. Didn’t always have it this good, did we? Couldn’t just always leave the house and go get some SQL Server training outside. I don’t know, right? Anyway, let’s talk about why these things are annoying. So, that’s not what we needed. We needed Management Studio, not Hyper-V. Hyper-V is an unserious technology for consultants like me. Anyway, let’s truncate this table called bad idea, and let’s stick five rows in the bad idea. And so, what I want to show you here is, I mean, it’s just kind of a funny hint, right? This no wait hint. Under normal circumstances, no wait won’t do anything, right?

But, like, if we, like, say, start a transaction, and we try to insert a row into this table, we don’t close the transaction. Then no wait does kind of a funny thing. It actually just does, like, what a lock, like, the lock timeout thing does just immediately, right? I mean, it still gives us, like, the first five rows, but then on the sixth row, it bails, right? It’s like, ah, there was a lock. I’m gone, right?

So, it doesn’t return results. We could select, sorry, we can’t select ID6, because ID6 is the row that we have locked for the insert, but we could select ID5. Why we have this hint, well, I mean, I don’t really know or care all that much, but we’re here to talk about plan guide annoyances. Now, let’s say that we wanted to get rid of this hint. Let’s say that this hint was ruining our lives.

Let’s say that this hint were completely destroying everything that we believed in as far as SQL Server goes. We’re like, why does this query just, like, just die the second it encounters a lock? Is that, should we be doing that? Is that correct?

Like, maybe we should use lock timeout, make it wait a little bit longer. But, so, Query Store has the ability to add hints to queries via the option mechanism. I don’t know why ZoomIt is being difficult, having some trouble starting up. Apparently, it’s cold.

But, if we were to try to add a table hint, the way that we can add a table hint with plan guides, well, Query Store gives us a rather infuriating error. Query Store says, no, you cannot set query hints, or rather, setting query hints table hint in Query Store is not supported.

Query Store says, no, you cannot set query hints. Well, Microsoft didn’t think very hard about this one, did they? I mean, they thought hard enough to make an error message, but what a giant gap in capability.

Query Store says, no, you cannot set query hints. Someday, competent people who care will work on SQL Server again, I’m sure of it. We don’t know when, don’t know how, but someday it’ll happen for us.

Now, with plan guides, you can supply table hints, but plan guides also have a rather annoying thing with them. Where, if we tried to create a plan guide like this, and the cool thing about like table hints is you can use them to ignore a lot of stuff. Like, if there was like an index hint on this thing, and you were like, wow, that index hint is stupid, you could use a table hint like this, and just point the table hint to the alias for the table, and that index hint would effectively be ignored for the query.

But we can’t do that with like walking hints, right? If we try to run this, SQL Server will say, well, let’s put this in a way that is a little bit more easy to digest when I zoom in. Let’s see.

Cannot execute query. Semantic affecting hint no weight appears in the with clause of the object bi, but not in the corresponding table hint clause. Change the option table hints clause so the semantic affecting hints match the with clause.

Now, I might very well be using a plan guide specifically because I want to override that semantic. Maybe I hate that semantic. Maybe that semantic is hurting things.

Maybe I want to get rid of it. I wish that this thing had like the PowerShell like dash force parameter for it. Because it’s like, no, I specifically want this to not happen.

I’m trying to make this not happen. Why are you making me do this? So that’s my annoyance there.

With query store, it’s just like, no, no table hints whatsoever. With plan guides, it’s like, no, but you need to keep this thing because we say so. And you’re like, no, I want to get rid of this thing.

This thing is actively hurting me. Now, like, you could do that with a query like this. And you could, like, if there was an index hint, you could say, hey, like, let’s ignore that index hint.

That is completely successful. And, of course, we could get this to work with the table hint down here. But it would completely undo what we’re trying to do, which is not have the no wait hint on here.

And this will happen with, like, lots of dumb semantic affecting hints because SQL Server is like, oh, no, the semantics. But you’re like, I want to get rid of the semantics. But you’re making me keep these semantics that I don’t want.

So it would be really nice if with, you know, query store, we had the ability to supply table hints because what kind of half-baked situation is it that we don’t have that? And with plan guides, it would be nice if you could say, no, plan guide, I really do want to override that semantic affecting hint because I don’t like it. I think it’s bad and I don’t want it there.

So that’s that. Neither one of these things exactly, neither one of these plan forcing features exactly covered itself in glory here. Query store, of course, with the completely half-baked query hints and plan guides with not allowing me to get rid of the semantic affecting hint that I intended to get rid of with a plan guide.

So thanks for both of those. Nice Christmas presents. All right.

Happy Merry. Thanks for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video where I think we’re going to talk about how Wob data makes for some weird locking. And we’ll have a lot of fun with that, won’t we?

All right. Thank you for watching. Thank you. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 46

SQL Server Performance Office Hours Episode 46


  • The missing index DMVs keep recommending the same index with 15+ included columns that would basically duplicate our clustered index width. The equality columns make sense, but I’m suspicious about blindly adding all those includes. How do you evaluate whether a wide covering index is actually better than a narrower index with key lookups, especially for an OLTP workload?
  • I thought table variables were immune to parameter sniffing issues because they’re not parameterized, but I’m seeing wildly different performance for the same stored procedure depending on the data volume in a table variable we populate. The cardinality estimate is always 1 row regardless. Is there a way to get better estimates for table variables, or should I just rewrite everything to use temp tables?
  • We’re getting 1205 deadlocks on SELECT queries that don’t modify data. The deadlock graphs show intra-query parallelism exchanges as the resources. These are reporting queries with MAXDOP 8 hitting a columnstore index. Why would read-only parallel queries deadlock with themselves, and what’s the best way to prevent this without just setting MAXDOP 1?
  • I’m getting sort warnings in execution plans even though the memory grant is 10x larger than what the sort actually needs according to the plan properties. The ‘GrantedMemory’ is 500MB, the sort only uses 50MB, but I still see a sort warning. I’ve ruled out parallelism issues – this happens even with MAXDOP 1. What causes sort warnings when there’s obviously enough memory?
  • I created a filtered index on WHERE StatusId = 2 for a common query pattern, and the index is way smaller than the full table index. But the optimizer still chooses the full table index with a residual predicate instead of using my filtered index. I’ve updated statistics, rebuilt the index, even added query hints – nothing works. What are the gotchas that prevent filtered indexes from being selected, even when they seem perfect for the query?

 

To ask your questions, head over here.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

New Course: Get AI-Ready With Erik!

New Course: Get AI-Ready With Erik!


Video Summary

In this video, I introduce a new course titled “Get AI Ready with Erik,” designed to help you navigate the upcoming AI features in SQL Server 2025 and beyond. Given that Microsoft is positioning SQL Server as AI-ready, it’s crucial for professionals like us to be prepared to work with these new capabilities. The course focuses on practical applications rather than deep theoretical knowledge, teaching you how to leverage vector search effectively without getting bogged down by the complexities of underlying mechanisms. We cover everything from foundational concepts to advanced topics such as hybrid search and multi-vector search, ensuring that you can make informed decisions about when and how to implement these features in your production environment.

Full Transcript

It’s Friday and we’re introducing a new course. It’s called Get AI Ready with Erik because Microsoft says SQL Server 2025 is AI ready and you need to be ready to work with AI in SQL Server. Whether you care about it or not, whether you think it’s good or bad or not, at some point you’re going to have to deal with these new features and I want you to do it in the least stupid way possible. The whole foundation of the course is based on this premise. You don’t need to know how it works. You don’t need to know all the crazy math. You don’t need to know all the weird theory. You don’t need to know all the, you know, gears and whistles and bells and stuff that gives people six fingers on the, on the backside. You just need to know how to work with these features in smart ways. That’s what I want to teach you. The whole thing started because I do SQL Server performance tuning and I saw this whole thing, the whole suite of new things hit SQL Server and I said, gosh, the way things are going, I’m going to be a little bit more. I’m probably going to have to do some performance tuning work on this. And it turned out on my path to learning how to do performance tuning work on this. I had to learn a whole lot about how all the stuff worked inside. So that’s my distillation here, how all this stuff works. Now you can work with it too. Vector search is fairly easy to grasp as a concept. It stores these embeddings, which are a series of floats, which might, I guess, store it as a JSON. And then when you change other text into a similar, embedding another array of numbers, vector search can tell you how close, how similar those strings are without a bunch of nasty wildcard searches and stuff, right? No longer just like saying where something is like, you know, percent word, percent word, percent word, or like, you know, just long strings of or clauses doing this stuff. It allows you to sort of search by meaning and intent, not just like, you know, normal keyword searches. So you can find similar things pretty fast, even on pretty big tables.

It’s not perfect though. Details often get lost. And sometimes you still need to do some keyword searching and sometimes you still need quality filters on things to make sure that, you know, not only is the content that you’re providing in your search is relevant, but also that it’s not just like garbage, right? So, you know, that’s no fun. Microsoft is, you know, Microsoft is, you know, of course gets a late pass on this one. That’s, you know, behind the curve a bit on this. Most other, you know, of their competing databases have had this stuff around for a while, but, you know, moving data to another database, as I’ve learned from seeing so many people talk about how they’ll be in Postgres in three to six months and they never seem to get there. Your data is already in SQL Server, moving it off to either a specialized database or a whole other data platform just to take care of vector search requirements is probably a non-starter for a lot of people.

You’ve got everything already in SQL Server. Adding a vector column or a table with a vector column in it beats migrating for most people in the world. Microsoft was late, but, you know, they actually showed up. You have some generally available features available right now and you have a lot more features still in preview.

Will they ever be real? I don’t know. Will Microsoft ever finish anything again in SQL Server? I have no clue, but I can tell you what’s there and what’s not. So this is what the course covers. It’s in eight parts over around 28 videos. It’s about 10, 12 hours of total content.

We cover foundations, practical applications, hybrid search, vector indexes, native integration, managing embeddings, production patterns for dealing with all this new vector stuff. And then some more advanced topics like rag and multi-vector search. It’s a pretty solid course, I think.

And one thing that I want to say here is that, you know, where a lot of this stuff is in preview and like maybe isn’t out yet in SQL Server 2025 is RTM. Like we don’t even have one cumulative update for it as of this recording. I’m going to be adding in stuff as cumulative updates add, you know, substantive changes and new stuff to SQL Server.

So this isn’t like a course that’s just going to rot on the vine. As Microsoft works on this, the course will get updated alongside it. So there will be new content to cover new things as the features move along.

What I hope you get out of this is understanding when vector search makes sense. Because when your boss comes along and says something like, hey, we want to do this and here’s what we want to do it for, you can speak intelligently about it. You have some background.

You have some knowledge. You have information. You can feel empowered to speak authoritatively on the subject. Right. We want to know what’s production ready versus what’s still in preview today, how to monitor and troubleshoot performance issues and sort of like capacity planning and all the other stuff that goes along with, you know, DBA and developer tasks. And we want to know how we can implement this in production and make working with the current set of limitations as pain free as possible.

So right now, if you go to training.erikdarling.com, there should be a link down in the video description directly to this course as well. You can use the coupon code AIREADY to knock $100 off the cost of the course. That coupon code is available for the first 100 people who sign up.

So if you want to be one of the lucky hundred to save $100, I suggest you get moving because things are already flying off the shelves and, you know, I would hate to see you miss out on this wonderful opportunity. Anyway, thank you for watching. I hope you enjoy the course and I will see you, well, when I see you.

All right. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.