Get AI-Ready With Erik: A Little About Generating Embeddings


Video Summary

In this video, I dive into the world of vector columns and embeddings in SQL Server 2025, specifically focusing on Azure SQL Database and Azure Managed Instance. I explore the challenges and considerations involved when adding a vector column to an existing table, such as the need for indexing and potential performance impacts. To illustrate these concepts, I walk through creating a demo table called `post_embeddings_demo` and demonstrate how to use Ollama—a free and user-friendly tool—to generate embeddings for text data. By the end of this video, you’ll understand the process of backfilling your tables with vector data and appreciate the significant storage requirements involved. Whether you’re looking to enhance your SQL Server skills or simply curious about integrating AI into your database operations, there’s plenty here to learn. Don’t forget to check out my training site for the full course, where you can get a $100 discount using the coupon code AIREADY and Get AI Ready with Erik!

Full Transcript

Erik Darling here with Darling Data, returning once more into the vector void to try to sell this course. To try to help the community learn more about vectors in SQL Server 2025, Azure SQL Database, and Azure Managed Instance. And to sell this course, which you can buy from me now.

At the low, low price of $100 off with the coupon code AIREADY. The course is, of course, of course, of course, called Get AI Ready with Erik. It is available over on my training site.

This is just small snippet teaser material from the aforementioned course that will hopefully spur you into buying it. So you can show this to your boss and say, this man is trustworthy and will teach me all about AI and SQL Server. AI, right?

Anyway, got stuff to do, don't we? Now, I'm going to be completely honest with you. You most likely, in your heart of hearts, in your brain of brains, your guts of guts, will know that adding a vector column to an existing table is probably not a great idea.

Because not only do you have to add this column to the table (which, you know, there are all sorts of metadata-only things that make that not a big deal), but now you have to fill that column up. And if the table that you are attaching this column to is of any importance, you're going to have a couple of issues with it. One, the indexing for this table may need to start taking this column into account in some way.

Of course, you cannot have a vector column in the key of an index, but the includes might come into play at some point. And now your table is about to get a lot bigger. Not only that, but you have to batch-fill your table with this vector data.

If it's a large table, you certainly don't want to try to fill it all in one shot. So most likely what you will want to do is create a sort of lookup table for this data, where it can remain unattached from the rest of the data that you actually care about, but you can still access it fairly easily.

So in our case, what we are going to do, or rather what I have done, is create a sort of side piece table called post_embeddings_demo. There's an actual full post embeddings table in the database that I distribute for this course. But I use one called _demo so I can show you what backfilling vectors, or embeddings, looks like.

So, this table is pretty simple. We've got an ID column. I mostly have this here because in later modules, I guess we'll call them.

They're still YouTube videos, but in the real course, they're modules. So very, very fancy. We do explore vector indexes.

And one requirement for vector indexes is that an integer ID column is the clustered primary key for the table. Otherwise you cannot create a vector index. You get all sorts of errors, and you can't have a bigint.

Can you imagine? It's damn near 2026 and you can't have a bigint. Who designed this? Oh yeah.

God. Anyway, we've got some other stuff going on here. You know, just sort of normal things, as well as a fabulous foreign key that points back to the Posts table, for referential integrity and all that good stuff.

So here is our vector column up here, right? Here is our embedding column, and it's going to be a vector(1024) of float. The reason it's going to be vector(1024) is because of the embedding model that I am currently using for generating embeddings, based on the stuff that we're generating embeddings for, right?
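Pieced together from the description above, the demo table probably looks something like this. The exact column and constraint names are my guesses, not the course's actual DDL:

```sql
CREATE TABLE dbo.post_embeddings_demo
(
    id int IDENTITY(1, 1) NOT NULL,
    post_id int NOT NULL,
    /* 1024 dimensions to match the embedding model used here */
    embedding vector(1024) NULL,
    /* vector indexes require an integer clustered primary key */
    CONSTRAINT pk_post_embeddings_demo
        PRIMARY KEY CLUSTERED (id),
    /* the fabulous foreign key back to the Posts table */
    CONSTRAINT fk_post_embeddings_demo_posts
        FOREIGN KEY (post_id) REFERENCES dbo.Posts (Id)
);
```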

So, um, what we’re going to do is look at what generating embeddings will look like. Right? So we’re going to use this demo table to do that.

And what we’re going to start with is, uh, embedding the titles, right? So we’re going to start taking the text in these titles and turning them into those crazy arrays of floating point numbers, uh, enclosed in square brackets that mean things to computers, right? That mean things to the AIs that allow them to say, is this similar?

Hmm. Hmm. Let me see which direction this arrow points in. So, uh, that’s what we’re going to be doing, right? And to do that, we are going to use, uh, the wonderful and fabulous Ollama, uh, because it is free and it is easy.

And one reason why it is particularly good for this course material is because, A, it’s free and easy, but also, um, if you were to use something else like OpenAI or whatever, like if you’re in like Azure or something and you’re like, I want to use OpenAI. Let’s line Sam’s pockets with more crap. He needs another, he needs four more Lamborghinis.

I don't know. You could use that, but then you would have to pay per API call, right? You don't have to do that with the local thing. So what we're going to do is go look at running a Python script to do the embedding.

Now, there are ways to generate embeddings locally in SQL Server. We'll get to where you can say, hey, I'm going to use an embedding model, and I'm going to call the AI_GENERATE_EMBEDDINGS thing with that model. You could do that here, but it's real slow.

All right. Like, absurdly slow to do that within SQL Server. So instead, I've got a little Python script. I don't know, year 2026, Erik Darling with a Python script.
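For reference, the in-engine route just mentioned looks roughly like this in SQL Server 2025. Treat the syntax as a sketch to check against the current documentation, not the course's actual code:

```sql
-- Register the local Ollama endpoint as an external model (sketch).
CREATE EXTERNAL MODEL ollama_embeddings
WITH
(
    LOCATION = 'http://localhost:11434/api/embed',
    API_FORMAT = 'Ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'mxbai-embed-large'
);

-- Then embeddings can be generated in T-SQL, slowly:
SELECT AI_GENERATE_EMBEDDINGS(N'hello world' USE MODEL ollama_embeddings);
```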

Who would have guessed it? At least it’s not PowerShell. You can’t get me on that one. I never said I hated Python.

It is sort of like coding with crayons, but you know, at least it's not PowerShell. So I've got my VM over here. Did I install ZoomIt on the VM? No, I didn't.

That's okay. It doesn't matter much. This over here is the Ollama server, right? You can see right there, it says ollama serve. So this is where Ollama is serving stuff from. And I think Ollama is sending me some subliminal messages, but I'm just too stupid to understand them.

What does it all mean? I don't know. But this window is not so important, right? This is not the important window.

The important window is kind of over here. So, this is the Ollama version that I'm currently running, and this is the model that I pulled down, mxbai-embed-large. mxbai-embed-large is a pretty good model, right?

It's a stunning model. Great on the runway. And it generates 1024-dimension embeddings. So that is why we have chosen vector(1024) for our vector column in SQL Server.

So, if we run this and we say ollama run mxbai-embed-large "hello world", and we let that hang out for a second.

You'll see that all sorts of stuff happened over in our Ollama server window. I don't know what any of this means. This is, I mean, lunacy, right? We've got, I don't know, BERT. I don't know who BERT is.

We generate 1024 things. And you can see that our context length is 512 tokens right there. And, you know, that's part of the reason why I was griping about the Stack Overflow bodies, right? It's because they're long, and you run out of those tokens kind of quick, so we've got to think about chunking our bodies.

I've had enough body chunks, man. Anyway. When all this stuff runs and does its fancy AI things to hello world, hello world turns into this, right?

Look at all these numbers. I guarantee you, if you pause the video and count as I scroll through, there will be 1024 of them. And these 1024 floating point numbers apparently describe hello world to computers in a way where a computer can understand if something else is similar to hello world.
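That business about which direction the arrow points is, in most vector search setups, cosine similarity: comparing the angle between two embedding vectors. A minimal sketch in Python, using toy three-dimensional vectors instead of 1024 (nothing here is from the course's actual code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same
    direction, 0.0 means unrelated, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real 1024-dimension embeddings.
hello = [0.9, 0.1, 0.0]
hi = [0.8, 0.2, 0.1]       # points roughly the same way as hello
invoice = [0.0, 0.1, 0.9]  # points somewhere else entirely

print(cosine_similarity(hello, hi) > cosine_similarity(hello, invoice))  # True
```

The same comparison is what SQL Server's vector distance functions do over the stored embeddings, just at 1024 dimensions.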

Magic, right? Absolute magic. So, we've got our Python script over here, and our Python script is going to call Ollama, much like we just called Ollama for a single string.

And it is going to fill 1,000 rows in the post_embeddings_demo table with embeddings just like that, right? Just like this, they're all going to end up in there. So let's run this thing and let's see.
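A stripped-down sketch of what such a backfill script might look like. The helper names, the batching scheme, and the endpoint are my assumptions, not the course's actual script; the SQL Server read/write side is left to the caller:

```python
import json
import urllib.request

# Default local endpoint for `ollama serve` (an assumption).
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def to_vector_literal(embedding: list[float]) -> str:
    """Format a float list as the JSON-style array string that a
    vector column accepts, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

def embed(text: str, model: str = "mxbai-embed-large") -> list[float]:
    """Ask the local Ollama server for one embedding."""
    payload = json.dumps({"model": model, "prompt": text}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def backfill(rows, embed_fn, batch_size=100):
    """Embed (id, title) pairs and yield batches of (id, vector_literal)
    tuples for the caller to UPDATE back into the table."""
    batch = []
    for row_id, title in rows:
        batch.append((row_id, to_vector_literal(embed_fn(title))))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

The database side (SELECT the rows missing embeddings, UPDATE each batch back, for example with pyodbc) is omitted, as is error handling.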

We are fetching up to a thousand questions without embeddings. And now Ollama is real busy over here, right?

Yeah. Look at it go, racing along. And we've got, well, you see, even Ollama is not incredibly fast with this, right?

You know, doing it in SQL Server is pretty slow; I wouldn't do it for more than maybe five or ten rows in a batch. With Ollama, we can bump things up a little bit. It moves along a lot faster than SQL Server does.

So you can see, our thing has finished. We have embedded 1,000 rows with no errors, and it took about 40 seconds. Now, this is maybe not the grandest VM in the world, but just kind of keep these speeds in mind when you're thinking about backfilling your data.

Uh, you know, if, if you were an actual purchasing person of the course, you would get the Python script that does this. So I don’t know, maybe that would be useful to you. Then again, with, with AI being what it is, you can probably just tell it to make a Python script to do that for you.

But yeah, mine’s pretty good. I’ve added some stuff to mine. I’ve zhuzhed mine up a bit.

So, now if we go look at our embeddings table, right? Let's see. There are a thousand rows in there, right? A thousand rows of embeddings.

And when we look at what we've got in here, we have stuff from the posts table and stuff from the post embeddings table. This is just a small preview of what ended up in there, right? All those floating point numbers.

So if we want to know how to horizontally center a div in another div, then this is what some of the floating point numbers that describe how to div a div in a div would look like to a computer. And we can see some information about our embeddings via sys.columns, right? We have a vector dimensions column and a vector base type description column, which tell us how many dimensions our column has (1024, which we knew when we created the table) and that the base type is float32.

This might be more useful to you if you were dealing with a system where you were unfamiliar with how things are currently vectorized. Anyway, we also have a couple of new functions in SQL Server that SSMS is pretty damn lazy about. Here I am using SSMS 22.

Granted, it's not the newest 22.1 yet, because I've been busy and I can't upgrade SSMS every two minutes. But this VECTORPROPERTY function is completely unrecognized by SSMS, and we can still get valid results back from it. We still get results. I promise.
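The metadata checks described here might look roughly like this. The sys.columns column names follow the video's description, so verify them against your build:

```sql
-- Column metadata for vector columns.
SELECT c.name,
       c.vector_dimensions,
       c.vector_base_type_desc
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.post_embeddings_demo');

-- The VECTORPROPERTY function; SSMS may not recognize it, but it runs.
SELECT TOP (1)
       VECTORPROPERTY(ped.embedding, 'Dimensions') AS dimensions,
       VECTORPROPERTY(ped.embedding, 'BaseType') AS base_type
FROM dbo.post_embeddings_demo AS ped;
```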

Now, let's look a little bit at what the size of things looks like, and hopefully I highlighted that from the correct point. We have 1,000 rows... rather, we have 1 million rows in our table. Well, sorry, let me back up a little bit here.

I'm looking at the real post embeddings table now, not the demo table with just a thousand rows in it. This is the actual table that I've already gone through and embedded everything in, all million embeddings, right? So this is not just the demo table.

And let me clarify that before I make SQL Server look worse than it is. So in our actual post embeddings table, we have a little over a million rows, and the size of our embeddings is about 4.2 gigs, right? And this is just for the titles and stuff, right?

So this is, you know, pretty big. You have to do a lot of interesting size forecasting if you want to add this to a database, cause it's a lot going on.

Anyway, we'll talk more about that later. But that gives us about 4 megs per 1,000 rows with the float32 vector type. So, you know, we do have to think a little bit about what data is really meaningful to embed, because it takes up a lot of space.
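The back-of-the-napkin math behind those numbers, assuming four bytes per float32 dimension and ignoring per-row and per-page overhead (which is why the real table runs a bit larger, at 4.2 gigs):

```python
DIMENSIONS = 1024
BYTES_PER_FLOAT32 = 4

bytes_per_row = DIMENSIONS * BYTES_PER_FLOAT32               # 4096 bytes
mb_per_1000_rows = bytes_per_row * 1_000 / 1024 ** 2         # ~3.9 MB
gb_per_million_rows = bytes_per_row * 1_000_000 / 1024 ** 3  # ~3.8 GB

print(bytes_per_row, round(mb_per_1000_rows, 1), round(gb_per_million_rows, 1))
```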

Well, I don't know, it's a lot of space. No wonder storage vendors are so excited about vector data types. They're going to sell a lot of diskettes. Anyway, that's probably good for this one.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And again, the entirety of this course is available for sale and purchase at my site. You will find the full link down in the video description.

If you would like to get a hundred bucks off, uh, you can do that. And you can Get AI Ready with Erik. Alright, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 48


Hi Erik, Happy New Year! On my SQL Server production system, the top waits include CMEMTHREAD and NUMA waits, accumulating a few hours of wait time. I couldn’t find much information online, but I believe that this may be related to parallelism or bad MAXDOP settings. What should I check next? My MAXDOP and CTFP are set to reasonable defaults, and I want to determine if these are causing unseen background issues.
Hi Erik, Happy New Year! On my SQL Server production system, I see that the VDI_CLIENT_OTHER wait is at the top of the wait list. Is this connected to slow AGs? Can I ignore it safely?
We increased MAXDOP from 4 to 8 and the query stopped spilling to tempdb and actually ran faster. Isn’t more parallelism supposed to make things worse?
We replaced a CTE with a temp table and the query got faster, so CTEs must be inherently slower. Should we avoid CTEs for performance?
We captured the execution plan when the query was fast and when it was slow — they’re identical. Since the plan didn’t change, this can’t be a query problem, right?

To ask your questions, head over here.

Video Summary

In this video, I dive into some interesting SQL Server questions submitted by the community. Erik Darling from Darling Data shares insights on topics ranging from CMEMTHREAD waits and trace flag 8048 to the performance implications of parallelism and CTEs versus temp tables. The discussion is both technical and practical, offering real-world advice for managing SQL Server environments. Whether you’re dealing with AG-related issues or optimizing query plans, there’s plenty to learn that can help improve your database management strategies. I also explore the nuances of parameter sensitivity and data distribution, providing a deeper understanding of how these factors impact query performance. By sharing this knowledge, my goal is to empower SQL Server administrators and developers to make informed decisions about their workloads and configurations.

Full Transcript

Erik Darling here, Darling Data, home of the most reasonable rates on the internet. It is Monday, unfortunately, and I apologize in advance, but at least on this Monday you get some office hours. You get to hear me answer five community submitted questions. If I ever get a palm tattoo, maybe I’ll get the number five tattooed on my palm. I don’t know. Some Roman numerals. Binary for five or something. Something dorky, right? Five user community submitted questions to me, yours truly, about SQL Server or whatever. I do say it’s SQL Server questions, but people have asked me about things that I never dreamed anyone would care to ask me about. So, that out of the way. Down in the video description, you’ll see all sorts of helpful links. If you haven’t clicked on these links in a while or ever, you might be surprised to see that I have redone a lot of things in there. There are all sorts of updated consulting packages, some of them at lower price points for those of you who need different types of consulting than what a typical bucket of hours might cover. Down in the training section, too, there have been some updates. I have re-bundled the Everything Bundle to actually be all of my SQL Server performance content in one helpful package. And you now have the ability to buy that with a subscription. So, you can choose a yearly subscription or a monthly subscription.

If you want to check out that training, you can. You can also, for as few as four US American dollars a month, throw a little support towards this YouTube channel for all of the incredibly high quality content that I provide here. And you can also find the link to ask me office hours questions, and of course the usual shindig where we like, subscribe, and tell our friends about the wonder and glory of the Darling Data YouTube channel. I will be leaving the house quite a bit. There are going to be, well, between one and three more of these showing up in the near future. I will be presenting my advanced T-SQL training pre-cons at all three of these events: DataTune Nashville, Data Saturday Chicago, and SQLDay Poland, starting in March of 2026.

So, I will be there dressed festively, you know, festooned to the gills and all sorts of things. Poland might be the first place where my all-over Adidas gear is finally appreciated. So, we'll see about that.

But anyway, it is, I mean, I don't know. I think, you know, New Year's is starting to wear off a little bit. I don't know if I'm feeling the 2026 vibe.

So, I’m going to figure out something new for this. I got to figure out a good background image here. Anyway, let’s get out of this.

And let’s go over to our wonderful Excel file. Our Excel file of hopes and dreams and wonders. Where we will attempt to give the people what they desire. Anyway, here we go.

Let's put up our wonderful ZoomIt highlighting here. Our very highly accurate ZoomIt highlighting. Hi, Erik. Happy New Year. Happy New Year to you too. On my SQL Server production system, the top waits include CMEMTHREAD and NUMA waits.

Accumulating a few hours of wait time. Your top waits have a few hours of wait time? There's a lot of questions I have here.

Like, I mean, obviously, I don't know. Is this since server startup? How long has your server been up where a few hours is concerning enough that you need to ask me about it?

Anyway, I couldn't find much information online. That's weird, because I've posted about CMEMTHREAD waits. That's a weird one.

But I believe this may be related to parallelism or bad MAXDOP settings. What should I check next? Well, good question. So, trace flag 8048 can be helpful in reducing CMEMTHREAD and NUMA waits.

Way back in SQL Server 2016, Microsoft had this series of blog posts called It Just Runs Faster.

And, you know, some of them were like, all right, cool, yeah, it does run faster. But other ones, like there was one about DBCC CHECKDB running faster. And everyone was like, how'd they do it? And they were like, we skip a bunch of checks now.

And you’re like, oh, okay. Oh, that’s not cool. So, you know, how do you run faster? Well, I got in a car. Like, I took a bus and a train.

That’s why I ran faster. So, but there was one of these It Just Runs Faster posts was, so like back in the old, like pre-2016, if you had like more than eight CPUs or something, and you ran into this CMEM thread weight, there was a, I’ll talk about what CMEM thread is in a moment.

There’s a trace flag called 8048. You can enable it globally, like now on your server, but the most effective way of turning it on is when the server starts up, because there’s a whole bunch of memory stuff that is already sort of allocated and accumulated that 8048, to turning 8048 on in flight won’t help with.

So if you make it a startup trace flag, it affects way more things. But the whole, the reason, like CMEM thread is effectively like, like memory structures being, and like having like exclusive latches taken on them, and other things trying to take latches can’t get latches on things.

So CMEM thread is sort of like a memory safety weight. So it’s not really to do with parallelism. You’ll most, I see a lot of, there’s been a lot of stuff around it with sort of like the plan cache and query store.

And it's essentially just, you know, in the context of the plan cache, think of it like: something's putting a plan in the cache, and other things are trying to figure out if a plan is in the cache, and stuff like that.

So it's a whole mess of latches. You could try trace flag 8048 to see if that alleviates it, regardless of whether you are on 2016 or newer.

The whole deal with 2016 was that if you hit certain thresholds, or reached a certain level of contention, then SQL Server would try to change, on the fly, the way that memory is partitioned, from per NUMA node to per CPU, so that there was less contention between different accessors of things in memory.

And that kind of worked. But if you just keep approaching the threshold, or keep approaching the contention that it would worry about, without ever actually getting there, then the It Just Runs Faster part doesn't kick in.

So what you really have to do, or rather the easiest thing to do, assuming you have access to SQL Server's settings:

Go into SQL Server Configuration Manager, add -T8048 as a startup trace flag, and reboot your SQL Server whenever you feel like it, because, you know, who cares?
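For reference, a hedged sketch of both routes; the startup parameter is the one described above, and turning it on in flight is the less effective option:

```sql
/* Less effective: enable globally right now, without a restart.
   Already-allocated memory structures won't be repartitioned. */
DBCC TRACEON (8048, -1);

/* More effective: add -T8048 to the startup parameters in
   SQL Server Configuration Manager, then restart the instance.
   Confirm it took effect afterwards: */
DBCC TRACESTATUS (8048);
```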

And you'll be good from there, or you'll be better from there. But, you know, I mean, CMEMTHREAD and NUMA waits, you really have to be hitting quite a bit of that.

There were some spinlocks at one point that made sense to look at, but I'll be damned if I can remember those hieroglyphics off the top of my head. But, yeah, trace flag 8048 should straighten you out.

But, you know, what you've presented me with here is, you know, a question that I can only generally answer.

I can't tell you if turning on trace flag 8048, and alleviating some or any of the CMEMTHREAD waits that you're currently experiencing, will massively improve your SQL Server performance experience.

It might make you feel better, right? It might be like a CXPACKET thing, where people are like, I changed MAXDOP to one.

CXPACKET's gone. Oh, I can sleep at night. Like, I don't know if it's going to make your life better. But if you really want to get rid of it, or just wanted to know some SQL Jeopardy stuff, there you go.

But, I don’t know if it’s going to make your life better there. I don’t know if it’s really going to improve anything. Let’s see here. Oh, we got another happy new year. Oh, we got two happy new years.

Hi, Erik. Happy new year. On my SQL Server production system, I see that the VDI_CLIENT_OTHER wait is at the top of the wait list. Is this connected to slow AGs? Can I ignore it safely?

I have never seen that thing matter. I can't for the life of me figure out why that would matter. I guess it could be slow AGs. And I remember it maybe being connected to the initial direct seeding of databases.

So, again, really going back, time machine episode here: 2016 introduced a feature for AGs called direct seeding, where anytime you added a database to SQL Server, it would be like, hey, new database.

You're going in the AG. Now we're just going to take a backup, single-threaded, and throw you on the other server. And for big databases, that could not be fun. So, I guess maybe direct seeding could be related, but just general AG activity? No.

I would ignore that one, unless you are actively direct seeding databases. And then maybe, maybe not. Okay.

Let's see here. We increased MAXDOP from 4 to 8, and the query stopped spilling to tempdb and actually ran faster. Isn't more parallelism supposed to make things worse? I mean, all right.

So, it's a fair question. For a single query, there is usually a point of limited return on investment for more parallelism, and certainly a point where there would be no more return on investment for increased parallelism.

The point at which you hit that in your MAXDOP setting, or MAXDOP hint for the query, rather, will vary from query to query. You know, the more rows your query processes per thread, probably the more it slows down, kind of.

So if you have more threads available to process rows and break up those units of work into smaller bits, you will eventually hit a point where you find, like, the perfect DOP for a query. But, you know, more parallelism is not generally bad or worse.

I mean, more parallelism could be generally bad for the workload, if you were just like, you know what, every query MAXDOP zero, let's just see what happens.

Like, that's probably not a great idea. But these are the kinds of things that I like experimenting with, because sometimes you'll find a little sweet spot for MAXDOP when you're working on this stuff.

But, no, in general, more parallelism won't make things worse. And you may find that in your case, the reason why you stopped spilling to tempdb was that the memory grant was a little bit more effective per thread there.

Maybe you had some imbalanced parallelism, where one or more of the threads were getting way more rows than the other threads. Because, of course, every query plan in SQL Server starts off as a serial plan.

It's only if your query meets the cost threshold for parallelism that parallel plans are explored. And, of course, if SQL Server finds a cheaper alternative parallel plan, then it will use that, right?

But what happens is that the memory grant for a query is assigned to the serial plan. And the serial plan will split up that memory grant among your DOP threads if a parallel alternative is chosen.

So it is entirely possible that the reason you stopped spilling to tempdb was because the row distribution across threads got better and the memory grant per thread was more effective.

That is most likely the explanation for why it stopped spilling to tempdb. But to answer your second question: no, more parallelism doesn't make things generally worse for a query.

You may find that it does not improve things as much as you would hope past a certain point, like if you went from 8 to 16 or something.
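The kind of experiment being described is easiest with a per-query hint rather than a server-wide change. The table and columns here are hypothetical:

```sql
-- OPTION (MAXDOP 8) caps this query's degree of parallelism
-- without touching the server-wide MAXDOP setting.
SELECT p.OwnerUserId, COUNT_BIG(*) AS posts
FROM dbo.Posts AS p
GROUP BY p.OwnerUserId
OPTION (MAXDOP 8);
```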

But, you know, that's probably good enough there. All right. We replaced a CTE with a temp table and the query got faster.

Many such cases. So, CTEs must be inherently slower. Should we avoid CTEs for performance? Well, I mean... it's an interesting question for me.

I generally dislike CTEs because too many people like them. That's really...

But CTEs in SQL Server have limitations, or rather a lack of featurehood, compared to other databases.

But let's see. How far do I want to go on this? All right. We're going to go all the way. So, yeah. Generally, a CTE is not materialized in any way in SQL Server.

So, by dumping data into a temp table, you get a materialized result set, and the optimizer can make generally much better choices off a materialized result set than off whatever query was in the CTE being inlined into the remainder of the query.

Especially if things get quite large and complex, then the optimizer tends to break down in all the usual places. So, you know, replacing a CTE with a temp table can in many, many, many cases get you improved query performance.
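As a sketch of the rewrite being discussed, with hypothetical tables: the CTE version inlines its definition into the outer query, while the temp table version materializes the intermediate result, with real statistics the optimizer can use:

```sql
-- Before: the CTE is not materialized; it is inlined into the outer query.
WITH recent AS
(
    SELECT p.Id, p.Score
    FROM dbo.Posts AS p
    WHERE p.CreationDate >= '20250101'
)
SELECT r.Id, r.Score
FROM recent AS r
JOIN dbo.Votes AS v
    ON v.PostId = r.Id;

-- After: the temp table materializes the result set first.
SELECT p.Id, p.Score
INTO #recent
FROM dbo.Posts AS p
WHERE p.CreationDate >= '20250101';

SELECT r.Id, r.Score
FROM #recent AS r
JOIN dbo.Votes AS v
    ON v.PostId = r.Id;
```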

But if the query performance is fine with it, then there's no need to go tinkering with things. I used to think that it would be great if Microsoft would give us the option to materialize CTEs.

But right now, the only facility available for materializing CTEs would be a spool. And spools in SQL Server are a complete nightmare. So, you know, we don't want any more spooling in our query plans than we currently have.

And so, my thinking is that materialized CTEs in SQL Server would be pretty crappy unless Microsoft were to update the code for spools beyond its current state, which hasn't changed since SQL Server 7.

But of course, Microsoft is busy providing us with groundbreaking, or just breaking, products like Fabric. It actually hasn't broken any ground, but it sure has broken a lot.

So… Yeah. All right. Hmm.

Hmm. Hmm. We captured the execution plan when the query was fast and when it was slow. They’re identical. Since the plan didn’t change, this can’t be a query problem, right?

Well, I mean, you know... let's see. You're maybe right that it's not a query problem, but what it most likely is is a parameter sensitivity problem, or, if you want to think about it slightly differently, a data distribution problem.

All right. Because your query ran fast when it had to process, let’s say, this amount of rows and then…

Actually, let’s say this amount of rows. And then your query ran slow when you had to process this amount of rows. Now, this two could be literally two. Could be 200. Could be 2,000.

And this five, well, this five could be 50,000 or 500,000 or 5 million rows, right? So what you have is a parameter sensitivity issue. What you might want to try doing, just to see it, just to make sure…

Like, I’m not saying this is your long-term fix. Of course, there are many ways to fix a parameter sensitivity issue or a data distribution skew issue. But you could try running your query with a recompile hint and seeing if, like, the plan changes and seeing if some of those plans capture lots and lots of rows and some of them capture very few rows.
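The recompile experiment described above, sketched against hypothetical tables and a hypothetical parameter:

```sql
-- OPTION (RECOMPILE) compiles a fresh plan for each execution's actual
-- parameter value, so you can see whether the plan (and the row counts
-- flowing through it) change between the fast and slow parameters.
SELECT p.Id, p.Score
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @owner_user_id
OPTION (RECOMPILE);
```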

It could also not be that, right? It could be that the query got blocked the one time it ran slow. Because if you’re looking at wall clock time, you might see the query take one millisecond one time and 10 seconds another time, while the query actually only used that one millisecond of CPU time both times.

And something else got in the way, right? So it could be blocking. It could have been some other resource that was under contention at the time when it ran. You know, it could have been CPUs overloaded. It could have been memory grants overloaded.

It could be that the data was in memory the first time it ran and not in memory another time, and it slowed down because it had to read a bunch of data from disk into memory. So, you know, the execution plan not changing…

I mean, everyone loves a nice stable execution plan, right? Yeah. But lots of stuff could have been different at the time the query ran slow.

It is your job to figure out what was different, whether it was something within the query, like the number of rows that the query had to process, or something external to the query, like the query was blocked or the query was under some other type of resource contention.
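One rough way to spot the external kind of problem is comparing CPU time to elapsed time in the plan cache. The DMV columns here are real; treat the query as a starting point, not a diagnosis:

```sql
/* Queries where elapsed time dwarfs CPU time were probably
   waiting on something external: blocking, disk, memory grants. */
SELECT TOP (20)
    qs.execution_count,
    avg_cpu_ms =
        qs.total_worker_time / qs.execution_count / 1000.0,
    avg_elapsed_ms =
        qs.total_elapsed_time / qs.execution_count / 1000.0,
    query_text =
        SUBSTRING(st.text, 1, 200)
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY (qs.total_elapsed_time - qs.total_worker_time) DESC;
```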

So, there we have that. Anyway, that is, let’s just make sure here, one, two, three, four, five questions asked and answered.

That will wrap up this office hours. Wow, nearly 20 minutes. That’s a good one. That’s a good office hours. There was lots to say, I guess.

I hope I didn’t repeat myself too much. All right, let’s go. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Tomorrow, and for the remainder of this week, we will be back to talking about the vector-y things.

We have some interesting new stuff to talk about. We have some new material. We will be doing some groundbreaking material, unlike Microsoft working on Fabric.

So.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: A Little More About Vector Types



Video Summary

In this video, I delve into the new vector types introduced in SQL Server 2025, focusing on their practical applications and limitations. I explore how these vector data types, particularly float32 and the preview-only float16, can be utilized for embedding models and similarity searches, while also highlighting the importance of consistent data type usage to avoid performance issues and bugs. The video covers key points such as the configuration settings required to access these features, the differences in precision between float32 and float16, and the current limitations on operations that can be performed with vector types. Additionally, I provide a detailed explanation of why Microsoft’s preview feature flag might be less than ideal for developers eager to experiment with new technologies.

Full Transcript

Erik Darling here with Darling Data. And today we’re going to look a little bit more closely at the new vector types in SQL Server 2025. This material and far more is all available in my new Get AI Ready with Erik course. The link you see up there will get you a hundred bucks off the course. That link is down in the video description. You could also pause the video, zoom in, and type that whole thing out. But the main point is: if you go to training.erikdarling.com and use the coupon code AIREADY, you can get the course for a little bit cheaper. But the vector data type, at least at present, is a vector of n, where n is the number of dimensions that will be generated by your embedding model. For right now, the generally available, fully supported for production usage type is float32. The data type will look like this. The 1024 just assumes that you’ll be using 1024 dimensions. It’s not the limit or the only number you can use. You can use a vector of any length, right? Just don’t mix them. There is a minimum of one dimension for float32, and there is a maximum of 1,998. Float32 is four bytes per dimension. There is also a float16, but float16 is preview only.
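As a rough sketch of what that declaration looks like (a three-dimension vector is used here just so the literal fits on one line; a real embedding column would be vector(1024) or similar, and the table name is made up):

```sql
CREATE TABLE dbo.vector_demo
(
    id integer NOT NULL PRIMARY KEY,
    /* n in vector(n) is the dimension count your embedding model
       produces; float32 is the default element type */
    embedding vector(3) NOT NULL
);

/* Vectors are written as JSON-style arrays of floats. */
INSERT dbo.vector_demo
    (id, embedding)
VALUES
    (1, '[0.1, 0.2, 0.3]');
```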

If it looks like this, you would have to put float16 as the data type precision there. Again, there’s a minimum of one dimension, but this one doubles the maximum number of dimensions you can have, from 1,998 to 3,996. And this is only two bytes per dimension. Now, as with anything, always be specific and consistent about data types. The number of stupid bugs and performance issues and things breaking I’ve seen crop up over the years because of type sloppiness is bordering on absurd. Do not become a sloppy statistic. And this is not just limited to vector types; it applies across all data types. Even something as simple as matching Unicode and non-Unicode, making sure your strings are the correct type across everything, and that your dates and datetimes and datetime2s all match up correctly, is very, very important.

So, Microsoft has this new database scoped configuration called preview features, where you can say, hey, SQL Server, I want to see the new stuff. Normally, these things would be hidden behind trace flags that you could turn on and off, and you would get a trace flag per feature. I don’t love this. And the reason I don’t love this is because when you do this, you get all the preview features. If you’re using a preview feature and, say, a cumulative update comes out and your feature is not generally available yet, you have to leave preview features on, and you might get a new preview feature that you don’t want.
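The database scoped configuration he’s describing looks like this:

```sql
/* Turns on all preview features for the current database;
   there is no per-feature switch. */
ALTER DATABASE SCOPED CONFIGURATION
SET PREVIEW_FEATURES = ON;

/* Verify the setting. */
SELECT
    dsc.name,
    dsc.value
FROM sys.database_scoped_configurations AS dsc
WHERE dsc.name = N'PREVIEW_FEATURES';
```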

And you’re like, no, this thing has messed me up, right? I personally think this is kind of annoying, a little scary, but who knows? Maybe Microsoft is committed to not adding preview stuff in. I’m not nuts about this, but if we want to see the float16 vector type, we do have to use this. This is also necessary for a couple other things in the vector world, like the vector search function. We can use the vector distance function; that’s generally available, that’s everywhere in 2025, etc.

But the vector search and vector indexes are not. So we will have to use this one to see any of that stuff later. But, you know, we can generate a float32 with the maximum vector size, which for this is not going to look very interesting, because I was lazy and I just generated the number 1.0 over and over again. Please don’t yell at me. I just wrote this to do it. It’s not meant to be anything interesting.

Or at least not meant to be anything like an actual vector that you would see in the real world. But for the float16, actually, something that I do want to bring up here is: I’m going to leave this open, and then I’m going to come back and we’re going to look at this for float16. Now, keep in mind that I am generating the same 0.1 for all of these, right?

Except just different numbers of them, because with float16 I can generate way more of them. But when I run this, this will have the same number of bytes. Notice, though, that these numbers are different. So in the first one, 1.0 is stored pretty much exactly as 1.0, just as a float.

For this, it’s 9.99, and see, there’s a little negative exponent here. That means it’s not 9.99; it’s actually 0.0999-something. If you were to convert this to an actual decimal or something, it would be a little bit more obvious. But for this one, it’s going under: it’s not 9.99, it’s 0.0999-whatever.

This is because when you’re representing numbers with two bytes versus four bytes, you lose a little bit of precision. The value 0.1 can’t be stored exactly in binary, just like one third can’t be written exactly in decimal: it’s 0.333 with threes forever.

Float16 uses fewer bits to store the number, so the approximation of 0.1 is just slightly less accurate than float32. Values like 0.5 or 0.25, though, would be stored exactly in both formats. But for similarity search, for the cosine similarity search that we largely care about, right?

Like, that’s the stuff that we’re mostly going to be doing with our data. Those tiny differences don’t really affect results. I did go through an exercise in the course comparing float32 and float16.

And there was really hardly any difference in any of the results. So the loss of precision doesn’t really hurt cosine similarity search. Don’t worry too much about that. But again, float16 is still a preview feature.
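For reference, a cosine comparison with the generally available function looks something like this (the values are made up):

```sql
DECLARE
    @a vector(3) = '[1.0, 2.0, 3.0]',
    @b vector(3) = '[1.0, 2.0, 3.1]';

/* 0 means identical direction; larger means less similar.
   'euclidean' and 'dot' are the other supported metrics. */
SELECT
    cosine_distance =
        VECTOR_DISTANCE('cosine', @a, @b);
```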

So you might not see that, or you might not use that in production for a long time. Because who knows. Microsoft has made roadmaps for Management Studio and for Fabric, but for SQL Server stuff, they’re like, you’ll get it when you get it.

There’s no roadmap. You’ll just, you’ll just have to wait and see dummy. So, you know, real nice. Anyway, these things do have hard limits, right?

So if we try to go over, we will get errors, and SQL Server will yell at us and say, you cannot go past those limits. Different float precisions are another story.

So, like I said a couple of videos ago, when there was a vector of three and a vector of four, we couldn’t compare those using the vector distance function.

Same deal, at least at present, with float16 and float32. So even if we have two vector types with the same number of dimensions in them, SQL Server is like, well, float16 and float32, you are just too different for me.

I can’t figure you out, right? Two bytes, four bytes, what are we doing here? So, you know, whatever you, whatever you choose, which will probably be float 32 because it’s generally available now.

If you decide you want to change to float 16 later, because it’s like float 32 is just like way too much storage space. Good luck.

Good luck. But there’s a lot of stuff you can’t do with the vector data types. And in a lot of ways they’re very similar to XML and JSON, where there are just a lot of things that don’t work, right?

Like, you can’t group by, order by, or do direct equality comparisons, stuff like that. Like if I declare two vector types here, which match in every conceivable way…

And I just say, hey, if one is equal to the other, just say yes. SQL Server is like, uh, huh? No, no can do. Sorry.

Like, this just ain’t happening for you today. Right? So… oh, you know what? That blew up both of these. Gosh darn it. Knew that looked silly. But no comparisons, just like with XML, right? If you were to try to do something similar with XML: no, right?

Group by XML. What? Out of your mind. But you can just test to see if it’s null, right? So if we have two vectors and one of them is null and one of them has some numbers in it, SQL Server is like, yeah, fine.

We can figure out pretty easily if we’re allowed to do that. So at least the null checks do work. But there’s a lot of stuff, query-wise, that you can’t do with them, which mostly isn’t a big deal, right?
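A quick sketch of the difference: IS NULL checks work, direct equality does not:

```sql
DECLARE
    @a vector(3) = '[1.0, 2.0, 3.0]',
    @b vector(3) = NULL;

/* This is fine. */
SELECT
    null_check =
        CASE WHEN @b IS NULL
             THEN 'null'
             ELSE 'not null'
        END;

/* This is not; uncommenting it throws an error, because vectors
   don't support direct equality comparison.

SELECT equality = CASE WHEN @a = @a THEN 'same' ELSE 'nope' END;
*/
```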

Like, you can’t use the equals, less than, or greater than comparators. You can’t concatenate or add two vectors together. You can’t sum or min or max a vector, and what would that even look like anyway?

Again, you can’t group by or order by. And they also have very limited constraint support, at least at present. I don’t know if any of this stuff might change. There’s some stuff that seems a little weird.

Like, you are allowed to use null and not null constraints on them: you’re allowed to say whether a vector column will allow NULLs in it. But you couldn’t have a default value. So say you had a vector three column and you just wanted a default constraint of, like, 0.0, 0.0, 0.0 or something, right?

Just like three zeros. You can’t do that. You also can’t have any check constraints on them. I don’t know what you would check in a vector, but also not allowed.
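Sketched out, the constraint situation looks like this (the table name is made up):

```sql
CREATE TABLE dbo.constraint_demo
(
    id integer NOT NULL PRIMARY KEY,
    /* NULL / NOT NULL is the constraint that works. */
    embedding vector(3) NOT NULL
);

/* Default and check constraints are the kind of thing that
   errors out; this would fail:

ALTER TABLE dbo.constraint_demo
ADD CONSTRAINT df_embedding
    DEFAULT ('[0.0, 0.0, 0.0]') FOR embedding;
*/
```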

You know, no primary keys, no foreign keys, no unique constraints, all sort of understandable stuff. You can’t use them in columnstore indexes, either. My dear, dear friend Nico had a blog post about that, I guess, a week or so ago.

But, you know, that’s in the course material, so I didn’t steal that from him. Probably never steal from Nico. Well, I might steal something; he had some nice hats.

I might steal one of his hats someday, but I don’t know, I’d feel bad about it. You can’t have them in columnstore indexes, and you can’t have them in include columns for rowstore indexes. But you can have more than one vector column in a table, which sounds like a nice sort of cheat for something, maybe.

But one thing that I always love figuring out when these new things come out is sort of how much feature coverage they have with other things.

So like right now, we have the vector distance function in SQL Server, where we can see how similar two vectors are, depending on what metric we’re using. We can use cosine, dot product, whatever.

But for us, mostly cosine. So if we wanted to have two vector columns in a table and figure out how similar those are with a computed column: the vector distance function is, at least for now, not deterministic.

So we couldn’t persist that computed column, and we can’t index that computed column. The vector distance function would have to run every time we looked at it. So it’s kind of annoying, right?

Like, I don’t know who wrote the spec on this thing, but maybe they didn’t think a lot of stuff through. That’s sort of typical of SQL Server these days. Anyway, that was a little bit more about the vector data type: some of its limitations, what you can and can’t do with it.

A little bit about float32 versus float16. Again, float32 is generally available and float16 is preview only at this point. And of course my trepidations about the preview features flag.

So got some stuff there. Anyway, thank you for watching. Hope you enjoyed yourselves. Hope you learned something. Hope you buy the course and I will see you over in, well, today’s, today’s Friday, isn’t it?

Yeah. So I will see you Monday for office hours. Look at us go. All right. We’re really back in the swing of things here. All right. Goodbye.


Get AI-Ready With Erik: A Little About The StackOverflow Demo Database



Video Summary

In this video, I dive into the demo database used for my new course, Get AI Ready with Erik, which focuses on vector data and search strategies in SQL Server 2025, Azure SQL Database, and Azure Managed Instance. I explain why I chose to use Stack Overflow 2010 instead of the larger Stack Overflow 2013 database, highlighting its smaller initial size (about 1 gig download and 10 gigs in database size) that grows significantly after generating embeddings and other vector data types, reaching around 40 gigs. Throughout the video, I discuss various aspects of the database, such as the benefits and challenges of using different columns for embedding generation, like question titles and bodies, and how to handle text length limits by chunking longer texts to ensure meaningful context is maintained across embeddings.

Full Transcript

All right, Erik Darling here with Darling Data. And we’re going to, like I said, spend a little time in January talking about my new course, Get AI Ready with Erik, which is all about dealing with the vector stuff in SQL Server 2025, Azure SQL Database, and Azure Managed Instance. Today, we’re going to talk a little bit about the demo database that I’m using here. Most of my training uses a version of the Stack Overflow database called Stack Overflow 2013. But for this course, I started with a smaller one, because this course isn’t necessarily about performance tuning in all its glory, where a bigger database presents more performance problems. This is just about getting started with the vector stuff: learning how search works, generating embeddings, chunking, all the stuff like that, various different search strategies. For this course, I’m using Stack Overflow 2010, because it’s a lot smaller to start with. It’s about a 1 gig download and 10 gigs in database size. But once we generate all the embeddings and stuff, it ends up being around 40 gigs, right? So this thing gets really big compared to where it started. If I did this with the Stack Overflow 2013 database, which is about 50 gigs, it may well have ended up between 100 and 150 gigs once all the embeddings got generated. So let’s talk about a few things that make for good embeddings in Stack Overflow. In the posts table, we have things like the question title, which is great, right? Because it’s short, it’s compact, it’s easy to fit the entire thing into one embedding. Then we have the question and answer bodies, which are also good, but they present some challenges because they tend to be longer text. We may need to think about chunking the longer ones up, because if we don’t, they would get silently truncated after a certain number of tokens.
We’ll talk about all this more as we go through things. But just so you kind of get an understanding of where we’re going with the course: I’m going to jump ahead a little bit in the things that I say, but if there’s anything you don’t understand, just make a mental note, write it down. I promise you, something will come up later where we’ll talk about it. Of course, this is all fully covered in the course material that I have.

And then we have the tags column, which is not really good for generating embeddings on, because we’re probably not going to be looking for similar tags for a lot of stuff. But the tags do make for a good sort of pre-filtering element, to sort of guarantee that something we’re getting is actually what we’re looking for. Again, stuff we’ll talk about as we go through. In the users table, there are some things that are, you know, a mixed bag. The about me column in the users table would be good for generating embeddings, because it might help you find users who would be good at answering certain questions, based on what they’ve typed in their little biography field. For location, this one you could talk me into, because in Stack Overflow land, location is not a dropdown. It is a free text entry form, and people can write in all different things.

Like, if we wanted to find a bunch of people in New York: we could spell out New York. We could have NY. We could have New York, New York. We could have Brooklyn, Manhattan, Queens, Staten Island, the Bronx, right? All these different things that could indicate New York, but aren’t just the words New York written in. So maybe location would be kind of cool. Website? Unless you were looking for someone spamming some domain, like an online casino or something. But probably not, because you could already find that pretty easily. In the comments table, there is a text column with a comment in it, but we don’t read the comments here.

So probably not that. It would be like trying to find similar user reviews or something, right? I mean, I don’t know, maybe you could find comment spam with that. But comment spam tends to be pretty copy and paste: hey, join my Discord server, win a chance to win 3 million bitcoins or something. The way we want to look at this is what we’re going to use, generally in the posts table, to sort of link different things together.

So, like starting with the question title and then finding other similar questions by title, or starting with the question title and then finding what might be another good answer to the question, but maybe an answer to a different question, right? All sorts of different things. If we look at the breakdown of questions and answers in the posts table, let’s zoom in on this a bit: we have about 2.6 million answers, we have about 1 million questions, and we have a thousand others.

What the others are doesn’t matter. The things no one cares about in Stack Overflow. The titles that we get in Stack Overflow look like this and they’re generally pretty short, right? People tend to write pretty short, pretty descriptive titles for their questions to sort of like, you know, get the attention of people who could answer questions.
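The breakdown he’s showing can be reproduced with a query along these lines (in the Stack Overflow schema, PostTypeId 1 is a question and 2 is an answer):

```sql
SELECT
    post_type =
        CASE p.PostTypeId
             WHEN 1 THEN 'question'
             WHEN 2 THEN 'answer'
             ELSE 'other'
        END,
    total_posts = COUNT_BIG(*)
FROM dbo.Posts AS p
GROUP BY
    CASE p.PostTypeId
         WHEN 1 THEN 'question'
         WHEN 2 THEN 'answer'
         ELSE 'other'
    END
ORDER BY total_posts DESC;
```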

Right. Like people doing their similarity search for things they know about. Like, how does one get the most recent commits? And they go, oh, I know how to do that. I can answer that question. Right? So, you know, the human brain pattern matches.

So this is kind of what stuff generally looks like. And when I said that titles are generally short, it’s because people don’t want to write their life story and their whole question in the title. But question titles can also be somewhat vague, yet still helpful. Right.

Because if you look at the DBA Stack Exchange site, there are going to be a ton of questions like, how to optimize this SQL query. Right? Over and over again. You see that copy and pasted in.

And then the actual details and the useful stuff are going to be in the question body. So the question titles are great because they tend to be short, and it’s very easy to generate fully self-contained embeddings based on short text like that.

In the body column, this stuff tends to get a bit longer, right? So if we look at… this is taking forever. Why is this so slow? We’re just ordering by score descending. Jeez Louise.

Bodies. There’s some interesting stuff to deal with in the bodies that can influence some of your choices, which we’ll talk about later. One of the really annoying things is that in the body column, Stack Overflow stores all the HTML, and that HTML tends to count towards the number of tokens and stuff.

It’s, it’s aggravating, but it’s the best that we’ve got right here. I didn’t want to write a function to clean HTML out of Stack Overflow questions. I’m sorry. I guess I could have had AI do it.

But this is what bodies tend to look like. Now, if we look at the distribution of title lengths in the posts table, we’ll see that a lot of them are on the very short side. So, between, like, one and a hundred… I actually don’t know what the minimum number of characters a question title has to have is.

So let’s just say between one and a hundred characters, right? Like the majority of the questions that we have fall into that. Most of them are between 30 and 59 and a few of them are on the longer side up here.

But they’re still short enough to be pretty self-contained, as far as generating one single self-contained embedding goes, without having to think about breaking your text up over multiple embeddings. Right.
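A distribution query like the one on screen might look something like this, bucketing titles into 30-character bands:

```sql
SELECT
    length_bucket =
        (LEN(p.Title) / 30) * 30, /* 0, 30, 60, ... */
    titles = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1 /* questions only */
GROUP BY
    (LEN(p.Title) / 30) * 30
ORDER BY
    length_bucket;
```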

So if we look at body lengths and run the same kind of query: a lot of them are less than 500 bytes, but a lot of them tend to get pretty big. And this is where you have to start worrying about text length limits for your embedding model.

The reason why this matters is because embedding models convert text to what’s called tokens. If you’ve ever used an LLM, maybe a command line thing like Claude Code, or Cursor, or something like that, you’ll notice that when you put things in, there’s this little counter that says tokens, and it racks up pretty quickly, like you’re trying to get a high score.

But, you know, all those embedding models convert text to tokens. You get about four characters per token. That’s why I was getting kind of annoyed about the HTML being stored in there. Different embedding models will have different token limits, right?

And those token limits will define how much text goes into generating an embedding. So if your embedding model has a limit of, say, 512 tokens, the text beyond those 512 tokens just gets silently thrown away, right?

And 512 tokens is maybe around 400 words, or about 2,000 characters. A lot of question bodies exceed that limit. But this is where the concept of chunking comes in, right?
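Using that rough four-characters-per-token rule, you can estimate how many bodies would blow past a 512-token limit with something like:

```sql
/* ~4 characters per token, so a 512-token model silently
   truncates anything past roughly 2,048 characters. */
SELECT
    over_limit = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
AND   LEN(p.Body) / 4 > 512;
```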

Where you take your long bodies and chunk them up. You take something that would be, say, 1,024 tokens…

You break it up into two chunks and you generate your embeddings over both. And those chunks can have a percentage of overlap to maintain context across them. So let’s say that you had a short paragraph first and then a longer paragraph.

You would want to have some overlap, so that the split isn’t just a hard cut: the first half of the paragraph in one chunk and the second half in the other. You would lose some context and meaning if you did that.

But the whole point of chunking with the overlap is to maintain some context across longer texts. Anyway, that’s about enough for this. Thank you for watching.
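A minimal sketch of fixed-size chunking with overlap in T-SQL; the 2,000-character chunks and 400-character overlap here are illustrative numbers, not recommendations from the course:

```sql
DECLARE
    @text nvarchar(max) =
        N'pretend this is a very long question body...',
    @chunk_size integer = 2000,
    @overlap integer = 400,
    @step integer;

/* Each chunk starts chunk_size - overlap characters after the
   previous one, so neighbors share @overlap characters. */
SET @step = @chunk_size - @overlap;

SELECT
    chunk_number = n.chunk_id + 1,
    chunk_text =
        SUBSTRING(@text, (n.chunk_id * @step) + 1, @chunk_size)
FROM
(
    SELECT TOP (CONVERT(bigint,
                CEILING(LEN(@text) / (@step * 1.0))))
        chunk_id =
            ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1
    FROM sys.messages AS m
) AS n
ORDER BY chunk_number;
```

Each chunk would then get its own embedding row, tied back to the post it came from.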

I hope you enjoyed yourselves and I’ll see you over in tomorrow’s video where we will have some additional vector data type things to talk about. Anyway, see you tomorrow.

Goodbye. Goodbye.


Announcing Free SQL Server Performance Monitoring That Doesn’t Suck



For years, enterprise SQL Server monitoring tools have been charging you an arm and a leg for stagnating products made by people who have never done an ounce of SQL Server performance tuning or DBA work. They phone home with your data, lock you into annual contracts, and still can’t tell you why your server is slow without you doing half the work yourself.

That is, if you survive the installation process. Active Directory. Domain accounts. Firewalls. Security policies. Additional database servers. Annoying.

Or just send everything to the cloud, somewhere.

I got real tired of dealing with that, so I built a free, open-source SQL Server performance monitoring tool that isn’t any of those things, and doesn’t suck.

You can find it on GitHub.

[Image: SQL Server Performance Monitor, dashboard landing page with server health cards]
NOC-out

Here’s what it does, and why you might want to check it out.

The Short Version


SQL Server Performance Monitor is a free tool that collects and visualizes performance data from your SQL Servers.

It comes in two editions:

  • Full Edition — installs a PerformanceMonitor database on each server with T-SQL collectors running via SQL Agent. A separate WPF Dashboard connects to view everything in that database, and that database only. It can’t see anything beyond that.
  • Lite Edition — a single desktop app that monitors SQL Servers remotely. Nothing gets installed on the target server. Data gets pulled into a local embedded DuckDB, with automatic Parquet archival.

At no point does your monitoring data go anywhere outside of your server or your computer, and the connection method is the same one SSMS uses.

Both editions include real-time alerts (system tray & email), pretty charts and graphs, and a built-in MCP server for LLM-powered analysis with tools like Claude.

Supported platforms:

  • SQL Server 2016, 2017, 2019, 2022, 2025
  • Azure SQL Managed Instance
  • AWS RDS for SQL Server
  • Azure SQL Database (Lite only)

Download on GitHub

Who Is This For?


Not everyone needs enterprise-scale monitoring. If any of these sound familiar, this tool is for you:

  • Solo DBAs and small teams: You’re responsible for a handful of SQL Servers. You need to know what’s happening, but you don’t need a platform that costs more than the servers it monitors and only tells you when PLE is low and how many KB of free space your disks have.
  • Consultants and contractors: Drop it on a client’s server, collect real data, diagnose their problems without having to spend time on data collection. The Lite Edition doesn’t even require installation on the target, just connect and go.
  • Developers who care: Your app is slow and you suspect the database (as everyone does). Point this at your server, run your workload, and see exactly which queries are hurting your server.

Two Editions, One Goal


Full Edition (Server-Installed)

[Image: Full Dashboard with wait stats, CPU trends, and blocking chains]
Actual Colors May Vary

The Full Edition installs a PerformanceMonitor database directly on each SQL Server you want to monitor. T-SQL collector stored procedures run via SQL Agent, gathering everything from wait stats and query performance to blocking chains and deadlock graphs and more.

What you get:

  • 30+ specialized T-SQL collectors running every 5 minutes (you can control the schedule, and which ones run)
  • Data stored in SQL Server tables you control (query them directly in SSMS or whatever)
  • Community tools installed automatically (sp_WhoIsActive, sp_BlitzLock, sp_HealthParser)
  • NOC-style landing page with green/yellow/red health cards per server
  • Automatic data retention (30 days default, configurable per table)
  • Built-in MCP server with read-only tools for AI analysis (disabled by default)

Best for: Production servers where you want continuous 24/7 collection with maximum data fidelity and access.

Requirements: sa on the target server for database install, SQL Server Agent running. Collectors only need VIEW SERVER STATE or VIEW PERFORMANCE STATE.

Lite Edition (Standalone Desktop App)

Lite Edition main window with sidebar and query performance charts
Lite Brite

The Lite Edition is a single WPF application that monitors SQL Servers remotely. It queries DMVs directly over the network and stores data locally in embedded DuckDB with automatic Parquet archival. Nothing is installed on the target server.

What you get:

  • 20+ collectors with configurable schedules
  • Local DuckDB storage with ZSTD-compressed Parquet archives (~10x compression)
  • Multi-server monitoring from one seat
  • Azure SQL Database support (the only edition that supports it)
  • Single executable — extract, run, connect
  • Built-in MCP server with 31 read-only tools for AI analysis

Best for: Quick triage, Azure SQL Database, locked-down servers where you can’t install anything, consultants, firefighting.

Requirements: VIEW SERVER STATE on the target server (VIEW DATABASE STATE for Query Store), or just VIEW PERFORMANCE STATE where supported.

Side-by-Side Comparison


Feature | Full Edition | Lite Edition
Installation on SQL Server | Required (database + jobs) | Not required
Collectors | 30+ (via SQL Agent) | 20+ (in-app)
Data Storage | SQL Server (on target) | DuckDB + Parquet (local)
Azure SQL Database | Not supported | Supported
Alerts (tray + email) | Blocking, deadlocks, CPU | Blocking, deadlocks, CPU
Agent Job Monitoring | Duration vs avg/p95 | Duration vs avg/p95
MCP Server (AI) | Yes | Yes
Community Tools | sp_WhoIsActive, sp_BlitzLock | Not needed
Execution Plans | Collected, downloadable | Collected, downloadable
Portability | Server-bound | Single executable

Alerts That Actually Tell You Something


Alert email notification showing blocking details with query text
You oughta know.

Both editions include a real-time alert engine that runs independently of which tab is visible, even when the app is minimized to the system tray.

What gets monitored:

  • Blocking: fires when the longest blocked session exceeds your threshold
  • Deadlocks: fires when new deadlocks are detected
  • High CPU: fires when total CPU exceeds your threshold
  • Connection changes: fires when a monitored server goes offline or comes back online

How you get notified:

  • System tray balloons with a 5-minute per-metric cooldown
  • SMTP email with styled HTML, query text, blocking chains, and deadlock graph XML attached for offline analysis
  • Resolved alerts: automatic “Cleared” notification when the condition ends

All thresholds are configurable. You can acknowledge and silence alerts per-server and per-tab with a right-click.

MCP Server: AI-Powered Analysis


Claude analyzing SQL Server performance data via the built-in MCP server
Thanks for the summaries

Both editions include a built-in Model Context Protocol (MCP) server. Enable it in Settings, register it with Claude Code or Cursor (but definitely not Co-Pilot).

The MCP server only works off a set of read-only tools, and can only look at your collected performance data.

The tools can’t see anything in any other databases, or execute custom SQL commands.

You can ask questions like:

  • “What are the top wait types on sql2022?”
  • “Are there any blocking or deadlock issues?”
  • “Show me CPU utilization for the last 4 hours”
  • “What are the most expensive queries by CPU?”
  • “Give me a full health check across all servers”

They cover wait stats, query performance, CPU, memory, blocking, deadlocks, I/O latency, TempDB, perfmon counters, and more.

Setup is one command:

claude mcp add --transport http --scope user sql-monitor http://localhost:5151/

Your Data, Your Tables


With the Full Edition, everything lands in SQL Server tables you own. No APIs. No export wizards. Open SSMS and write a query.

-- What were my top waits yesterday?
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    avg_wait_ms
FROM PerformanceMonitor.collect.wait_stats
WHERE collection_time >= DATEADD(DAY, -1, GETDATE())
ORDER BY 
    wait_time_ms DESC;

Reporting views are included for common questions, but you’re not limited to them. Join the collection tables however you want.

Build Power BI reports?

Export to Excel?

Something with Fabric?

I don’t care.

It’s just tables.

What Gets Collected


The Full Edition runs 30+ collectors via SQL Agent. The Lite Edition runs 20+ collectors directly in the app. Here’s what they cover:

Category | What’s Collected
Query Performance | Query stats from plan cache, procedure stats, Query Store metrics, active query snapshots, waiting tasks
Wait & Resource Stats | Wait statistics (delta-based), latch statistics, spinlock statistics, CPU utilization, perfmon counters
Memory & I/O | Memory grants, memory clerks, buffer pool pressure, file I/O latency, TempDB usage
Blocking & Events | Blocked process reports, deadlock graphs, ring buffer events, default trace analysis, system health events

All collectors compute deltas automatically — you see actual work done between snapshots, not cumulative totals that reset on service restart.
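To illustrate the idea (this is a sketch of the technique, not the tool’s actual collector code), a delta over cumulative DMV counters can be computed with `LAG` across snapshots. The snapshot table and column names here are hypothetical:

```sql
-- Hypothetical snapshot table holding raw, cumulative values copied from
-- sys.dm_os_wait_stats every 5 minutes. Names are illustrative only.
SELECT
    wait_type,
    collection_time,
    /* Work done since the previous snapshot, not the cumulative total.
       A negative delta means the counters reset (service restart), so
       we return NULL and treat the row as a fresh baseline. */
    CASE
        WHEN wait_time_ms - LAG(wait_time_ms) OVER
             (PARTITION BY wait_type ORDER BY collection_time) < 0
        THEN NULL
        ELSE wait_time_ms - LAG(wait_time_ms) OVER
             (PARTITION BY wait_type ORDER BY collection_time)
    END AS wait_time_ms_delta
FROM dbo.wait_stats_snapshots;
```

The same pattern applies to any cumulative counter: latch stats, spinlocks, file I/O, and so on.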

Security: We Care A Lot


You’re giving a monitoring tool access to your SQL Servers (I hope, anyway). I did my best to make sure we’re not doing or allowing anything stupid.

What it does:

  • Stores passwords in Windows Credential Manager (DPAPI encryption), never in config files or plain text. Same as SSMS.
  • SQL connections default to mandatory TLS with certificate validation
  • Every SQL query uses parameterized queries, zero string concatenation
  • All monitoring queries run at READ UNCOMMITTED to avoid blocking your production workload
  • Every line of code is open source on GitHub for you to audit, make fun of, steal, fall in love with, etc.

What it doesn’t do:

  • No telemetry: zero analytics, tracking, or usage reporting
  • No phone-home: the app never contacts any external server
  • No cloud dependency: everything stays on your machines and your SQL Servers
  • No auto-updates: you choose when to update, and if you want to check for updates
  • No hidden network calls: the only outbound traffic is SMTP email alerts you configure yourself

Getting Started


Full Edition

  1. Download the installer from GitHub Releases
  2. Run the GUI or CLI Installer and enter your server connection details
  3. Click Install, it creates the PerformanceMonitor database and SQL Agent jobs
  4. Repeat for each server you want to monitor
  5. Launch the Dashboard, add your servers, and data appears immediately

Lite Edition

  1. Download and extract the Lite Edition ZIP from GitHub Releases
  2. Run PerformanceMonitorLite.exe
  3. Click “Add Server” and enter your connection details
  4. That’s it. Data collection starts within 1-5 minutes.

Why I Built This


I spend a lot of time looking at SQL Server performance problems. Clients call me when things are on fire, and the first thing I need is data. What’s waiting, what’s blocking, what’s eating CPU, which queries are the worst offenders, and a bunch of other contextual information to go along with it.

If they don’t have a monitoring tool, we’re at the mercy of aggregated metrics, restarts, and maybe just not looking at the right thing at exactly the right time. It’s also really hard to show improvements over time.

Most monitoring tools are either expensive enterprise platforms, or one-off scripts that don’t persist anything (and certainly don’t give you any pretty graphs to look at).

I wanted something in between: a real monitoring tool that collects meaningful data, shows it clearly, and doesn’t require a procurement process to install.

So I built one. And now you can use it too.

Support & Professional Services


This project is free and open source. Use it freely. No strings attached.

If you find it valuable, here are two optional ways to support continued development:

  • Sponsor on GitHub — fund new features, ongoing maintenance, and SQL Server version support
  • Consulting & Training — need help analyzing the data this tool collects? Want expert assistance fixing the issues it uncovers? That’s what I do.

Download SQL Server Performance Monitor

MIT License | .NET 8 | Source on GitHub

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: A Little About Embeddings



Video Summary

In this video, I dive into the world of embeddings and their importance in bridging the gap between what computers can do and what humans need them to do. I explain why traditional string comparisons fall short when it comes to understanding the meaning behind text, especially for tasks like optimizing SQL queries or improving database performance. By converting text into numerical vectors, embeddings allow us to compare texts based on their semantic similarity rather than just surface-level differences, making it possible for computers to understand and process natural language more effectively.

Full Transcript

Erik Darling here. Back in my AI groove. Today we’re going to talk a little bit about embeddings. Yesterday we talked a little bit about dimensions and how embeddings make up dimensions and stuff like that. So today we’re going to talk about why we have embeddings. What is an embedding and why do we need it? That’s going to be our goal today. Now, the thing is that computers are great at comparing things like numbers, right? Which is what embeddings are made up of, all those little floaty things that we looked at. But computers are very, very bad at understanding meaning. And unless we get into all sorts of terrible wildcard percent searching in our string columns, which we would like to not do, there is really no good way to compare text to see how similar that text is, right? All we can do is say, is this word in this text, or is this pattern in this text? Even regex can’t tell you what something means. So if we were to take the phrases, how do I optimize a SQL query? And what’s the best way to speed up my database queries? If we were to take those strings, the computer would not be able to really compare them in a meaningful way for what their intent is, what they mean to us, right? All the computer can do is compare the little bits and bobs in each string and say, hey, do they match? Are they equal? So to a computer doing string comparison, those are completely different strings, right? Those strings are not equal to each other. But to a human, they’re kind of asking the same question. Embeddings try to bridge that gap. Now, you might hate AI, right?
I mean, I’m, you know, pretty gosh-diggity-darn sick of hearing about AI everything, and having everyone shove an AI thing in their application or in their product and saying, oh look, I changed this page from loading to thinking and now I’ve got AI. But as a DBA, you should be into this. Even as a developer, you should be into this, because you can avoid doing all that really painful wildcard string search stuff in a lot of places.

So like, if we were to take these two strings, right? Again, how do I optimize a SQL database query? And what’s the best way to speed up my database queries? The computer can’t figure that out, right? Like it just says that these things are different, right? These are not the same string dummy, right? But they mean the same thing, right? They have the same general meaning. Even if you were to do like, you know, some awful like search, you know, with a bunch of wild cards in there, which SQL is awful at, or even if you get into like full text search, these strings are not at all equivalent, right? There’s no match, right?

It’s terrible. So what embeddings do is turn text into numbers. So, how do I optimize a SQL query? Well, just to jump ahead a little, if your embedding model generates 1024 dimensions, then that string will become 1024 numbers, right? It’ll be that sort of square bracketed list, at least in how SQL Server shows you these things, of various floats and bits and bobs. And what’s the best way to speed up database queries would also become 1024 numbers. And because of the way that the embedding models are all trained, on the billions and billions of texts that they’ve seen, these two things generate numbers that are pretty close to each other. They’re not going to be exact.

But again, we’re not trying to find exact here. It’s not like if you were searching for an error code, like I’m hitting error 0x, you know, 80085. And you want an exact match for that error, because if you get like error 80086, it’s going to be different, right? It’s not the same error. So like, it’s not an exact match we’re looking for. We’re looking for things that are similar to each other.

So if we were to compare, like SQL Server, like, like the embedding model would take these two strings, turn them into numbers, and then those numbers would become how we figure out how similar they are. And again, the lower number you get using the cosine thing with vector distance, the better off you are. So this is a very, very similar one, because this number is getting pretty close to zero, right?

Like, so this would be a pretty good indication that these strings are similar. If instead we had, like, a one point something over here, that would not be very similar, because cosine goes from zero to two. Right? So we would not have a very similar match there.
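To put rough numbers on that range, here is a hedged sketch using SQL Server 2025’s vector type and VECTOR_DISTANCE. The toy three-dimension vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions:

```sql
-- Cosine distance ranges from 0 (same direction) to 2 (opposite direction).
DECLARE @a vector(3) = '[1.0, 2.0, 3.0]';
DECLARE @b vector(3) = '[1.0, 2.0, 3.0]';    -- identical to @a
DECLARE @c vector(3) = '[-1.0, -2.0, -3.0]'; -- pointing the opposite way

SELECT
    VECTOR_DISTANCE('cosine', @a, @b) AS identical_pair, -- 0: as similar as it gets
    VECTOR_DISTANCE('cosine', @a, @c) AS opposite_pair;  -- 2: as dissimilar as it gets
```

Real embedding pairs land somewhere between those extremes, which is why "closer to zero" is the thing to watch.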

Like, if we took these two strings, like, let’s say we had these two questions. We had, who is the governor of Campania during the Herculonius period? And how do I get pizza off my eyeglasses?

Completely different questions, right? Not even closer, like, like, we were in different worlds. So these would generate completely different embeddings. And these completely different embeddings would generate a cosine that is, you know, that is nearly two, that is nearly, that is almost as high as we can get, as distant as we can get with two, with two sort of text embedding vectors, right?

It’s very, very far apart in the world of cosines. But the whole thing is that each dimension is there to capture some aspect of meaning. What does, what does, what does this string of words mean?

What are we indicating here? What is our intent? Right? And at least nobody I know knows exactly what each dimension represents. That number, like, got me on that one, right?

Search me, baby. But I’m sure there’s, I’m sure someone out there is very, who is very, very smart and very, very good at math can tell you what each, they could probably read dimensions the way, you know, some people can read, you know. Like, like, assembly code.

Like, okay, good for you. You know, some people can read, like, binary. Like, oh, like, that’s, that’s a 17. And you’re like, okay.

Okay, sure it is. But I’m sure someone, someone out there who’s very smart could, you know, explain to you what each dimension represents. But all of these different AI models, right? All of these models learn patterns from billions and billions of text examples, right?

It could be, you know, copyrighted internet material. It could be books they stole. It could be your website, right?

It could be your blog posts. It could be, I don’t know, like, anything, right? Like, all these text examples, these models get trained on them. You know, it’s sort of like, like, OCR stuff where, you know, the, like, different programs will get trained on different sets of images so that they can recognize something from those images. Like, you know, birds or bowling balls or, you know, Adidas t-shirts.

So it’s sort of like the same thing, but like, but just with text. So it’s sort of like, like GPS coordinates for meaning, right? So like New York and Boston, they’re pretty close, right?

They’re nearby each other. New York and Tokyo, they’re, they’re pretty far from each other. I think that’s like almost opposite, exact opposite sides of the planet. So, you know, for like, if you were to take the, the, the, like SQL optimization and query tuning, you know, those are pretty close by, those are close by concepts, right?

They’re very, they’re very near. They are near neighbors to each other. But SQL optimization and like best pizza are pretty far.

Even though the best SQL optimization and the best pizza are in New York, right? That’s, that’s, that’s me. I’m not the best pizza, but.

Ah, whatever. Anyway. Ah, that was off. That was a little off the cuff. Oh, I apologize. I apologize. Ah, but those things are far apart, right? Like, you know, like if you, if you, if you’re trying to find like, you know, like query tuners in New York and you get a list of pizza places, well, you’re probably not going to be too happy with that search.

Anyway, it’s a little bit about embeddings. Um, hope you enjoyed yourselves. I hope you learned something.

And, uh, I will see you over in tomorrow’s video where, uh, we will talk about more stuff from the course. And again, this course is all currently, uh, on sale. You get a hundred bucks off with that coupon code.

This link will be down in the video description. Uh, but if you don’t feel like waiting from clicking through on things and you feel like there might be some malicious, uh, UTM codes or something in here, you can just go to training.erikdarling.com. So, grab that Get AI Ready course and just use the coupon code AIREADY to get that hundred bucks off.

Anyway, thank you for watching. I’ll see you over in tomorrow’s video where we will talk about some other vector-y things. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: A Little About Vectors and Distances



Video Summary

In this video, I dive into the exciting new vector data type and vector distance function introduced in Microsoft SQL Server 2025 and Azure SQL Database. Starting off with a brief introduction to the vector data type, I explain its syntax and default behavior, including how it handles dimensions and the importance of specifying the correct number of dimensions when working with vectors. I also highlight that while this is just a taste of what’s covered in my comprehensive course, purchasing it remains essential for those looking to fully leverage these new features. The video then delves into practical examples using SQL Server Management Studio 2025 and demonstrates how vector distance functions like cosine, Euclidean, and dot work, emphasizing their importance in similarity searches and text analysis tasks.

Full Transcript

Erik Darling here. Darling Data. And we’re going to spend some time, since I just released a course about all the new AI hubbub in Microsoft SQL Server 2025, Azure SQL Database and Managed Instance, I figured we should do some videos to kind of give you a little idea of what the content looks like in order to hopefully spur you into purchasing it. There’s a coupon code up yonder that will get you $100 off the course. I’ll put this link in the video description for as long as I remember. But the coupon code is just AIREADY. So if you go over to my training site at training.erikdarling.com, you can put in that coupon code without having this whole full link in front of you in order to get that course content. Anyway, let’s see. Let’s start off with just a little tiny itty bitty introduction to the vector data type and a little bit about the vector distance function. There’s of course a lot more material in the full course, but if I did the whole thing for free here, there’d be no point in purchasing it, would there? Unless you just wanted to say thanks, which very few of you want to do. Anyway, the new vector data type, by default, well, I mean, it looks like this. It’s like Victor with an E. You say, I want this vector data type. You tell SQL Server how many dimensions will be in that data type. And of course, dimensions are these things over here. Each dimension is a number. They’re all separated by commas, right? So 1.0, 2.0, 3.0, and they have to be within these little square brackets. By default, they will all be float 32. We’ll talk more about what that means. There is of course a float 16 that is in preview. Float 32 is the general availability data type that is fully supported currently.

And so that’s generally what we get. You generally won’t have three dimensions in a vector. It’s just for a little bit of simplicity here. But this is a little bit of, oh, this is kind of what it looks like. Let me use my fabulous new SQL Server Management Studio 2025 content zoom feature. And this is what you get back when you look at a vector in SSMS. It’s actually kind of a funny XML clickable column. And notice that these numbers look a little bit different from the numbers that we put in. These get converted to big crazy floats. But this is what the vector data type looks like.

Sort of. Well, I mean, it’s not what it looks like in storage. It’s what it looks like presented to you as a person. Sort of like on Star Trek when aliens are like so big and weird and scary that they have to wear devices that present themselves as like human looking to the ship members and the crew and stuff. It’s sort of like that. If you saw what vectors look like, if you saw their internal representation, your mind would collapse on itself.

All right. It’s like hearing the voice of God or something. One thing to be aware of, and this is something that’s going to scare some people when they maybe start working with these, is that vector data types are rather inflexible in many ways. You have to know exactly how many dimensions. And again, a dimension is each number inside here. You are going to have for your vector data type and you must use precisely that many.

For example, if we do this and we declare a vector with one dimension, we cannot assign three dimensions to it and just have it silently truncate. We’ll get an error. Nor can we sort of under-pack a dimension. You know, a lot of people will be like, I’m going to just use nvarchar(max) for this state field, and stuff values like MA, NY, RI, CA, TX in it.

You can’t do that with vectors, right? So if we say I want a vector(1024) and we try to just put three dimensions in it, SQL Server’s like, no can do. 1024 and 3 do not match. Sorry, pal. The number of dimensions that you use for your vector data types is going to be determined by whatever embedding model you choose to use to generate your embeddings.
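A hedged sketch of that failure mode (SQL Server 2025 syntax; the error wording is paraphrased, not quoted from the engine):

```sql
DECLARE @v vector(1024);

-- This assignment fails: the literal has 3 dimensions, the variable demands 1024.
-- There is no silent truncation or padding; you get a conversion error instead.
SET @v = '[1.0, 2.0, 3.0]';
```

The strictness is the point: every vector in a column has to line up dimension-for-dimension with every other vector you compare it against.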

That’s a big, scary, crazy sentence. But don’t worry. We’ll talk more about that in some other videos this month. But also, I talk way more about that in the course material.

So if you are just so eager to know more about that, you can get way ahead and buy the course now. But different models, right? Like every time Anthropic or ChatGPT or whatever Google is doing says, this is our newest, most powerful model yet, that’s the kind of thing they’re talking about.

Right. It’s the model that you’re using that generates the numbers that assign meaning to the text data that you have. Those models generate a certain number of dimensions. There are some newer ones that are dynamic, but that’s not going to help us here.

The model that you choose is going to be dependent on a lot of stuff. You might not even be the one choosing it.

You just have to know how many dimensions that model creates when it generates embeddings. But once you pick a model, or once a model is chosen for you, all your vectors need to match that dimension count. This is called length, and you cannot compare vectors with different lengths.

So probably the most common way that you’re going to be using vector data types in SQL Server, at least today, is the vector distance function, because the vector search function, which we’ll get to later, is still in preview. It is not a generally available feature. So vector distance is what most people in production are going to be using.

That’s this thing right here. This is how SQL Server will tell you how similar things are, usually using the cosine calculation. But if we have, let’s say up here, a vector(3) and a vector(4), we cannot compare those. Right. SQL Server just says vector dimensions 3 and 4 do not match.

Right. So we can’t compare a vector with three dimensions to a vector with four dimensions, or any other differing number of dimensions. Doesn’t matter. But the vector distance function has three metrics available.

We have cosine, Euclidean and dot. Cosine is what you’re going to be using most of the time because that’s a similarity search. Euclidean is almost like GPS coordinates.

It’s a totally different thing. And dot is just weird. I don’t even like talking about dot because it’s just that bizarre. But we have these two vectors up here, and we’re again keeping things simple.

With just three dimensions, we have 1.0, 2.0, 3.0 and 1.0, 2.0, 3.5. What we get when we run these things are just different ways to measure similarity. Right. So.

It’s different, but for similarity search. Right. For like searching text for like, you know, things that are considered close to each other. Cosine is going to be what you use mostly.

So let’s run this and compare. Let’s see how similar these two vectors are and the cosine distance for this one. Like generally lower number is better.

Right. So the lower a number is, the better off it is for the cosine distance. This is measuring the similarity of all of the dimensions that we have up there. The Euclidean distance is 0.5.

And the only reason this is 0.5 is because this is 0.5 higher. So when I said this is like GPS coordinates or something, it really is. And this isn’t going to be really good for like text similarity search stuff.

And this is not what you want to use here. That last dimension is just 0.5 bigger than the matching dimension in the other vector, so the difference is 0.5.

The dot product distance. This is negative 15.5. But the calculation for this is just 1 times 1, plus 2 times 2, plus 3 times 3.5. Right. So we get 15 and a half back, reported as a negative.
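The comparison being narrated can be sketched like this (SQL Server 2025 syntax; the commented values are the ones quoted in the video, so treat them as approximate):

```sql
DECLARE @v1 vector(3) = '[1.0, 2.0, 3.0]';
DECLARE @v2 vector(3) = '[1.0, 2.0, 3.5]';

SELECT
    VECTOR_DISTANCE('cosine',    @v1, @v2) AS cosine_distance,    -- close to 0: very similar
    VECTOR_DISTANCE('euclidean', @v1, @v2) AS euclidean_distance, -- 0.5: only the last dimension differs
    VECTOR_DISTANCE('dot',       @v1, @v2) AS dot_distance;       -- -15.5: the negated dot product
```

For all three metrics, lower means more similar, which is why the dot product comes back negated.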

So that’s really all that is. Anyway, like I said, the lower the numbers are for cosine, the closer to 0, the more similar the vectors are, or the more similar the dimensions are.

So if we have these two vectors here that match entirely, right? Both are now 1.0, 2.0, 3.5. And look what we get back here.

Now cosine is exactly 0. Euclidean is exactly 0. And the dot product distance is negative 17.25. But again, this is just 1 times 1, plus 2 times 2, plus 3.5 times 3.5.

So this number is just getting lower as the dot product gets bigger. I mean, we made the numbers bigger.

So the negative number got bigger, which is a weird thing to think about. But like I said, the lower numbers are the more similar they are. I use cosine throughout like the entire course because I use a Stack Overflow database.

And the sort of goal of using that and which we’ll talk about more in a later video is that like it’s probably more most closely matches what a lot of you will end up doing with it. Where, you know, we’re trying to find we’re using question titles to find similar questions. We’re using, you know, question titles and comparing them to answer bodies to find like maybe like, you know, good answers for a similar question title, stuff like that.

So like the cosine is really what you want to be using. But look what happens here when we change the lower string to use all negative numbers. Right. So negative one, negative two, negative three dot five.

Now, all of a sudden we get back very different numbers. The cosine distance here is almost two, which is the far end of the scale for a cosine distance.

The numbers that you’ll get back are between zero and two. So like there’s not a lot of forgiveness in there. So like the closer you are to zero for cosine distance, the better.

Euclidean distance is now 7.8. And really, it’s just because we’re measuring like this thing is going positive and the other thing is going negative. So they’re just like, like again, GPS coordinates.

These are two different points on a map like these just got further apart on like whatever grid map weird tesseract these vectors exist in. And the dot product distance is now a positive 15.5 telling us that these are very, very different dots. Like I said, we don’t talk about dot product anyway.

Thank you for watching. I hope you enjoyed yourselves and I’ll see you in tomorrow’s video where we will talk about some, some more fun vectory stuff in SQL Server. Anyway, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

SQL Server Performance Office Hours Episode 47



I have a query that is timing out after 30 seconds and is mostly waiting on pagelatch_sh. It’s a clustered index seek for a small number of rows on a narrow table. The query doesn’t show up as part of a blocking chain. Any suggestions for next step in finding the problem? I don’t think it is last page insert contention because there are few inserts to this table
Did you ever use Performance Monitor for troubleshooting and, if so, what metrics did you use?
I never want to use DBCC SHRINKFILE on a data file again. Do I lose anything by totally abandoning it?
Could you please send me the link of the SQL books
On a client call recently, you were pretty specific about avoiding usage of the OBJECT_ID() function. I noticed you used it in a demo recently, fine enough (it’s just a demo). Any other such functions we should avoid using in production?

To ask your questions, head over here.

Video Summary

In this video, I dive into answering five fascinating questions from my YouTube community, covering topics like performance troubleshooting and SQL Server best practices. I share insights on how to approach a query timing out due to PAGELATCH_SH waits, discuss the limitations of using Performance Monitor for SQL Server diagnostics, and explain why avoiding certain functions like OBJECT_ID in production code can prevent concurrency issues. Additionally, I provide an update on my consulting and training offerings for 2026, including new pricing options and subscription models that cater to a broader range of needs. Whether you’re looking to improve your SQL skills or just want to support the channel, there’s something here for everyone. Stay tuned as we kick off January with more exciting content focused on AI in SQL Server!

Full Transcript

Erik Darling here with Darling Data. Back from my big five days off of pumping out YouTube content. And somewhere between rusted and rested. Somewhere in there. But ready to go. Ready to get back into the swing of things. And of course, by the time you see this, it’ll be Monday, which means it is time for office hours, which means it is time for me to answer five terribly interesting questions from you, my beloved watchers out there. So we’re gonna get into that. Before we do, down in the video description, there are all sorts of helpful links. If it is 2026 and you have a brand new budget and you are just looking for ways to spend it, I have overhauled and updated my consulting and training pages. The consulting page has some new offerings on it. Slightly lower priced entry points for people who maybe need some help, but don’t need giant buckets of hours. And I’ve also redone the training stuff. The Everything Bundle is now everything, everything. It includes the normal everything performance bundle, Learn T-SQL, and performance engineering.

And there are new ways to purchase that as well. There are yearly and monthly subscriptions if you are more keen on that sort of arrangement for purchasing training. So all that is overhauled and brand new for 2026. If you would like to, if the content that you see here just moves you in such a way that you would like to give me four bucks a month, buy me half a New York coffee, you can still become a channel member. That has not been, that has not been overhauled yet. Who knows what 2026 will bring. And of course, that is also where you go to ask me office hours questions.

So if you have a burning SQL Server question that you want to see answered live here, well, semi live here, I’m alive, I promise, here on YouTube, you can do that. And of course, if you think that I would be a valuable presence in someone else’s life, please do like, subscribe, and tell a friend so that I continue to reach the people I am intended to reach. I will be leaving the house, well, a couple months from now. DataTune Nashville is March 6th and 7th, and tickets are flying out the door for that, so hurry up and get your seats while you can. I have not been informed of the winged status of the Data Saturday Chicago pre-con, but I assume that it is also quite winged. That will be March 13th and 14th, and newly added to this list.

And why I had to remove the little flying database image over here, because it was getting in the way of things, is SQLDay Poland in Wrocław, which I hope I am saying sort of right. And that will be May 11th to 13th in Poland. So you’ve got that going for you.

And welcome to, of course, the first video of 2026. We are all hopefully happy and not hungover. I count six databases and 12 arms and 12 legs.

So we are in good shape for 2026. Everyone’s got all their appendages, eyeballs, other other necessities, apparently. Plenty of well, they’ve all got drinks, too.

Looks like this is a good party. So is there, it looks like there’s a little pickle person in this one or something. I’m not sure what that is.

I had not noticed that before. Hmm. I wonder, I wonder what’s going on there. Anyway. Can’t, can’t win them all. Let’s answer some questions here.

Do, do, do, do, do. Ba, ba, ba, ba. Let’s see. The first one. What do we have here? This is a big one, huh? I have a query that is timing out after 30 seconds and is mostly waiting on PAGELATCH_SH.

Okay. Let me stop you right there. What does mostly mean? Does mostly mean 29 seconds?

Does mostly mean 29 milliseconds? What is mostly? It’s a clustered index seek for a small number of rows on a narrow table.

The query doesn’t show up as part of a blocking chain. Well, I don’t know. I don’t know how you’re judging that, but okay. Any suggestions for next step in finding the problem?

I don’t think it’s… Why would it be… Is it a select query or an insert query? Why can’t… Oh, I beg you, please give me the information that I need.

Please. I beg. So, look, the next step in this is always one of two things. You either get the actual execution plan, or you monitor the server for this query, and you see what is happening in near or around when this query is executing.

Those are the only two next steps. Or rather, those are the two most logical next steps. The first is executing the query yourself and seeing where things are actually getting stuck in the actual execution plan.

Which could totally not do anything for you. You might not see anything interesting there. You could hit F5, it could finish instantly, and you’re done. Then you need to monitor things. You could use Extended Events or Profiler or something else to watch live action on that table and figure out what is happening when that query is timing out.

I find it highly unlikely that PAGELATCH_SH is your problem there. So that’s what you can do. I cannot tell you why with the set of information you’ve given me.
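One way to sketch the “monitor the server” option is to watch what the request is actually waiting on while the query runs. This is a hypothetical snippet, not from the video; the session id is a placeholder for whichever session is running the timing-out query.

```sql
/* Sketch: peek at the slow query's current and recent waits while it runs.
   Session id 55 is a placeholder, not a real value from the question. */
SELECT
    r.session_id,
    r.status,
    r.command,
    r.wait_type,           /* what it's waiting on right now */
    r.wait_time,           /* milliseconds spent in the current wait */
    r.last_wait_type,      /* what it waited on most recently */
    r.blocking_session_id, /* 0 means no blocker */
    r.total_elapsed_time
FROM sys.dm_exec_requests AS r
WHERE r.session_id = 55;
```

If the wait time and wait type here don’t add up to most of those 30 seconds, the PAGELATCH_SH theory is probably wrong, which is sort of the point of looking.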

Next up. Did you ever use performance monitor for troubleshooting? And if so, what metrics did you use?

So no, not really. I tried a lot because, you know, like when I first got into like SQL performance stuff, there were a lot of articles and, you know, things about using Perfmon and exporting it to Excel and blah, blah, blah.

I just… I could never… I could never like wrap my… I could never get it to like give me the answers that I needed. So I couldn’t really tell you which metrics to use.

I can tell you that there is a nice fellow in the community who hasn’t blogged in quite a while, but used to have a really neat sort of… Not a traditional like query and index tuning thing, but like performance monitoring and like, you know, a lot of like hardware stuff.

The blog was SQL Sasquatch. And it’s a guy named Lonnie N… A bunch of underscores.

And he talks about Perfmon quite a bit in his blog posts and would list out the counters that he would use for various investigations. Yeah, like I’ve used the sys.dm_os_performance_counters view in SQL Server to look at some stuff, but it’s…

I’ve never like… I’ve never found a ton of answers there. Like every once in a while there will be something interesting in there, but mostly it’s just like… Yeah, look, you’re doing…

Actively doing things. Good for you. Like… So no, it’s never really been my thing. And then, you know, like with Perfmon too, you know, a lot of SQL Server monitoring tools sort of came around and gave you like a better picture of what was happening on SQL Server than you would get from those insane charts and graphs that Performance Monitor would spit out with all the different sort of like value resolutions and lines and stuff.

Like one thing that always tripped me up with that was like different lines had like different… Like sort of like…

I forget what the exact word is, but like resolutions to them. So like one of them would be like 1.000 something and it would always be up at the top and the other ones would be like 0.0001 something and always be like down at the bottom.

And you’re like, how do I make… How do I make sense of what’s happening here? There’s all sorts of like complicated stuff that just always made it a little bit too hard to use for me. Okay.
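For reference, the SQL Server-side view of those Perfmon counters is sys.dm_os_performance_counters. A quick sketch of pulling a few counters out of it; the counter names below are real, but which ones actually matter depends entirely on what you’re chasing, which is sort of the whole problem with Perfmon.

```sql
/* Sketch: a few commonly cited counters from the Perfmon DMV. */
SELECT
    pc.object_name,
    pc.counter_name,
    pc.instance_name,
    pc.cntr_value
FROM sys.dm_os_performance_counters AS pc
WHERE pc.counter_name IN
(
    N'Batch Requests/sec',
    N'Page life expectancy',
    N'Memory Grants Pending'
);
```

Note that some counters here are raw cumulative values, so a single snapshot often tells you nothing; you generally need two samples and a delta, which is another reason monitoring tools won that fight.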

Okay. I never want to use DBCC SHRINKFILE on a data file again. Okay. Do I lose anything by totally abandoning it? On a client call recently, you were pretty specific about avoiding usage of the OBJECT_ID() function.

I noticed you used it in a demo recently. Fine enough. It’s just a demo. Any other such functions we should avoid using in production? Well, when I tell people to not use those functions, the reason why is because they do…

A lot of them… I don’t have the full list of ones, but a lot of them do not obey the same locking semantics that you might find by just accessing various system views to get that…

To resolve certain object names and stuff instead. Aaron Bertrand has a post in his Breaking Bad Habits series over on sqlblog.org, where he talks about queries that use these getting blocked where queries hitting sys.objects, sys.schemas, sys.tables, stuff like that, don’t get blocked.

So the main reason why I tell people not to use that in sort of like production code is because if the… Like you might… Your queries that run those might get blocked in places that queries that just use the underlying system tables and views wouldn’t get blocked.

So they can contribute to sort of like concurrency issues, blocking chains. I don’t know about deadlocks necessarily, but they could certainly get blocked in places where you would not find blocking from hitting the underlying sys views that would allow you to resolve those things in different ways.
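A sketch of the two patterns being compared here. The table name dbo.some_table is hypothetical, not something from the call or a demo.

```sql
/* Pattern one: the metadata function. Handy, but it can get blocked
   behind uncommitted DDL on the object it's trying to resolve. */
IF OBJECT_ID(N'dbo.some_table', N'U') IS NOT NULL
BEGIN
    PRINT 'table exists';
END;

/* Pattern two: hit the catalog views directly instead, which generally
   doesn't pick up that blocking. */
IF EXISTS
(
    SELECT
        1/0
    FROM sys.objects AS o
    JOIN sys.schemas AS s
      ON s.schema_id = o.schema_id
    WHERE s.name = N'dbo'
    AND   o.name = N'some_table'
    AND   o.type = N'U'
)
BEGIN
    PRINT 'table exists';
END;
```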

You know, of course, the object ID functions are… You know, they’re quite simple for… You know, the type of procedures that I typically write for troubleshooting stuff.

But even I’ve tried to get away from using those in a lot of places there. And I just, you know, try to use the underlying system views instead so that I don’t get blocked up trying to resolve various object names to see where they are.

I’m not perfect at it, of course. You know, there’s lots of old code that I would have to go back and revisit and rewrite queries to sort of change the way that they are figuring out if various objects exist or not.

But, you know, one thing at a time. There’s only so many hours in the day and usually that stuff gets fixed as I hit a problem with something.

And if I’m not hitting a problem with something like pretty regularly, then I’m probably not going to jump on changing the code because, you know, it’s a lot of effort for not much.

Anyway, could you please send me the link of the SQL books? So I don’t know exactly what you mean here.

There is, of course, erikdarling.com/books, which has a list of SQL books that I recommend. I think you might mean that one.

And then I guess that has to be it. So sure, I’ll put that link down in the video description with the questions and stuff. It’ll be in there.

So anyway, that was an interesting start to office hours of 2026. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video. We’re going to spend January… I’m going to be doing little snippets of my new course, Get AI Ready with Erik, all about, you know, handling vector stuff with SQL Server 2025 and Azure and all where Microsoft has sprinkled AI into the database.

And by sprinkling in AI, I mostly mean there’s a vector data type and there’s a whole lot of stuff that you have to think about when you start using it. And hopefully, I’ve thought about most of it for you and I can just tell you what to do.

So that’s going to be what we’re going to focus on for the majority of this month. But I also have some bit obscene podcasts lined up with Joe and Sean and maybe some special guest appearances.

So it’ll be a busy January because we got to stay busy. And I will see you over in tomorrow’s video where we will talk about some vector stuff.

All right. Thank you for watching. Goodbye.


Column Store, Delta Store, and More



Video Summary

In this video, I delve into some fascinating aspects of columnstore indexes and the Delta store in SQL Server, showcasing how data is inserted and managed within these structures. I walk through an interesting demo where I create a table, insert rows multiple times to observe changes in the Delta store and columnstore index, and discuss the tuple mover—a background process that compresses row groups but can sometimes disrupt my carefully planned demonstrations. Despite this little hiccup, the video offers valuable insights into how SQL Server handles data insertion and compression, providing a deeper understanding of these crucial components for database performance tuning.

Full Transcript

Erik Darling here with Darling Data, celebrating another beautiful Friday in the life of a SQL Server consultant. Lord knows why. In this video, we’re going to talk about some stuff with the columnstore, the Delta store, and a little bit more. This video doesn’t have one sort of overarching point to it. It’s sort of a few different spectacles, all in one lovely demo. So, we’re going to initiate ourselves, we’re going to avail ourselves of some interesting stuff. But before we do, I need to harass you about some other things, some financial priorities. If you need help with SQL Server, I am, like I mentioned, a SQL Server consultant. I specialize in SQL Server performance tuning, and you can hire me to do that for you. I will make your SQL Server faster in exchange for money. There’s a link down in the video description that offers you one of many ways to hire me to do that. There are also links down in the video description where you can buy my training, which is always reasonably priced. You can also choose to support this channel with money, as low as $4 a month, you know, pre-tax in my pocket, in order to keep me doing this sort of thing.

You can also do some other stuff, like ask me office hours questions. I do enjoy an office hours question, or five, every Monday. And of course, if you enjoy this content, you should probably like, subscribe, and tell the giant, massive friends who hang eagerly on your every word about this channel. So, I will be out and about in the world, oot and aboot for the Canadians. DataTune Nashville, March 6th and 7th. And Data Saturday Chicago, March 13th and 14th. I will have advanced T-SQL pre-cons at both of those.

So, please do, please do show up. Because if you don’t, I will cry and drink and it will just be a real meltdown nightmare disaster on stage. And you don’t, you know, I mean, maybe you do want that. Depending on, depending on how much you’ll, you may enjoy that actually. I don’t know. But anyway, let’s go, let’s go talk about this columnstore stuff. So, what I’m going to do is, I’m going to do, well, of course, I’m going to drop a table if it exists. And I’m going to create a table. And then I’m going to load some initial data into a columnstore table.

And then, just for fun, what I’m going to do is reload those same rows into the columnstore a few more times. So, we’ll say GO 3 here. GO is fun because it’ll start an execution loop when you put a number after it. And so, every time we insert rows into this table, we’re inserting a larger number of rows. Right? So, it’s like another million and then two million and then four million go into this table.
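The shape of that demo can be sketched like so, with stand-in table and column names rather than the exact demo code:

```sql
/* Stand-in demo table with a clustered columnstore index.
   dbo.cs_demo is a made-up name, not the real demo object. */
DROP TABLE IF EXISTS dbo.cs_demo;

CREATE TABLE dbo.cs_demo
(
    id bigint NOT NULL,
    stuffing varchar(100) NOT NULL,
    INDEX ccs CLUSTERED COLUMNSTORE
);

/* ...load the initial million rows here, however you like... */

/* Reload the table from itself. GO 3 repeats the batch three times,
   doubling the row count on every pass. */
INSERT
    dbo.cs_demo
    (id, stuffing)
SELECT
    cd.id,
    cd.stuffing
FROM dbo.cs_demo AS cd;
GO 3
```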

Now, something kind of interesting happens on this insert. And if we did this insert differently, and I’ll describe what differently means in a moment. But when we do this insert, SQL Server chooses to take data from a columnstore, spool it into a row mode eager spool, and then insert it into the columnstore. Right? So, this is, again, row mode. So, SQL Server unbatches all those rows to insert them into the table.

If we put that data into a temp table or even a temp table with a clustered columnstore index on it, SQL Server would not unbatch it. But because SQL Server row modes all that stuff, we end up with a bunch of sort of Delta store data. Right? We can see we have this clustered index.

So, the reason why this is funny is because if you go to the documentation for the view over here that we’re looking at, sys.dm_db_index_physical_stats, it’ll say… where is it? Down here at the bottom, maybe. Oh, no. It was up higher. This thing is lying to me.

Limitations: does not return data for clustered columnstore indexes. Right? And, well, I mean, that’s true. There’s no data about the columnstore index in here, but there is data about the Delta store, which is a clustered index that sort of brain-leeches onto the clustered columnstore and shows us uncompressed stuff.
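The check described here looks something like this; despite the documentation’s limitations note, rows come back for the Delta store b-tree riding along with the clustered columnstore. The table name is a stand-in.

```sql
/* Peek at the Delta store attached to the clustered columnstore.
   dbo.cs_demo is a stand-in name. */
SELECT
    ips.index_type_desc,
    ips.alloc_unit_type_desc,
    ips.page_count,
    ips.record_count
FROM sys.dm_db_index_physical_stats
(
    DB_ID(),
    OBJECT_ID(N'dbo.cs_demo'),
    NULL,      /* all indexes */
    NULL,      /* all partitions */
    'DETAILED' /* record_count needs SAMPLED or DETAILED mode */
) AS ips;
```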

Now, when we ran that the first time, there were fewer rows in there. SQL Server does have a background process called the tuple mover, which will, in the background, go find closed row groups and attempt to compress them down.

So, when we look at the table now, we see we have some of these compressed. We have one of these that’s open. We have three of these that are still closed. Now, that’s all a side effect of, again, the row mode eager spool between the table and the insert.

So, now, if we run an ALTER INDEX and we say COMPRESS_ALL_ROW_GROUPS, well, this will run for a little bit. And then, if we look back at this, now we’ll have these two tombstones in here. So, SQL Server has compressed a bunch of other stuff, but left these two tombstones around.
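The alter in question, plus the view that spells out row group states by name, sketched with the same stand-in object names as above:

```sql
/* Force compression of the remaining open and closed row groups. */
ALTER INDEX ccs ON dbo.cs_demo
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

/* Row group states by name: OPEN, CLOSED, COMPRESSED, TOMBSTONE. */
SELECT
    rg.row_group_id,
    rg.state_desc,
    rg.total_rows,
    rg.deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.object_id = OBJECT_ID(N'dbo.cs_demo') /* yes, OBJECT_ID(); it's just a demo */
ORDER BY rg.row_group_id;
```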

Right? So, if we go and look at, like, what’s in here now, we go look at the physical stats for this. Now, we just have this one thing remaining. Maybe the tuple mover kicked in and did something. I don’t know. Maybe the tuple mover is ruining my demo.

Yeah, it is. See? Now, those two tombstone row groups are gone. Normally, I would have to run reorganize again a second time to force the tombstone sort of cleanup process to run. But this time, the tuple mover beat me to it.

I just timed this demo terribly because every other time I had lots of time to talk about this stuff before the tuple mover did anything. So, now this reorganize won’t do anything. But I do think that there is one interesting element to this in that if we look back at the physical stats for the columnstore thing, we still get one row back from here.

And there are a couple columns in that view that do reference columnstore data. Apparently, this is non-clustered columnstore stuff because, like, this just says not valid for the clustered columnstore one. So, I guess non-clustered columnstore, it does show more.

But what this is showing us now is an empty delta store. All right? Because we have no pages and we have no records in there. All right?

So, this is the empty Delta store. SQL Server doesn’t remove the Delta store entirely. It just leaves this empty thing hanging about in case we need to put more data in there. If I ran a bunch of inserts against this table again, then I would probably end up with rows in the Delta store, as long as SQL Server continued to choose that row mode eager table spool in between the insert and the select. So, as long as we still have this thing unbatching rows, then we end up with a Delta store.

Even though we are inserting a number of rows that qualify for compression. So, that’s fun. Anyway, I found this all terribly interesting and also terribly annoying.

And despite the tuple mover kind of ruining one of the highlights here on needing to do the double reorg, I’m still okay with it because, you know, at least we got to see the tuple mover work in action, right? We got to see that it actually does something.

It doesn’t just sit there idly waiting for us to do things. It wakes up about every five minutes or so and says, Oh, look at all these row groups. They’re closed.

I got to compress them. And it does that. You can turn that off, but we’re not going to talk about that. Anyway, it’s Friday. I’m tired. I don’t want to do this anymore.

So, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you back Monday with our usual office hours thing that we do on Monday. Because that’s what we do on Monday.

All right. Have a great weekend.


Merry Christmas from Santa and Definitely Not Erik




Would you believe Erik Santa forgot to turn off this coupon?
