SQL Server Performance Office Hours Episode 52


To ask your questions, head over here.

Summary

In this video, I delve into a series of questions submitted by viewers during our office hours session, offering detailed and thoughtful answers to topics ranging from the nuances of parameter sniffing and memory grants in stored procedures to the complexities of index maintenance, query tuning for cloud environments, and the trade-offs involved with using columnstore indexes. I also discuss the impact of reducing CXPACKET waits on query performance and stability, as well as why batch mode can sometimes slow down OLTP queries despite making analytical queries faster. Additionally, I explore the challenges of achieving accurate statistics updates in a full scan scenario and touch upon the architectural considerations when implementing columnstore indexes for hot OLTP tables. If you have any questions or topics you’d like to see covered in future office hours sessions, feel free to ask them through the link provided in the video description.

Full Transcript

Erik Darling here with Darling Data, and you may have guessed, based on various temporal elements, that it is Monday, and Monday means it is time for office hours, where I answer five, as long as I count correctly, user submitted questions to my highly advanced Google Sheets document, and do my best to give good answers, good thoughtful answers, to those. You can find the link to ask those questions down in the video description. While you’re on your way to that link, perhaps, to ask a question, you will see all sorts of other mutually beneficial links down there, in which you can hire me for consulting, which is a great idea. You can buy my training, which is, likewise, a fantastic idea. You can become a supporting member of the channel. For as few as $4 a month, you can say, good job, Erik. Thank you for spending hundreds of hours a month, recording YouTube videos, and coming up with content for us. And then, of course, you can ask that office hours question for free. Amazing. Is there no limit to my generosity? Yes. No. Not sure. And, of course, if you enjoy this content, if you think that someone else in your life may also enjoy this content, please do like, subscribe, and tell a friend, because that’s the nice thing to do. And that is free for all of us. So, I will be out in the world. You know, it is February, currently, at least, I mean, it’s not February when I’m recording this, but it’s going to be February by the time you see this. So, I don’t know, you can maybe guess when I’m recording this based on that, but I’m not leaving the house for a bit. It just snowed like 52 feet here in New York. It looks disgusting already. It is various shades of not white.

out there in the snow. But as soon as that snow melts, as soon as that snow abates a bit, I will be back out joining the world. Baseball will be in the swing of things. It’ll be a marvelous time for all of us. DataTune, Nashville, March 6th and 7th. Data Saturday, Chicago, March 13th and 14th. SQL Day, Poland, May 11th and 13th. Look, I missed the dates on Poland. I gotta fix that. And of course, Data Saturday, Croatia, June 12th and 13th. At all of these events, I will be teaching my advanced T-SQL pre-con. Because I only have roughly 6.5 hours to teach my advanced T-SQL pre-con, and there is much more to say about advanced T-SQL, all attendees will get free access to my Learn T-SQL with Erik course, which encompasses both beginner and advanced material and is far, far longer than the 6.5 hours these training days allow.

So, if you want to get a tremendous amount of bang for your buck or whatever currency these other things use, you should come to that and have a great time. I don’t know. Maybe we can both learn something from each other. But with that out of the way, we have flipped the calendar unofficially to February here. Still winter. Sucks.

Asked ChatGPT to come up with a good, like, shining-y, you know, like, I’ve been stuck in the house all winter vibe. And, you know, aside from this one up here, which has, like, some extra fingers, and that one over there that has a backwards axe, I think, you know, like, the face in the doorway over here is particularly good. I like that one. So, I don’t know. I have some time to think about March. I don’t know what March is going to be yet.

We’ll find out when we get there. But, anyway, let’s answer some questions, right? Let’s do that thing that I promised you I would do. We lowered MAXDOP. We! Like, it took more than one of you? How many people does it take to change MAXDOP? I guess there’s a joke in there. How many DBAs does it take to change MAXDOP? Anyway, to reduce CXPACKET waits.

All right. That’s a reason to do it, I guess. And queries now run slower, but look more stable. Is that a net win? Well, of course, your queries run slower. You reduced MAXDOP.
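For reference, a minimal sketch of the two places that knob usually gets turned; the values and the table name here are made up:

```sql
/* Server-wide: affects every query that doesn't override it. */
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max degree of parallelism', 4;
RECONFIGURE;

/* Per-query: lets you experiment with one query at a time.
   dbo.SomeBigTable is a hypothetical table. */
SELECT COUNT_BIG(*) AS row_count
FROM dbo.SomeBigTable
OPTION (MAXDOP 8);
```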

Let’s just theoretically say you reduce MAXDOP from 8 to 4. You now have four fewer cores with which to process your queries. There is some chance that they would get slower by, like, half, right?

So, if your query runs in one second at MAXDOP 8, and you reduce that query to MAXDOP 4, it might run for, like, two seconds instead of one. So, what I’m struggling with a little bit here is when you say “look more stable.” I don’t know what “look more stable” means.

Is this like a tortoise versus the hare situation? Is this slow and steady wins the race? I don’t know what more stable means. You’re going to have to give me some more information about the stability you speak of, because I have no insight into what more stable means to you.

Is it a net win? I don’t know. Are slower queries ever a net win? I suppose if you are hitting THREADPOOL waits or, you know, very, very high CPU percentages constantly, then perhaps it is a net win.

Perhaps your server is just a bit less stressed out, worn out, running out of resources. I don’t know, but golly and gosh, I don’t know if I can call that a net win, because I don’t know what you’re measuring stability with.

I don’t know what criteria you find these queries to be more stable by. So, yada, yada, yada. Batch mode made our analytical queries much faster, but OLTP queries slowed down.

Why does batch mode sometimes hurt mixed workloads? Well, I’ll give you the prime thing that I see with batch mode on rowstore, which I assume is what you’re talking about here, because, you know, OLTP, why else would that be a thing?

The primary thing that I see is that the optimizer decides some facet of one of your OLTP-ish queries would benefit from batch mode. It could be an aggregate, but more commonly, it is a join.

And most commonly, what happens with joins is that SQL Server starts choosing adaptive joins, which aren’t a bad thing on the face of them. I’d rather appreciate SQL Server no longer just YOLOing with a hash join or a loop join, regardless of how many rows actually start showing up.

But what it can do is take what is typically a nice, efficient, usually single-threaded nested loops join plan and throw a little bit of a monkey in that wrench by starting the plan off with a batch mode hash join and switching over to a row mode nested loops join if enough rows are not emitted from the outer part of the join.

So, and that is primarily what I see happening, is that, you know, what used to be just a very quick, simple navigational query all of a sudden starts getting weird. And, you know, I’m not saying that you should be scared of batch mode on rowstore, but there may be certain queries that you might have to intervene with and say, please don’t do that.

Like, the easiest way is just to say OPTION (LOOP JOIN) for those. But typically, that is what I see. All right.
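As a sketch, with made-up table names, those per-query interventions look something like this. DISALLOW_BATCH_MODE is a documented USE HINT in SQL Server 2019 and up, if forcing a loop join feels too heavy-handed:

```sql
/* Force the old row mode nested loops plan for one query. */
SELECT o.OrderId, o.OrderDate
FROM dbo.Orders AS o          /* hypothetical tables */
JOIN dbo.OrderItems AS oi
  ON oi.OrderId = o.OrderId
WHERE o.OrderId = 12345
OPTION (LOOP JOIN);

/* Or leave join choice alone and just opt this query out of batch mode. */
SELECT o.OrderId, o.OrderDate
FROM dbo.Orders AS o
JOIN dbo.OrderItems AS oi
  ON oi.OrderId = o.OrderId
WHERE o.OrderId = 12345
OPTION (USE HINT('DISALLOW_BATCH_MODE'));
```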

We tuned queries heavily. I mean, I wasn’t there, so I don’t believe you. But didn’t save much money in Azure. Well, no kidding.

An overpriced resort, that is. Why doesn’t query tuning translate directly into cloud savings? Well, let’s think about that for a moment here. You tuned queries heavily.

And let’s say in the course of your heavy query tuning, you reduce CPU from like 80% to like 30%. Typically, where one starts to realize cost savings in the cloud is when one reduces the size of their cloud hardware. So if you, let’s say, you know, tuned all your queries and made things better in that way, it would have to coincide with some reduction in cloud resources to see money get saved.

It kind of reminds me of the question where it’s like, I got rid of all this data in my database, but the database file is still huge. What happened? It’s like, well, that database doesn’t shrink itself, right?

That Azure instance doesn’t shrink itself. So, you know, what are we going to do here? But anyway, my rates are reasonable and I’m happy to help you save money in the cloud.

Aren’t I nice? We run full scan stats nightly, but estimates are still awful. Why does perfect sampling still produce terrible plans?

Well, it may be a full scan sample, but let’s consider a couple of things about statistics, shall we? First, no matter how big of a scan you use, whether it’s full or not, SQL Server still populates the same 200-step histogram with whatever values it finds during that scan, whether it be sampled or full.

It’s not like you produce bigger histograms just because you do a full scan of the stats. You might already have a 200-step histogram. The full scan stat might describe certain portions of that histogram more accurately, but there might still be a lot of information that is not well captured by it.
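If you want to see what those steps actually hold, here is one way, with hypothetical object names:

```sql
/* The classic view of a statistic's histogram. */
DBCC SHOW_STATISTICS ('dbo.Posts', 'PK_Posts_Id') WITH HISTOGRAM;

/* Or the DMF version: one row per histogram step, 200 steps maximum. */
SELECT s.name, h.step_number, h.range_high_key,
       h.equal_rows, h.range_rows, h.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID('dbo.Posts');
```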

If that’s the case, you might consider filtered indexes or statistics in order to give the optimizer more information about portions of your table data that are not well represented by the histogram.
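A rough sketch of both options, with invented names and an arbitrary date cutoff; the filter should match whatever slice of the data the main histogram describes poorly:

```sql
/* Filtered statistics: a whole separate histogram for just this slice. */
CREATE STATISTICS s_Posts_Recent
ON dbo.Posts (CreationDate)
WHERE CreationDate >= '20240101'
WITH FULLSCAN;

/* A filtered index brings its own statistics along, plus a smaller
   structure to seek into. */
CREATE INDEX ix_Posts_Recent
ON dbo.Posts (CreationDate, OwnerUserId)
WHERE CreationDate >= '20240101';
```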

The other thing to consider is that you might be writing your queries in stupid ways. You might be doing all sorts of things that may inhibit SQL Server from accurately guessing cardinality for things.

Local variables, table variables, non-SARGable predicates, those are some things that immediately sort of spike the ball on that. There’s all sorts of other things you might be doing. You might just have terribly complex queries, which would benefit from being broken down into smaller chunks.

So you cannot rely on statistics updates for everything. They are one part of the picture. There are many other parts of the picture for you to consider.

All right. I wonder if this is the same person. It sounds pretty similar. All right.

Let’s see what we got here. I don’t know why this square is so much bigger. Columnstore fixed our reporting queries but wrecked inserts. Really? Columnstore wrecked inserts. That’s interesting.

What architectural trade-offs should we expect when adding columnstore indexes to hot OLTP tables? So my initial surprise here is because I typically don’t see columnstore doing too terribly with inserts.

And the reason for that is that, you know, since you’re talking about OLTP, I would assume that you are talking about the OLTP inserts, which are usually a very, very low number of rows.

Generally one, but, you know, I would say maybe up to 100 or so would be reasonable for an OLTP workload before it starts getting into different families of queries, like analytical stuff.

So I’m a little surprised at that because those inserts typically won’t touch the columnstore itself. Inserts that are too small to qualify for instant compression into the columnstore just get inserted into the delta store, which is sort of a clustered B-tree rowstore brain leech on your columnstore.

So I’m a little surprised at that. If you had told me that updates or deletes were having a tougher time because of the columnstore, that I would believe, because updates with a columnstore are pretty much: you delete the old row and insert the new row, and the new row gets inserted into the delta store, but you still have to, like, decompress the old row and all the other stuff.

So, a lot of things get, you know, there’s a lot more going on behind the scenes with updates. Even deletes usually aren’t, like, that beat up by it. You know, deletes just sort of use the deleted bitmap in there and, like, mark the row as a tombstone row or a ghost record or whatever the one million different things Microsoft has called it.

So I’m not sure why your inserts were hurt so badly by that. As far as architectural trade-offs go, well, you know, mostly it’s the stuff that I just talked about.

You know, you get either your reporting-style queries or your kitchen-sink-style queries, where you don’t know what the WHERE clause and the SELECT list and the ORDER BY are going to be.

Non-clustered columnstore indexes are marvelous for getting those types of things sorted out. But, you know, as far as architectural trade-offs go, what some people will do, and I am an occasional proponent of it, is to sort of separate your data out a little bit: have your hot data that might be seeing frequent modifications sit in a normal rowstore table.

And then once something is past the point of frequent modification, you might move it to sort of an archival table that has a columnstore index on it.

That’s probably the most common of them. In extreme cases, some people will even use like an in-memory table for the really hot data. It’s sort of like a shock absorber table.

A few of my clients are in the online gambling space and in the high-demand consumer shopping goods space. Spaces?

I don’t know how many spaces that actually spans when I think about it. I think maybe it’s just all one space where people buy too much crap. The in-memory OLTP thing is very useful for them.

But if you just have like a generically like hot workload for recent data, like, you know, the last week or month or something, you know, you could keep that in a rowstore table.

You could, you know, move that off in a big batch to a columnstore table and see most of the benefit of having columnstore for your more historical data, and rowstore, which has less insert, update, and delete overhead, for the hot data.
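One way to sketch that big-batch move, with hypothetical tables and a made-up 30-day cutoff; run it in a loop until no rows are affected:

```sql
/* Move cooled-off rows from the hot rowstore table into the
   columnstore archive in one statement. */
DELETE TOP (4096) o
OUTPUT Deleted.OrderId, Deleted.CustomerId,
       Deleted.OrderDate, Deleted.Amount
INTO dbo.OrdersArchive (OrderId, CustomerId, OrderDate, Amount)
FROM dbo.OrdersHot AS o
WHERE o.OrderDate < DATEADD(DAY, -30, SYSDATETIME());
```

If you batch bigger, inserts of 102,400 or more rows at a time can compress straight into columnstore rowgroups instead of landing in the delta store.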

It’s just sort of up to you to figure out how long something is useful for. For a lot of people, it’s not a very long stretch. You know, you place an order for something, and what happens later? You get a ship date, right?

And then you get maybe some tracking notifications, or that’s not even in the table itself. And then you get a delivery date, which probably updates, you know, the order or something.

But, you know, for the most part, most stuff is done within, like, an hour to days of its lifetime there. You know, even things like stock trades, you can only do that once.

I mean, you can place a whole bunch of trades, but it’s not like, like after you make a trade, you’re like, oh, wait a minute. I meant three, no, 12. And then like the day goes by, you’re like, ah, I meant 15.

Like, there’s a lot of stuff where its hot time, right, its moment in the sun, is fairly short. But even an Amazon order, which is kind of what I was just talking through, unless someone cancels it, even that’s going to have to be pretty quick, right?

It’s not like months later someone’s going to be like, I’m still waiting on that cancel, right? It’s kind of silly. Anyway, that is the sort of architectural stuff that immediately comes to mind.

I don’t know more about your situation. So it would be pointless and fruitless to, you know, try to think of sillier situations you might be in.

So if you want more details, well, my rates are reasonable. Anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something and I will see you in tomorrow’s video where we will do some other SQL Server stuff. All right, thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Generating Chunks


Summary

In this video, I delve into the intricacies of managing long content when using embedding models like mxbai-embed-large and OpenAI’s ada-002. With token limits being a critical factor, I explain how these limits can be reached quickly in scenarios such as Stack Overflow posts, where question bodies and answer bodies can easily exceed the 512-token limit of some models or the more generous 8191-token limit of others. To address this challenge, I introduce Microsoft’s chunking method, detailing its components like source text, chunk type (fixed), size, overlap, and how they interact to preserve context while respecting token constraints.

Chapters

  • *00:00:00* – Introduction to Chunks and Token Limits
  • *00:03:45* – Stack Overflow Database Considerations
  • *00:07:28* – Example Text Chunking
  • *00:10:09* – Overlap in Chunks
  • *00:12:04* – Conclusion

Full Transcript

Erik Darling here with Darling Data, and we’re going to finish out this week talking about a subject that is near and dear to my heart, a subject that I have a lot of experience with, and that is dealing with long content. Because, you know, I have written some pretty long blog posts in my life. I’ve written some extensive training material in my life, and I’ve got to manage a lot of long content day to day. So, we’re going to do that. This is, of course, all videos, all snippets, tidbits, tiny chunks, right, near shadows of the full course material from my class, course, whatever you want to call it, my correspondence course. You can become a locksmith, repair guns, crack safes, get AI ready with Erik, where you can learn in full, gruesome, gory detail about all of these things. And, you know, so we’ve got that going for us. And my green screen is acting a little funny in the background, but not so funny that I’m willing to stop.

So, embedding models, as you either know now or will know in the near future, have things called token limits. If you’ve ever used an LLM to any degree, you know, if you’re using, like, a web interface, you might see something like: this conversation length is done. That’s enough. We’re over here. We have reached our context. And this is token driven. And if you’ve ever used a more professional-grade LLM product, like, say, Claude Code or something like that, you may have noticed it slowly counting up tokens as it does things.

Where it’s like, you know, thinking about it. I read this file. I’m thinking about it. And you just see the number of tokens rack up and up. And then you hit a certain amount of tokens and they charge you more money. So, embedding models have token limits. The one that we use for the course, mxbai-embed-large, has a 512-token limit. OpenAI’s ada-002 has an 8191-token limit. And there are some other ones that have much smaller token limits, like the all-MiniLM family, blah, blah, blah.

Stack Overflow content, for a lot of things, will fit just fine into the 512 token limit. Titles are usually fine because they’re less than 300 characters. But bodies are often long. They are often 1,000 to 10,000 or more characters. Some long content, right? Some pretty verbose answerers on Stack Overflow. And that’s even just for like question bodies. And of course, answer bodies, which is probably more in line with what I was just talking about, can be very long.

You’ll have code examples. You’ll have all sorts of stuff in there. You know, quotes from documentation that are extensive. Things like that. That may exhaust the token limit for your model. You don’t want an exhausted model, I can tell you that much.

And I think what’s especially, you know, maybe surprising to some of you out there is that if you exceed the limit of tokens, then the model just stops, right? It just silently truncates the rest of it. It just leaves a whole bunch of stuff out. Which, you know, can be not great if you’ve got a very long answer that’s full of very good information and, like, very important details.

And the first, you know, like, let’s say, I don’t know, 1,000 characters or so is just sort of like, you know, preamble. You probably ain’t training stuff right, right? You’re not going to have very good context for things. So, Microsoft has given us a way to deal with long content in which we can generate chunks.

I don’t know who names these things. Just, like, like, obvious barf references. Why? Why? Why are you going to call it chunks?

What are you doing over there? Ah, generating chunks. Feeling okay? No. So, generating chunks requires three inputs, plus an optional fourth, to your chunk generator. There is the source from which you wish to generate chunks.

There is chunk type, which currently, I believe now, you can only do fixed. And then there is chunk size, which is the number of characters with which you wish to chunk. Then you have this other thing called overlap.

And overlap can be interesting because if you’re dealing with something that’s, like, paragraphs, like, and you generate a chunk of, like, let’s say, I don’t know, like, 300 characters, but the paragraph itself is, like, 600 characters, the next chunk you generate is going to be just, like, the second half of the paragraph.

And so you might want to have your chunks overlap a little bit so that, like, you retain some context from, like, the first half of the paragraph into the second half of the paragraph, right?

So, like, you kind of want to, like, blend things a little bit, right? It’s kind of like your blender tool where you can say, like, well, you know, this is a really important, like, and you’re not going to be doing this line by line when you generate chunks or else you might actually generate chunks.

But, like, you’re not going to do this line by line, but just, you know, like, like, thinking about a situation where, like, well, like, you’re reading through a paragraph and you’re like, that’s an important sentence.

And then, like, the next sentence kind of, like, carries on with that. You’re like, like, that’s the kind of, like, contextual stuff that you want to carry across so that when you’re generating embeddings, they sort of, like, retain more context across the chunks that you generate.

But AI_GENERATE_CHUNKS will return a table-valued result with the text fragment that is being chunked. And that would feed into the AI_GENERATE_EMBEDDINGS function.

The chunk order, right? So you can see, you know, I guess, like, breakfast, lunch, or dinner. And then you have the chunk offset, which is the position in the source.

So maybe that was, like, an afternoon snack. And then you have the chunk length, which is the number of characters in the chunk. So just a very basic example would be something like this, where I have this text column that is just a string that I’m selecting in here.

And we have our, we’re going to just cross-apply generate chunks here. And we have our source pointing to this thing. And we have our, well, we can only say fixed here.

But then we’re going to just give it a chunk size of 50 characters, right? That’s not a best practice. That’s not, like, what you should do. You have to figure that stuff out.

This is just the example I’m using based on the piece of text that I have there that made for a reasonable demo. So that’s what we’ve got, right?
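Reconstructed from the description above, the demo looks roughly like this. The string is invented, and the syntax is the SQL Server 2025 preview’s AI_GENERATE_CHUNKS as I understand it, so check the current documentation before leaning on it:

```sql
SELECT c.chunk_order, c.chunk_offset, c.chunk_length, c.chunk
FROM (
    SELECT CAST('Erik Darling here with Darling Data, talking about generating chunks from long content.' AS varchar(max)) AS text_val
) AS t
CROSS APPLY AI_GENERATE_CHUNKS
(
    source = t.text_val,
    chunk_type = FIXED,  /* currently the only chunk type */
    chunk_size = 50      /* characters per chunk; demo value only */
) AS c;
```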

Again, 50 might be great, right? Like, cost threshold for parallelism might be amazing. I don’t know. Anyway, here’s what we get back. And here’s what we see in here.

Now, there are some weird, like, there are some things you have to consider with this. And, like, one of those things is very specific to the Stack Overflow database in that the body column in the post table, like, it has, like, HTML formatting in it, right?

It’s sort of, like, Markdown-y formatting kind of. But it’s all, like, there’s a lot of, like, brackets and stuff. Now, like, it’s not, like, you know, an embedding model is going to consider those things important, but they are characters that do contribute to your token limit.

So, like, if you’re, like, depending on the cleanliness of your data, like, and, you know, how many tokens your embedding model allows, you might have to seriously think about, like, cleaning out these, like, nonsense things, like, you know, like, BR and, like, you know, like, H1 and H3, stuff like that.

But this is what we get back. We get back four lines. We see the order of the chunks. And, like, we see each of these chunks on one line, and they’re all 50 characters, right?

But notice, like, also, like, there are, like, big spaces in it. Like, the spaces contribute to this too, right? So, like, spaces, like, any character, right, whether it’s useful or not, contributes to the chunk size.

If you have empty lines in your stuff, they, like, you’ll get empty lines back from generating chunks, right? So, like, it’s like, I am a line with all these empty things in it that will do no one any good whatsoever.

So, like, sometimes you might have to filter out, like, you know, you might want to, like, you know, pre-filter garbage from here and just say, like, well, you know, if you don’t have any, like, useful characters in you, I’m hacking you out because you’re no good, right?

So just this simple WHERE: the chunk has to contain, like, A to Z or 0 to 9, right? So anything that actually has that in it, we’ll keep. Anything else, go away, right? Like, you’re not words anymore.
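That pre-filter, sketched against a hypothetical Posts table, again using the preview AI_GENERATE_CHUNKS syntax:

```sql
/* Keep only chunks that contain at least one letter or digit;
   whitespace-only fragments get hacked out. */
SELECT c.chunk_order, c.chunk
FROM dbo.Posts AS p
CROSS APPLY AI_GENERATE_CHUNKS
(
    source = p.Body,
    chunk_type = FIXED,
    chunk_size = 500
) AS c
WHERE c.chunk LIKE '%[A-Za-z0-9]%';
```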

Like, I don’t know. Well, I mean, I say that from a very, like, anglo-centric point of view, where these are my letters. You can do that with your letters, right?

Like, lesson learned, make sure your letters are in there, your alphabet, whatever language you’re doing this in, make sure you’re represented. That’s mine, so that’s what I’m doing. But overlap is there to prevent losing context at chunk boundaries.

So, each chunk will include some text from the previous chunk. If you set overlap to 10, that means each chunk will overlap the previous one by 10% of the chunk size. So, like, we do all this and say overlap equals 10, right?

And we run this, then we get back this, right? We get back five rows, right? And, you know, there’s not a lot of overlap there.

If we crank the overlap up to, say, 25, right, so, let’s make that a little bit bigger, then the results change a bit, right? And, actually, go away, Redgate. And then if we do 50, we will get far different results.

Now we get back seven rows of stuff. And, actually, it’s probably more, probably a little bit more illustrative to just run these all together, right? And just kind of see how things change across these.

And so, like, you know, you can kind of see that, like, you know, like the, these don’t exactly line up in the exact same way, right? Like, the alignment here is not the same across, you know, a 10% overlap and a 25% overlap. And then a 50% overlap, which actually, like, causes more rows to get produced by that because we end up having more chunks of 50 characters because we preserve more from the, like, the previous row.
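To compare settings side by side, the same demo can be rerun at each overlap value; again, preview syntax and an invented string:

```sql
SELECT 25 AS overlap_pct, c.chunk_order, c.chunk
FROM (
    SELECT CAST('Erik Darling here with Darling Data, talking about generating chunks from long content.' AS varchar(max)) AS text_val
) AS t
CROSS APPLY AI_GENERATE_CHUNKS
(
    source = t.text_val,
    chunk_type = FIXED,
    chunk_size = 50,
    overlap = 25   /* try 10, 25, and 50 and compare the row counts */
) AS c;
```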

So, if you have long content, and long content, of course, will vary by the size of your text that you’re dealing with and the token limit of the embedding model that you are using, you may need to consider generating chunks. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something. And I’ll see you next week. Another one. All right.

Goodbye. Goodbye.


Get AI-Ready With Erik: Embedding Freshness


Summary

In this video, I delve into the topic of keeping embeddings fresh in SQL Server databases, a critical aspect often overlooked due to its complexity and lack of straightforward solutions from Microsoft. I explore why regenerating all embeddings can be impractical and discuss more efficient methods like using queue tables and triggers to handle incremental updates. By walking through the setup process and providing examples, including how to create a computed column (despite SQL Server’s current limitations), I aim to equip you with practical strategies to avoid embedding drift and ensure your data remains relevant and accurate. Whether you’re dealing with user profiles, content management systems, or any application that relies on embeddings, this video offers valuable insights into maintaining the integrity of your data.

Full Transcript

Erik Darling here with Darling Data and more AI embedding goodness for you fine folks out there in the world. We’re going to talk in this video about embedding freshness or keeping embeddings fresh in SQL Server because right now there is not a terribly good story with doing that. So embeddings of course get stale if the source text is edited, new records are added, or records are deleted. Your options are to regenerate all of them, which can be slow and expensive, or you could use some facility inside of SQL Server to track changes and update incrementally, which is probably the smarter thing to do. We are not going to do the stupid thing because the whole point of this course is to help you avoid the stupid thing, right? Don’t do the dumb thing. If you can take anything away from all of this, don’t do the dumb thing. Let Microsoft be your cautionary tale. Don’t do the dumb thing.

So again, coming back to our dear friends at Microsoft, you would think that they would make this easy, but no. Apparently they didn’t want to make this easy, right? And I don’t know. I kind of don’t blame them on this one, but it would be nice if this were a little more convenient. So like if I wanted to create a table and I wanted to say, hey, I want a computed column. You can already see this squiggle.

Now, to be fair here, SSMS puts red squiggles under a lot of stuff that’s completely valid. It just has no idea what’s going on most of the time and is lost. This red squiggle, though, is a legitimate red squiggle. It’s there for a reason this time, at least. But we cannot create this table, and the error messages are rather amusing.

It will tell you that the database model doesn’t exist. Make sure the name is entered correctly. I think I get this one because my SQL instance is case sensitive, but it’s still funny, right?

That might make you freak out. But we’ve also got incorrect syntax near the keyword USE. So, we cannot create a computed column that will just generate this stuff for us. SQL Server is like, nah, no.

And who knows? Maybe we’ll get this someday, right? Maybe this is a preview feature too. We don’t know, right? Who could foresee the future when there’s no communication between the PMs who manage things and the end users who care about things, right? Why would we ever want to know these things?

What you would generally want to do instead, something very natural to most people who have had to manage databases, is create a queue table, which we can do here.

We can set this up in a way that gives us something to work off of, where we can tell what action we need to take on a row, and just process through the queue table and do our updates. I'm going to give you an example trigger for inserts. In real life, you would also want separate triggers to handle updates and deletes.

You would not want to put everything into one trigger, because life gets real weird and complicated when you do that, all that checking of the inserted and deleted tables. Make separate triggers. What's wrong with you?

You would do something like this, starting with the standard, canonical trigger header: bail out if ROWCOUNT_BIG() is zero, and make sure that no other sessions have interfered with important set options at the session level.

Then you would insert into your queue table the information from whatever got inserted into the Posts table. So data goes into the Posts table.
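A minimal sketch of that queue table and insert trigger might look like this; the object names (dbo.EmbeddingQueue, dbo.Posts, dbo.Posts_QueueInserts) are placeholders, not the exact demo code:

```sql
CREATE TABLE dbo.EmbeddingQueue
(
    QueueId bigint IDENTITY PRIMARY KEY,
    PostId integer NOT NULL,
    QueueAction char(1) NOT NULL, /* I = insert, U = update, D = delete */
    EnqueuedAt datetime2 NOT NULL DEFAULT SYSDATETIME(),
    ProcessedAt datetime2 NULL
);
GO

CREATE TRIGGER dbo.Posts_QueueInserts
ON dbo.Posts
AFTER INSERT
AS
BEGIN
    /* the canonical trigger header */
    IF ROWCOUNT_BIG() = 0
        RETURN; /* bail out early when nothing actually happened */
    SET NOCOUNT, XACT_ABORT ON; /* don't trust session-level settings */

    INSERT dbo.EmbeddingQueue (PostId, QueueAction)
    SELECT i.Id, 'I'
    FROM Inserted AS i;
END;
```

Separate update and delete triggers would look the same, enqueueing 'U' and 'D' rows from their respective virtual tables.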

We put that row into the queue table, and then something else would process the queue. From here, you would probably want a separate stored procedure that works off the queue table.

And you would want to loop over it, because generating embeddings is not free. It's an external call.

It's outside the database. SQL Server does not generate embeddings on its own. You have to call out to a separate service. It could be a local service on your VM, with a mix of Ollama and something that helps SQL Server talk to Ollama, because Ollama only speaks HTTP, and SQL Server requires HTTPS.

So you need something that sits between Ollama and SQL Server to deal with the HTTP-to-HTTPS hop. And again, it's not free. It's not happening within the SQL engine.

It's outside SQL. And if you were to use OpenAI for this, so every embedding is an internet call? God bless. You know, I used to make fun of people for this sort of thing. I've worked with lots of clients who have had to install software made by people who hate them.

I made a little joke in, I think, the last video about being "a software developer who loves databases." You would have people with a trigger on a table that would use xp_cmdshell to call curl,

or call an executable that would go do something on the internet. And you're like, oh, of course it sucks. What's the matter with your face?

Why would you do that? But here, if you just had a stored procedure that processed this stuff row at a time or in batches, you would want it to find any pending items. If it's a delete, it's easy enough to just delete a row.

I only had an insert trigger up there, but again, you would want update and delete triggers to handle other modifications to the table. And if MERGE fires, God help you. Then you would want to delete and reinsert any rows that were marked as an update by the trigger.

And if it was an insert, you would just stick the new row in. You could also use LastActivityDate to find posts that were altered. You wouldn't need triggers for that, but you would have to poll the table constantly for changes.
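The queue-working procedure described above might be sketched like this. It assumes the hypothetical queue table from earlier, plus a hypothetical dbo.PostEmbeddings table and a registered external model named ollama; a production version needs more error handling and transaction scoping:

```sql
CREATE OR ALTER PROCEDURE dbo.ProcessEmbeddingQueue
AS
BEGIN
    SET NOCOUNT, XACT_ABORT ON;

    DECLARE @QueueId bigint, @PostId integer, @QueueAction char(1);

    WHILE 1 = 1
    BEGIN
        /* READPAST lets multiple workers skip each other's locked rows */
        SELECT TOP (1)
            @QueueId = eq.QueueId,
            @PostId = eq.PostId,
            @QueueAction = eq.QueueAction
        FROM dbo.EmbeddingQueue AS eq WITH (ROWLOCK, UPDLOCK, READPAST)
        WHERE eq.ProcessedAt IS NULL
        ORDER BY eq.QueueId;

        IF @@ROWCOUNT = 0
            BREAK; /* queue drained */

        IF @QueueAction = 'D'
        BEGIN
            DELETE dbo.PostEmbeddings
            WHERE PostId = @PostId;
        END;
        ELSE /* 'I' or 'U': delete, then re-embed; external call, one row at a time */
        BEGIN
            DELETE dbo.PostEmbeddings
            WHERE PostId = @PostId;

            INSERT dbo.PostEmbeddings (PostId, Embedding)
            SELECT p.Id, AI_GENERATE_EMBEDDINGS(p.Body USE MODEL ollama)
            FROM dbo.Posts AS p
            WHERE p.Id = @PostId;
        END;

        UPDATE dbo.EmbeddingQueue
        SET ProcessedAt = SYSDATETIME()
        WHERE QueueId = @QueueId;
    END;
END;
```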

The trade-offs: on the pro side for polling, triggers can be tough to get right, especially under concurrency, so polling is a bit of a simpler setup. On the con side, it's possible that you might miss some changes. And parallelizing the process, if you want multiple workers looking at the queue table, can get complicated.

A lot of locking hints and other things go into making that work as flawlessly as possible. And then you also have the overhead of polling the table, and then, when you find work to do, going and doing the work. But the basic idea behind either approach is that you find posts without embeddings that need them.

You find modified posts that need embeddings and re-embed them. And if you delete a row, it doesn't matter, because it no longer shows up in the polling query. There's nothing to do.
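The polling version, sketched against a Stack Overflow-style schema; dbo.PostEmbeddings and its EmbeddedAt column are assumptions, while LastActivityDate is a real Posts column:

```sql
/* No triggers: compare source timestamps to when we last embedded. */
SELECT p.Id
FROM dbo.Posts AS p
LEFT JOIN dbo.PostEmbeddings AS pe
    ON pe.PostId = p.Id
WHERE pe.PostId IS NULL                    /* never embedded */
   OR p.LastActivityDate > pe.EmbeddedAt;  /* changed since the last embed */
```

Deleted posts simply drop out of this result set, so there is nothing to re-embed for them.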

It's just gone from the table. But those are some ideas around how to keep embeddings fresh and up to date in your database. Because one thing that you don't want is embedding drift.

Say, like our example in the last video, the About Me field in the Users table. Someone's profile says "I'm a software developer who loves databases" one day, and the next day it says "I'm a hot yoga instructor."

But your embedding would still say "I'm a software developer who loves databases." And so the headhunters from the Stack Overflow job board would be like, let me get you this job. And: sorry, I'm just all hot yoga.

I'm just all hot yoga now. You want some hot yoga? I got you. And they're like, but the embedding said you were a software developer who loves databases. And everyone walks away sad and confused, and no one actually does hot yoga.

Well, what can you do? Anyway, that's enough here. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something, and I will see you in tomorrow's video, the final video of the week, our gal Friday. We'll do that, I guess.

Thank you for watching.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Generating Embeddings



Summary

In this video, I delve into the intricacies of generating embeddings in SQL Server 2025, Azure SQL Database, and Azure Managed Instance using T-SQL’s AI_GENERATE_EMBEDDINGS function. However, I caution viewers against performing these operations directly within SQL Server due to potential inefficiencies and external dependency issues. The demonstration involves creating a `users_to_embed` table with an ID column and a vector data type embedding column, illustrating the process of generating embeddings for user profiles that include their “About Me” descriptions. While this method works, it highlights the limitations and challenges of using SQL Server for such tasks, suggesting that Python might be a more suitable environment for handling these operations efficiently.

Chapters

  • *00:00:00* – Introduction to AI Generate Embeddings in SQL Server
  • *00:03:15* – Generating Embeddings for User Profiles
  • *00:07:24* – Finding Similar Users Using Vector Distance
  • *00:10:10* – Best Practices and Considerations
  • *00:11:43* – Conclusion and Next Steps

Full Transcript

Erik Darling here with Darling Data. And in today’s video, we’re going to talk a bit more about generating embeddings in SQL Server 2025, Azure SQL Database, and Azure Managed Instance. To do so, we are going to continue to invoke our T-SQL friend AI_GENERATE_EMBEDDINGS. But this time, what I want to do is show you why you might just want to stick with Python on this. Do it outside of SQL Server, for the love of God. So I’m actually not going to drop and recreate this table, because I’ve already got it loaded up, and you will, in short order, see why I did not decide to do this all from scratch.

We'd be wasting time. So this would be our demonstration table, with all the things we would need to be AI-ready, like yours truly.

We would have an ID column, which is an integer, because we want to be forward thinking about someday having a vector index on this table, if they ever become not horrible, if they ever finish that damn thing, instead of the Homer Simpson car of features that we have now. So we would want that up there. But who knows if the integer restriction will be relaxed when the vector indexes stop being horrible?

Maybe we'll be able to use a bigint. Like, hmm, you got Fabric. Anyway, this is the table we would use, and we would have our wonderful embedding column here, which is a vector data type that accepts a vector of exactly 1,024 dimensions and currently uses float32 storage rather than the float16 preview storage format.
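The demonstration table, reconstructed as a sketch; the column names and lengths are my guesses at what's on screen:

```sql
CREATE TABLE dbo.users_to_embed
(
    id integer PRIMARY KEY,       /* integer, with an eye toward future vector indexes */
    about_me nvarchar(2000) NOT NULL,
    embedding vector(1024) NULL   /* float32 storage; float16 is still a preview feature */
);
```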

Now, in real life, running through this, I would want to stick about 50 users into the table who have an About Me column of reasonable length, because we're going to be using the About Me column for this one, for various reasons.

There's lots of stuff in the Stack Overflow database that one could wish to embed. We've talked about the Posts table, where you have titles and bodies; stuff like tags you probably wouldn't want to do. In the Comments table, there's the comment text column, and, oh, screw comments.

But in the Users table: location, probably not; website, probably not. About Me would actually make a pretty reasonable embedding, because you could take the things that someone has written about themselves, the kind of view that someone has taken on their Stack Overflow profile.

It's like, "I am the best coder in the world." And you could search for people who might be good at, I don't know, answering a question. Or maybe use that Stack Overflow job board, which saved the company, thankfully. Thank God we have that job board.

You could even find people who would be good matches for a particular job if you were to embed their About Me profile and do some semantic searching on it, because you could do a similarity search between things in the job posting and what people say about themselves out there in the world.

It's like everything except the salary. But we could stick a bunch of users in here, and just to keep the embedding workload reasonable, we would say that we want people who have an About Me somewhere between one hundred and two thousand characters.

Beyond that, what the hell are you rambling on about? Like, let me just paste book verses in there. Out of your mind.

So we've got this users_to_embed table, and I realize that sounds a little after dark, but trust me, it's really just vector stuff.

We've got 50 users in here that we wish to embed. And, like yesterday, the way that we generate embeddings in T-SQL is by calling the AI_GENERATE_EMBEDDINGS function, passing in a string that we wish to vectorize.

In this case: "I am a software developer who loves databases." I've never met one.

They all seem to hate databases. The passion seems to be: oh, what can I do to this thing today and then complain about later? Like JSON. Vector stuff.

Yeah. Anyway. If we run this, we will get back our embedding JSON, which, starting at line two of the result and scrolling on down, is all the numbers that apparently mean you're a software developer who loves databases.
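That call, reconstructed as a sketch; the model name, ollama, stands in for whatever you registered with CREATE EXTERNAL MODEL:

```sql
SELECT
    AI_GENERATE_EMBEDDINGS(
        N'I am a software developer who loves databases.'
        USE MODEL ollama
    ) AS embedding;
/* Returns a 1,024-dimension vector, rendered as a JSON array of floats. */
```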

We get 1,024 of these numbers back, which is fantastic news for us, because our column is a vector(1024). Isn't it amazing the way this stuff lines right up in a demo?

How could it be better? But, so, if we wanted to update just one user. Just one.

Just to show you that it works, we could set their embedding column by passing in a column: we generate an embedding for the About Me column and have it put into the embedding column of the table, just for the first user.

So, just one of them. And we run this. It's not the slowest thing in the world for one row.
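Updating a single user's embedding from their About Me column might look like this sketch; the WHERE clause is illustrative:

```sql
UPDATE u
SET u.embedding =
    AI_GENERATE_EMBEDDINGS(u.about_me USE MODEL ollama)
FROM dbo.users_to_embed AS u
WHERE u.id = 1; /* one row: every row updated is another external call */
```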

But trust me on doing the rest of them: I think most people who use SQL Server are pretty comfortable with the idea that once modifications go past a certain row count, you'll probably want to batch things up.

You'll probably want to think about that a little bit earlier if you are generating embeddings with T-SQL, because remember: what this thing calls out to is not inside of SQL Server.

SQL Server does not generate embeddings. It does not have any facility within it to generate an embedding with.

This is still an external model. You still have to call outside of the database, have something do something, then send it back to the database.

And then you get to do the update. I'm pretty lucky here in that this is a VM, and all of the processing that I'm doing takes place in the VM.

But if you look over here, we still have to call out to Ollama. Ollama has to do something and send it back.

Ollama has to talk to this other thing, because Ollama right now only speaks HTTP, and the external endpoints require HTTPS. So you need a separate thing that has HTTPS on one side and can talk HTTP on the other.

So right now there are steps to take. And if you were using something like OpenAI, you're talking about an internet call, reaching out to the internet.

So you probably want to think real carefully about how many rows at a time you're going to send through this. Because if I were to update all 50 of these rows at once, it would be much slower.

I'm just not patient enough for that. But if we look at what we updated the table with, looking at the embedding itself isn't all that interesting.

That's just the number of bytes that ended up in the table for the embedding; that's what a 1,024-dimension vector measures out to byte-wise. This one just happens to be for Jon Skeet.

Hey, Jon Skeet. How you doing? But then, if we wanted to find similar users, or maybe we're a headhunter on that Stack Overflow job board thinking, I'm going to go find some nerds.

You could do this. You could write a query that says, I want to find a database performance tuning expert, and do all this stuff.

And you can even call AI_GENERATE_EMBEDDINGS here, though I don't particularly recommend it. You could even generate an embedding directly in the query.

Rather than pre-vectorizing the string and storing the embedding, you can turn it into an embedding on the fly with AI_GENERATE_EMBEDDINGS right in the query.

And then you can just say: I want the vector distance between the embedding that's in the table and the string that I'm sending in. You can do this.

You're allowed to, but think hard about doing things like this. This is maybe not how you want to do things. But we can run this query, and for our very small 50-row table, we get back some people.
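The shape of that query, sketched; note the embedding for the search string is generated inline, which means an external call every time this runs:

```sql
SELECT TOP (10)
    u.id,
    VECTOR_DISTANCE(
        'cosine',
        u.embedding,
        AI_GENERATE_EMBEDDINGS(
            N'database performance tuning expert'
            USE MODEL ollama
        )
    ) AS distance
FROM dbo.users_to_embed AS u
WHERE u.embedding IS NOT NULL
ORDER BY distance; /* smaller cosine distance = more similar */
```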

I've talked to this guy before, gbn. He's a hoot. The rest of these people, I don't know.

I couldn't tell you if they're actually database experts, but yeah.

He is actually a satanic sysadmin, I guess. SQL Server developer Moe. All right. Well, just choosing the first hundred characters of someone's About Me is maybe not always a winner. But anyway, that's a little bit about how you can use AI_GENERATE_EMBEDDINGS inside SQL Server.

Again, SQL Server does not generate embeddings on its own. These are all external calls, so think very carefully about how and where you use them in your queries.

If you need to generate a lot of embeddings, I would still probably do that external to SQL Server and then put the data in, rather than have SQL Server call out, generate embeddings, and bring them back in. In the context of a query, for smallish tables, you can do this on the fly if you want. But in some testing that I've done on larger queries, even though this is just a scalar value, it's still not a good time.

Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you over in tomorrow’s video.

As always, as usual. We've got two more unnamed tabs up here of things to talk about, so we'll go and do that. Anyway, thank you for watching.

Goodbye.


Get AI-Ready With Erik: Creating External Models



Summary

In this video, I delve into creating and using external models to generate embeddings directly within SQL Server 2025, Azure SQL Database, and Azure Managed Instance. While there’s minimal coverage here compared to the full course “Get AI Ready with Erik,” which you can access using a special coupon code found in the video description, I provide deeper insights on how to integrate Python scripts for more efficient embedding generation outside of SQL Server. However, leveraging native SQL Server capabilities is incredibly convenient, especially when setting up your environment properly. I cover essential steps like enabling the external REST endpoint setting and creating an external model using Ollama or another provider. By walking through these processes, you can generate embeddings directly in T-SQL, which integrates seamlessly with transactions and triggers, making deployment simpler and more integrated than external methods.

Chapters

  • *00:00:00* – Introduction
  • *00:01:27* – External Models in SQL Server 2025
  • *00:03:06* – Benefits of Internal AI Embeddings
  • *00:04:22* – SP Configure Setting Overview
  • *00:05:37* – Creating an External Model
  • *00:08:17* – Generating Embeddings with T-SQL

Full Transcript

Erik Darling here with Darling Data, and we are going to talk about creating and using external models to generate embeddings in SQL Server 2025, Azure SQL Database, and Azure Managed Instance. There's a good reason for that. It is something that I am glad we are able to do, but it takes a little bit more work than you might be into. Up until now, at least in these videos, there's been minimal coverage compared to the full course, Get AI Ready with Erik, which you can use this wonderful coupon code for, down in the video description. The full course has a bit more, including the Python scripts and all the other stuff you need in order to generate embeddings externally and then get them into SQL Server, which, to be honest, is much more efficient, because there are all sorts of things you can do external to SQL Server that are a lot faster than the things you can do inside of it. However, doing things inside of SQL Server has many conveniences, so long as you are able to prep your environment to accommodate them. So, SQL Server 2025 can generate embeddings natively. We have two new pieces of syntax: CREATE EXTERNAL MODEL, which allows us to register a provider we can call, and the AI_GENERATE_EMBEDDINGS function, which allows us to generate vectors natively in T-SQL.

Cool. Both of those things sound just dandy, don't they? So, there are some benefits here. You don't have to do any Pythoning, which, honestly, if you're doing this in SQL Server, you'd probably just have Claude Code write you some Python for the other stuff anyway. You don't actually have to know Python. You'll spend some time making sure the robots get stuff right, but the robots are good at languages like Python and C#, because those are a lot easier for them to deal with. SQL, they're terrible with. There's a lot more debugging with SQL than there is with other stuff, and they can fix the other stuff themselves. If you say, hey, robot, run this script, and the script hits an error, they go, oh, well, I can fix that.

With SQL, you hit an error and they're like: dog, you're on your own. But you can do some stuff with AI_GENERATE_EMBEDDINGS. It works in triggers, works in stored procedures, is integrated with transactions, and the deployment is a bit simpler. In the full course, I teach you how to do all this. There are some external dependencies that you will need. I think the biggest thing is that Ollama is only set up to do HTTP requests, not HTTPS, and this stuff requires HTTPS. So you need another program to sit between SQL Server's calls and Ollama to get past the HTTPS requirement.

In the full course, I teach you how to do all that. In these short videos, you don't get the full thing, because I still need people to buy the course so I can support the banality of my existence. So: we've got the new sp_configure setting called external REST endpoint, and this will tell us whether it is enabled or not. We have, of course, enabled this, because we've done this all before. If you need to change it, you can use this command here. I'm not going to type it for you, but you can do it.
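Checking and flipping that setting looks like this:

```sql
/* see whether external REST endpoint invocation is enabled */
SELECT c.name, c.value_in_use
FROM sys.configurations AS c
WHERE c.name = N'external rest endpoint enabled';

/* enable it if it isn't */
EXECUTE sys.sp_configure 'external rest endpoint enabled', 1;
RECONFIGURE;
```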

Then you will also need to create an external model. I am not going to drop and recreate mine, because I like mine just fine, but this is generally what it will look like if you're doing this locally. If you're using something like OpenAI, there's other stuff to it. But stop giving Sam Altman money.

My sometimes friend Andy Pavlo was posting pictures on Twitter about going to the OpenAI offices, and for a company that lost $9 billion or something last year, well, I could lose $9 billion, too. But don't use OpenAI. Stop giving that man money.

This is generally what it will look like if we wanted to do something locally. For the purpose of this course, I'm using this model; you'll do something else. I'm using Ollama to do all the processing, and I have a little local webby thing set up to handle the rest for me.
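A hedged sketch of a local setup like the one described, where the LOCATION points at a local HTTPS shim sitting in front of Ollama; the model name, URL, and port here are all assumptions, not the demo's exact values:

```sql
CREATE EXTERNAL MODEL ollama
WITH
(
    LOCATION = 'https://localhost:8443/api/embed', /* HTTPS proxy in front of Ollama */
    API_FORMAT = 'Ollama',
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'bge-m3' /* an embedding model that emits 1,024 dimensions */
);

/* see what's registered */
SELECT * FROM sys.external_models;
```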

All right. So this is the location where I send stuff to get a response back in vector form, an embedding. You can see what models you have registered if you run a fancy query like this, and you will see that I already did all this stuff. Man, I did that a while ago, huh? I guess so.

But the end result, once you have done all of this, is that you can generate embeddings in SQL Server. Now, if we go over here, you can just ignore that, because I had to do something else stupid; we've got some other stuff running over here. And if we run this query, I've got to select the whole thing, I suppose.

If we run this and go look at the results, well, normally you see a little bit more action in here. That was kind of boring; I apologize. But we get back a string. So we said: hey, SQL Server, I want you to generate some embeddings for me.

I want to know what the embedding for "how do I optimize SQL Server performance" would look like, and I want you to use the model that I set up with CREATE EXTERNAL MODEL. That's this thing up here. And now I get back this.

And this, apparently, is the secret to the universe: a long list of floats. Look at all these magnificent floaty beasts, these creatures of fanciful existence.

The numbers start on line two of the result, and because the model that I chose is told to generate a 1,024-dimension embedding, that's what I get back.

That's why the result goes down to line 1025 here, even though the actual numbers start on line two. And this is what we can use to put into our columns in SQL Server where we have embeddings. I define those as vector, float32 right now, because that's what's generally available.

Float16 is still a preview feature. But I have my vector column set up as float32 with a 1,024-dimension capacity, or exactitude, or something, because I can't put in fewer and I can't put in more.

So this is the string that I get back, and it's all set up to do the thing that I want it to do. From there, we can do all sorts of other fun stuff. But additional fun stuff is going to have to wait until tomorrow.

So this is a basic overview of how we generate an embedding directly with T-SQL. Thank you for watching. I hope you enjoyed yourselves.

I hope you learned something, and I will see you in tomorrow's video, where we will continue on this path. For those of you who are getting sick of this path, I don't have that many more videos on the subject left to release, so your time is almost done.

All right. Thank you for watching.


SQL Server Performance Office Hours Episode 51




To ask your questions, head over here.

Summary

In this video, I delve into a series of technical questions and challenges faced by the SQL Server community. I share insights on computed columns, index usage, memory grants, and blocking issues, explaining why certain optimizations might not always yield expected results. The discussion covers topics like parameter sniffing, query plan stability with Query Store, and the nuances of CPU utilization after implementing dynamic SQL solutions. I also touch upon my recent endeavors in automating content creation for YouTube videos and WordPress blog posts using local LLMs, highlighting how these tools are transforming content production processes. Additionally, I announce upcoming speaking engagements at various data conferences across North America and Europe, offering attendees free access to advanced T-SQL training materials.

Chapters

Full Transcript

Erik Darling here with Darling Data, and I am in the midst of a complete psychological collapse. Windows Update ruined my entire life, and then I think some sort of Streamlabs update just put the nail in the coffin. Windows Update over the weekend rebooted like five things that I had important stuff going on with, and then I go to open Streamlabs this morning to record, and it's like: you're not logged in, you can't record. And I'm like, I just want to save this to my computer. And they're like, no, you have to log in to something. So I took the path of least resistance and logged in via YouTube. And it turns out that all my old stuff, from when I tried to be a streamer influencer and, I don't know, vape and wear a cool hat and have neon lights, was all there, but my current recording setup wasn't. So I am slowly trying to rebuild the disaster of this morning. If anything looks weird, sounds weird, smells weird: that's why. Maybe it'll be better now. Who knows? Maybe there's just too much stuff going on. Anyway, it is Monday, so we are going to do office hours and have a great time, and I'm going to answer five community-submitted questions, as long as I count correctly. Down in the video description: I've been having the robots slowly work through things. I built this whole streamlined pipeline to download my entire YouTube catalog and use local LLMs to create summaries, chapters, and transcripts. So you should start seeing the summaries and chapters hit the YouTube videos, with some hiccups there from the YouTube API rate limiting me. All the WordPress blog posts should have transcripts in them, and moving forward, you'll see that in all of the new videos, which includes this one. So that's been fun.
But down below the brand new robot-created, robot-quality summaries and chapters, you will find all sorts of helpful links with which you can hire me in various forms and fashions.

You can consult with me. I've redone the training and consulting website with all sorts of spicy new offerings to appeal to a broader spectrum of people, and I've prettified all the training stuff so it's easier to get to everything you want. And of course, you can do all the usual things: become a helpful member, and if there's questions, you can ask those. As always, please do like, subscribe, tell a friend, tell your best friend in the world. Tell a family member. Maybe just tell someone who you think needs it: hey, are you also having a complete psychological breakdown because of Windows Update and Streamlabs? Get in here. I feel like this guy could really help you through. With that out of the way: I will be traveling. DataTune in Nashville in March, Data Saturday Chicago in March, SQLDay in Poland, and Data Saturday Croatia in May and June, respectively. I will be teaching advanced T-SQL pre-cons at all of them. Attendees get free access to the entire course material. So if you're in the area, you should get in my area and learn some T-SQL, and skip work that day. I'll say you were there; I'll validate your parking. But until then, until the database marshmallows have defrosted and thawed themselves. Is defrosting the same as thawing? Anyway, these poor, poor things, look at them. Well, we're all muddling through, aren't we? Let's answer some questions over here, because that's what we came to do. "I created a computed column with a specific formula." That's nice. I prefer computed columns made with breast milk, but formula's cool too. "To match a common WHERE clause pattern, added an index on it." You're checking some boxes here, friend. Good for you. "And the statistics show good histogram data."
Well, there’s a first time for everything. But the optimizer still expands the formula and does a scan instead of seeking on my computed column index. I’m not using any functions that would prevent matching. What causes the optimizer to ignore a computed column that perfectly matches the expression? Well, the only way to, you know, guarantee that your computed column will get used is to reference the computed column directly and not rely on expression matching. Our beloved cost-based SQL Server optimizer expands any expressions that it finds, that is, computed columns, indexed views, stuff like that. So, you know, you might be in a situation where perhaps the index you created on the computed column was a little bit too narrow for the remainder of your query. Perhaps SQL Server is stuck with a choice between, you know, a nonclustered index seek and a key lookup versus a clustered index scan.

And perhaps that’s why your computed column is not being used. That would be my first guess. But, you know, generally, if we want SQL Server to use something, we ought to be explicit about it. It’s nice when expression matching works, but, you know, you usually rely on that when there’s a third-party vendor app that you have no control over. If you’re in control of the queries, you ought to just use the computed column directly, because you know what’s best there. It’s sort of like with indexed views. If you get mad at the optimizer for not matching an indexed view, or for expanding an indexed view that you do reference directly, put a NOEXPAND hint on it.

I think I know what I’m doing over here. But yeah, I think one thing that you should certainly do is run the query with an index hint telling it which index to use and just see what the execution plan looks like. Because often that will tell you why SQL Server decided not to use the query plan that you were hoping it would. Usually it will be a cost-based decision, right? You might be able to figure something out there.
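To put that in code, here’s a minimal sketch of the pattern, with a made-up table, column, and tax rate, showing the expression-matching gamble, the explicit reference, and the index-hint experiment:

```sql
/* Hypothetical table, column, and rate, just to illustrate the pattern. */
ALTER TABLE dbo.Orders
    ADD TotalWithTax AS (SubTotal * 1.08875); /* computed column */

CREATE INDEX ix_TotalWithTax
    ON dbo.Orders (TotalWithTax);

/* Relies on expression matching; the optimizer may expand this and scan: */
SELECT o.OrderId
FROM dbo.Orders AS o
WHERE (o.SubTotal * 1.08875) > 100.00;

/* References the computed column directly; far more likely to seek: */
SELECT o.OrderId
FROM dbo.Orders AS o
WHERE o.TotalWithTax > 100.00;

/* For troubleshooting, force the index and compare the plans: */
SELECT o.OrderId
FROM dbo.Orders AS o WITH (INDEX (ix_TotalWithTax))
WHERE o.TotalWithTax > 100.00;
```

Comparing the hinted plan’s cost to the unhinted one usually tells you why the optimizer passed on the index in the first place.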

Let’s see. This query. Oh, this query, huh? This query spills about two gigs to tempdb. Wow, that’s a lot of spilling. But the estimated memory grant looks reasonable. Well, you know, there’s the memory grant that you think you need and then there’s the memory grant that you actually need, right? It’s like retirement. Like, ah, I can retire on this much money. And then you retire on that much money. You’re like, wait a minute. I can’t do anything.

Why would SQL Server underestimate memory like this? Variety of reasons. And when does memory grant feedback fail to fix it? Well, you might actually be getting memory grant feedback and you might have a parameter sensitivity situation. Now, imagine that you run your query the first time and let’s just say it gets like a 3 gig memory grant, right? And everything goes fine.

And then the query runs like 10 times after that. But it’s for a really small amount of data now, right? Because it was a big amount of data. 3 gig memory grant. Yeah, we got it. Now it’s like run like 10 times for a small amount of data. And memory grant feedback might have adjusted your memory grant way down. And then if you ran it again for the big one, your memory grant had been adjusted down so far.

And now SQL Server’s like, wait a minute. I needed those two gigs. I forgot. Ah. Like, you know, it’s like on your way to the airport and you’re like, I left my wallet at home. It might have been trying to fix something else and then broken that.

But, you know, sometimes, you know, the memory grant that you asked for, you know, sometimes it can be shared, sometimes not. So it’s possible that, you know, you like the memory grant that you got in entirety was was reasonable. But then the memory, the memory fraction that one of your other memory consuming query plan operators got was was not very good.

So there are a number of things to look at in there. You know, I think that primarily if it is a parameter sensitivity situation, you should do as much indexing or query rewriting as you can to alleviate the need for memory consuming operators where possible. Specifically, sorts are usually most prone to this sort of issue.

Of course, various hashes might spill as well, but sorts usually cause quite a bit of ruckus. So that, you know, as far as like why SQL Server would underestimate memory, you know, all the typical underestimation scenarios apply here. You know, without seeing more of the query or understanding more about, you know, what’s going on with it.

It would be an exhausting time trying to list it all, but, you know, local variable misestimates, parameter sensitivity, you know, big plan, little plan. Oh, this one’s for three rows. This one’s for three billion rows.

You know, the table variables often cause misestimates. All sorts of things might be kicking in. So, I don’t know. There’s always recompile hints, right? Always recompile hints.
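If you want to experiment with those hints, they look something like this; the query is hypothetical, and MIN_GRANT_PERCENT is just one lever for keeping a grant from being shrunk too far by feedback:

```sql
/* Hypothetical query, just to show where the hints go. */
SELECT p.Id, p.Score, p.Title
FROM dbo.Posts AS p
WHERE p.OwnerUserId = @OwnerUserId
ORDER BY p.Score DESC
OPTION
(
    RECOMPILE,              /* fresh plan (and grant estimate) per execution */
    MIN_GRANT_PERCENT = 10  /* floor the grant so feedback can't starve the big executions */
);
```

Both hints have costs, recompiles burn CPU and a grant floor can waste memory on small executions, so treat this as a diagnostic experiment rather than a permanent fix.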

Query store shows a stable plan and low average duration. Oh, average, you say. But users still report random slowness.

What does query store not tell us that could help explain this? Well, it’s maybe not what query store doesn’t tell you. And even the query store GUI is not guilty of this.

But, my friend, you’re telling me about low averages, but have you looked at mins and maxes? Because average is nice, but mins and maxes can show you outliers in data that averages tend to lose. So, if you look at the average and it’s like, let’s say, 100 milliseconds.

And you look at the min and it’s like 10 milliseconds. But then you look at the max and it’s like 15 seconds. Well, you’ve got a bit of investigating to do.

This, again, could be a parameter sensitivity situation. This could be a blocking situation. If you look at the max CPU and the max duration, and the max CPU is still pretty low, but the max duration is still pretty high, then there’s something else going on, right?

There’s something else amok with this query, right? It could be getting blocked. It could be waiting on other resources. Sometimes there’s all sorts of things that you must investigate and uncover. And, as always, my rates are reasonable.
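If you’d rather pull mins and maxes yourself than trust the GUI, a rough sketch against the Query Store catalog views might look like this (durations there are stored in microseconds):

```sql
/* Find queries whose max runtime dwarfs their min: averages hide these. */
SELECT TOP (20)
    qt.query_sql_text,
    rs.count_executions,
    avg_ms     = rs.avg_duration / 1000.,
    min_ms     = rs.min_duration / 1000.,
    max_ms     = rs.max_duration / 1000.,
    max_cpu_ms = rs.max_cpu_time / 1000.
FROM sys.query_store_runtime_stats AS rs
JOIN sys.query_store_plan AS p
    ON p.plan_id = rs.plan_id
JOIN sys.query_store_query AS q
    ON q.query_id = p.query_id
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
ORDER BY rs.max_duration - rs.min_duration DESC;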

So, if you’re out there listening, boy, boy, boy, boy, can I investigate. I fixed parameter sniffing with Dynamic SQL. High five.

You and me, drinks anytime you want. But now CPU is higher overall. Did I actually improve performance or just change the failure mode? I don’t know.

How high is CPU? Like, what’s higher, right? Is it, like, did it go from, like, 5% to 7%? Did it go from 5% to 15%?

In either case. So, here’s the thing with CPU being higher overall. Up to a point, that doesn’t tend to bother me. There’s a lot of people in the world who will brag, who will put on, like, a top hat and wear, like, fancy ribbons and tell you that their SQL Server CPU was constantly at 10%.

And to them, I say, why are you overpaying Microsoft? Why is that good? You out of your mind?

Like, how many cores do you have? You know, it’s like, what, $2,000 to $7,000 a core? You’re only using 10% of that? 90% of your money to Microsoft is wasted.

So, you know, parameter sniffing is one of those things where, you know, it can go in the direction of, you know, you have, like, a tiny little serial plan. And it blows up when it has to process a lot of rows. And you can also have a situation where, like, you know, you get, like, a big parallel, like, hash join crazy plan.

And all of a sudden, you start running out of server resources. Like, you might start waiting on memory via resource semaphore. Or you might start waiting on CPU via thread pool.

So, there are two ways that can go. But, you know, I think you’ve balanced things a bit, right? Because you are probably getting easy little serial plans where appropriate.

And you are probably getting larger parallel plans where appropriate. And things are just sort of balanced out a bit. So, you have, like, queries that are processing more data using more CPUs, right?

They’re going parallel. They are using DOP CPUs and doing things. And you have your smaller single-threaded queries off doing their thing. But most likely, you have improved the situation overall for the people who now get parallel plans and use more CPU and push CPU higher because they are probably no longer waiting a very, very long time for a very, very slow serial execution plan to finish.

So, I think that you have done a good thing. I think that you have done a smart and reasonable thing. And I think that you should probably not worry about, unless CPU, unless by, like, CPU is higher overall, you mean, like, now CPU is constantly at, like, 95%, you’ve probably done a smart thing.
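For anyone wondering what that dynamic SQL fix usually looks like, here’s a hedged sketch of one common pattern, with a made-up procedure and row-count threshold: vary the query text slightly so big and small callers compile separate cached plans:

```sql
CREATE OR ALTER PROCEDURE dbo.GetPostsByUser
    @OwnerUserId integer
AS
BEGIN
    DECLARE @sql nvarchar(MAX) = N'
    SELECT p.Id, p.Score, p.Title
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId';

    /* Append a harmless comment so each "size class" of parameter
       hashes to its own plan cache entry. The 100k threshold is made up. */
    IF EXISTS
    (
        SELECT 1 FROM dbo.Posts AS p
        WHERE p.OwnerUserId = @OwnerUserId
        GROUP BY p.OwnerUserId
        HAVING COUNT_BIG(*) > 100000
    )
        SET @sql += N' /* big */';
    ELSE
        SET @sql += N' /* small */';

    EXECUTE sys.sp_executesql
        @sql,
        N'@OwnerUserId integer',
        @OwnerUserId;
END;
```

The pre-check adds a little work per call, so you’d only do this when the big/small plan gap is painful enough to pay for it.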

All right. sp_WhoIsActive? I know that guy.

He owes me money. It shows short blocking chains, but users still experience big delays. How can blocking be minimal, but user latency still be high? That’s a dense one.

It sets the mind wandering in several different directions. You know, blocking is not the only thing that causes delays. If you want to continue to be database focused, you know, to me, I’m thinking, short blocking chain, like, is it 20 queries piling up for, I don’t know, 10 milliseconds, 20 milliseconds, three seconds?

I don’t know, I’m just having a hard time here, like, okay, sp_WhoIsActive shows short blocking chains, but, man, there’s so many other things that can slow a query down, right? Even if you stay in the database, those short blocking chains might not be the things that users complain about, right? There’s all sorts of other things that might be slow.

And once you get outside the database, you know, you start thinking about your app servers, and what they do when they receive data, and all sorts of other stuff. You know, sp_WhoIsActive is great at showing you what’s happening in the moment. But perhaps you need to broaden your view a little bit, take a step back from just what’s happening right now, and look at the server a little bit more overall. If your server has been up for a reasonable amount of time, and by that I mean a Goldilocks-zone amount of time, like, if it’s been up for 100 or so hours, looking at aggregated wait stats can be very, very useful. If you’ve been up for 3,000 hours, the wait stats picture becomes a bit muddy. You could try persisting wait stats and other things off to tables with a little timestamp in there, every five, ten minutes or so, and figuring out how wait stats are changing.

And you could probably figure out a lot more from that than just running sp_WhoIsActive quickly in the moment. If you’ve got Query Store on there, you can always look in there, using my stored procedure, not the GUI, because the GUI is an atrocity. The GUI is something that you would design if you didn’t like someone.

You can use my stored procedure, sp_QuickieStore, to find queries that are slow, that happen during periods of time when users are typically complaining about things. That’s sort of what I’d go for there. You might even want to look at capturing the blocked process report and/or the XML deadlock report and see how much action there is in there.

Because you might be surprised that the short blocking chains that you see with sp_WhoIsActive might just be part of the picture. They might not be the whole picture. There might be a whole lot more going on in your server that you could delve into and use to complete your analysis a little bit more thoughtfully.
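A minimal sketch of that wait-stats-persistence idea, with a made-up table name, might look like this; you’d run the insert from an Agent job every five or ten minutes and diff consecutive snapshots:

```sql
/* Snapshot table for periodic wait stats collection. */
CREATE TABLE dbo.wait_stats_history
(
    collection_time datetime2 NOT NULL DEFAULT SYSDATETIME(),
    wait_type nvarchar(60) NOT NULL,
    waiting_tasks_count bigint NOT NULL,
    wait_time_ms bigint NOT NULL,
    signal_wait_time_ms bigint NOT NULL
);
GO

/* Run this on a schedule; the counters are cumulative since startup,
   so the interesting number is the delta between two collections. */
INSERT dbo.wait_stats_history
    (wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms)
SELECT
    dows.wait_type,
    dows.waiting_tasks_count,
    dows.wait_time_ms,
    dows.signal_wait_time_ms
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_time_ms > 0;
```

Subtracting each row from the prior snapshot of the same wait_type tells you which waits spiked during the window when users were complaining.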

Anyway, that’s probably good there. Uh, that wraps up the 51st episode of Office Hours. Um, as promised, I said absolutely nothing and made no fanfare about episode 50 because it’s not that big a deal.

Like episode 100, maybe. I’ll wear like a hat or something, like a party hat. Maybe I’ll, maybe I’ll, maybe I’ll live stream episode 100 because I have all this stuff now I can do.

Anyway, that’s cool. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I will see you in tomorrow’s video, where we will talk more about our new AI overlords and SQL Server. Anyway, thank you and goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Vector Index Preview Feature Limitations



Summary

In this video, I delve into the limitations of vector index preview in SQL Server 2025, sharing insights from my course “Get AI Ready with Erik.” I highlight key constraints such as read-only tables, restrictions on inserts and updates, and the necessity for an integer primary key. Additionally, I discuss practical strategies for managing data across multiple tables to accommodate these limitations, emphasizing the importance of planning ahead to ensure smooth operations when using vector indexes in your database environment.

Topics

`SQL Server 2025`, `Vector Indexes`, `Preview Feature`, `Limitations`, `AI in SQL Server`, `Data Modification`, `Read-Only Tables`, `Partitioning`, `Replication`, `Integer Primary Key`, `Clustered Index`, `Embeddings`, `Azure SQL Database`, `Stale Vector Index`, `Disabling Vector Indexes`, `Table Management`, `Vector Search`, `Vector Distance`, `Roadmap`, `SQL Server 2025 Features`

Chapters

  • *00:00:00* – Introduction
  • *00:00:30* – Vector Index Preview Limitations
  • *00:01:00* – No Inserts, Updates or Deletes Allowed
  • *00:01:30* – Table Read Only
  • *00:02:00* – Partitioning Not Supported
  • *00:02:46* – Replication to Subscribers Not Supported
  • *00:03:08* – Integer Primary Key Required
  • *00:03:48* – Azure SQL Database Regions
  • *00:04:15* – Disabling Vector Indexes
  • *00:04:49* – Managing Data with Vector Indexes
  • *00:05:22* – Combining Searches
  • *00:07:17* – Stone Age of Vector Indexes
  • *00:08:33* – Microsoft’s Priorities

Full Transcript

Erik Darling here with Darling Data. Here to try to continue to educate you about AI and the vector stuff in SQL Server 2025. This is all, you know, small snippets of material from my course, Get AI Ready with Erik. You can buy that for a hundred bucks off with the coupon code up here. And when you do, you support me doing menial things in life, paying rent, cable bills, cell phone bills, nothing fun, right? No joy, no joy, just AI. But we’re going to talk about vector index preview feature limitations in this one, because they are significant, and it’s hard to imagine releasing them in the current state as generally available. Like, this is just the stuff that we know about. This is not all the stuff behind the scenes that, you know, probably still needs tidying up and fixing. But just to zoom in on some of these things. When you create a vector index on a table, there are no inserts, updates, or deletes allowed. This is probably the number one reason that vector indexes are preview only. You know, we didn’t have the preview feature switch when columnstore indexes came out, and they made the table read only, too. So if we did, maybe that would have been something. But if you try to do anything, you will get this funny error: data modification statement failed, because table post embeddings has a vector index on it. So no inserts, updates, or deletes for you. Also, straight from the Microsoft docs: obviously, the table is read only, so that stinks. And you can’t partition tables that have vector indexes on them. I don’t know if I care so much about this, right? Partitioning is a data management feature that everyone seems to confuse for a performance feature. And yeah, I just don’t care.
You know, honestly, if that kept people from asking questions about partitioning their tables, I’d be thrilled. Another one that I kind of don’t care about: no replication to subscribers. This is capital-R replication, not availability groups, log shipping, or mirroring; this is merge or transactional replication. You can’t replicate vector indexes to subscribers. Again, if this gets people away from replication, all in favor of it, right? Yeah, what a nightmare.

An integer primary key is required. And this is a single column. I think this one is probably the funniest one to me, because Microsoft is almost self-limiting you in this way. Like, you can’t use a bigint, right? Anyone whose tables are like, wow, this table is going to get big, we need a bigint for this; Microsoft is almost setting a 2 billion row limit: past this, I can’t make any promises, right? That’s a funny one. So it has to be an integer, and it has to have a clustered primary key on just the integer. You can’t have any composite keys. You can’t, you know, have your clustered primary key on a GUID or something like that.

It has to be the integer. That’s an amusing one. I don’t know the story behind that. But, you know, maybe they’re trying to save you from yourself. And of course, if you drop the index and you insert data into the table, you have to rebuild it if you want to start using it again; that one’s fairly straightforward. Azure SQL Database, at least in some regions, is where the DiskANN vector indexes have rolled out; as of this date, not all Azure regions support this.
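To make the required shape concrete, here’s a hedged sketch based on the preview documentation; the table and index names are made up, and the exact option names may shift while the feature is in preview:

```sql
/* The shape the preview demands: a single-column integer clustered
   primary key, plus a vector column to index. */
CREATE TABLE dbo.PostEmbeddings
(
    Id integer IDENTITY PRIMARY KEY CLUSTERED, /* must be int, single column */
    PostId bigint NOT NULL,
    Embedding vector(1536) NOT NULL
);

/* Preview syntax; once this index exists, the table becomes read-only. */
CREATE VECTOR INDEX vi_PostEmbeddings
ON dbo.PostEmbeddings (Embedding)
WITH (METRIC = 'cosine', TYPE = 'diskann');
```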

They do have a setting called allow stale vector index that allows writes, but the index will be stale until you rebuild it. This is not available in SQL Server 2025. You can’t currently disable a vector index, right? You have to drop it. If you try to disable the vector index, you will get this error that says one or more of the specified alter index options is unsupported for a vector index. So no disabling vector indexes. So if you have a system where you need to add data regularly to a table, you’ll probably want to keep embeddings in a separate table from, you know, your transactional stuff, where people need to do things so that you may continue to make money off them.

You know, you might even set it up. So like, you have two tables. One of them is for, you know, like rows that you are actively, you know, like have like an active vector index on them. And another table that gets like new records that you have to like, you know, that you haven’t put into the vector search table yet, because you have to sort of like batch that, right? You would have to set up a process where, you know, like it’s some cadence that makes sense to you, you would have to drop the vector index.

You know, you’d either generate embeddings for new records as they come in, or already have them in that second table; put the newer embeddings into the table that you want to have the vector index on, and then rebuild the vector index. Then you could combine searches, like a union all query where one side does the vector search and the other side does the vector distance search on the unindexed data. The downside there is that if you have sort of hot vector data, new records wouldn’t be searchable with vector search until you batch them into the indexed table; you would have to continue to use vector distance for those.

But the general pattern that you would want to follow is to combine them. Some people will pretty commonly have, not capital-P partitioned, but sort of an archival table that has much older stuff in it, and a newer table for newer data, maybe even three tables, where you have archived data, lukewarm data, and hot data.

And you’ll just union all those together, with constraints on each, so that if there’s a date-based search, SQL Server knows which one to go to. You could do something sort of similar with your indexed vector data and your new vector data, where you just union all the two things together: well, this one I can vector search; this one I need vector distance for, because I don’t have an index on it yet.

And I can’t use vector search without a vector index. And I can’t just put the data directly into the table with the vector index on it because the vector index makes a table read only. You would just have to sort of find a way to combine both of those, which, you know, right, right now is not a great story.
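Here’s a hedged sketch of that union all pattern, using the preview syntax as I understand it; the table names are made up, and the VECTOR_SEARCH signature may change before general availability:

```sql
/* Approximate search over the indexed table, exact distance over the
   small "hot" table that hasn't been folded into the index yet. */
DECLARE @q vector(1536); /* populate with the query embedding */

SELECT TOP (10) x.PostId, x.distance
FROM
(
    SELECT t.PostId, s.distance
    FROM VECTOR_SEARCH
    (
        TABLE = dbo.PostEmbeddings AS t,
        COLUMN = Embedding,
        SIMILAR_TO = @q,
        METRIC = 'cosine',
        TOP_N = 10
    ) AS s

    UNION ALL

    SELECT h.PostId,
           distance = VECTOR_DISTANCE('cosine', h.Embedding, @q)
    FROM dbo.PostEmbeddingsHot AS h
) AS x
ORDER BY x.distance;
```

The exact-distance side stays cheap only while the hot table stays small, which is why the batching cadence matters.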

And that’s why vector indexes are in preview. They’re not, you know, they’re not in the generally available category because they’re sort of a nightmare, right? Like it makes a lot of processes not fun to use.

There’s a lot of feature interoperability that just isn’t great with stuff. So, like, I get it. I get why Microsoft made the choice.

But, you know, you know, vector indexes will probably be nice someday. But for now, like we’re looking at like the stone age of vector indexes in SQL Server. And, you know, I think what’s what’s really depressing about it, you know, from like someone who cares about the database product quite a bit is there’s like no roadmap.

There’s no timeline. There’s no like, you know, it’s just sort of like, oh, we’ll get to it when we get to it. Like, stay away.

It’s, you know, you can’t get an answer from anyone on anything. And, you know, you can’t help but feel that Microsoft’s priorities in other areas are taking away from what was trotted out by all sorts of folks at the company to be the flagship feature of SQL Server 2025. In everything that we saw, the words SQL Server were barely in the marketing header.

Yeah, it was SQL Server 2025, but it was all Fabric, AI, ground to cloud, whatever. And you’re like, okay, well, where’s the database?

Isn’t that what we care about? Isn’t that what we’re here for? So I don’t know. It would be cool if people were a little bit more transparent about when these things might stop sucking. But for now, we must embrace the suck, for the suck has embraced... the warm embrace of suckiness is what we have.

Anyway, that’s probably enough here. Thank you for watching. I hope you enjoyed yourselves.

I hope you learned something and I will see you. Oh, I don’t know when I when I see you. Who knows? Maybe I’ll just quit. Maybe that’s enough of this. All right.

Thank you for watching.


Introducing Performance Studio: A Cross-Platform Environment for SQL Server Query Plan Analysis


Performance Studio

Stop clicking through SSMS execution plans like it’s 2005.
Performance Studio is a free, open-source plan analyzer that tells you what’s wrong,
where it’s wrong, and how bad it is — from the command line, a desktop GUI,
an SSMS extension, or an AI assistant.

Built by someone who has stared at more execution plans than any reasonable person should.

Windows x64 • macOS (Apple Silicon & Intel) • Linux x64 • .NET 8 • MIT License

Performance Studio — Actual Execution Plan with Plan Insights

Need Expert Help?

Darling Data offers SQL Server performance consulting and training.

Learn More

Who This Is For

If you’ve ever said “I think it needs an index” while pointing vaguely at a Hash Match,
or if your idea of plan analysis is “that arrow looks kind of wide,” this tool is for you.

🧑‍💻

DBAs Who Tune for a Living

You already know what a Hash Match spill is. You just want something that finds them faster than scrolling through plan XML in Notepad++.

🔬

Developers Who Suspect the Database

Your app is slow. You think it’s the queries. Run them through Performance Studio and find out in seconds instead of hours.

💼

Consultants & Contractors

Batch-analyze a client’s worst queries, generate text or JSON reports, and look like a genius before your first meeting ends.

This Is Not a Monitoring Tool

Performance Studio analyzes execution plans. It doesn’t collect wait stats over time, monitor CPU trends,
or send you email alerts at 3 AM. If you want that, check out
SQL Server Performance Monitor — also free, also open source.

What It Catches

Feed it a query plan and it tells you what’s wrong — with severity levels,
operator node IDs, and enough context to act immediately.

💥

Memory Grant Issues

Flags queries granted 8 GB of memory that used 200 MB. Also catches hash, sort, and exchange spills to TempDB with severity based on volume.

📊

Row Estimate Mismatches

Finds operators where the optimizer estimated 1 row and got 2,889. Because “roughly one” and “almost three thousand” are not the same number.

🔍

Missing Indexes

Extracts SQL Server’s index suggestions with ready-to-run CREATE INDEX statements and impact percentages. Copy, paste, done.

🎯

Parameter Sniffing

Compares compiled vs. runtime parameter values so you can see exactly when the optimizer made a plan for the wrong data.

🔀

Parallelism Problems

Detects thread skew — one thread doing all the work while seven others sit idle — plus ineffective parallelism and serial plan reasons.

🚫

Anti-Patterns & More

OPTIMIZE FOR UNKNOWN, NOT IN with nullable columns, leading wildcards, implicit conversions, scalar UDFs, key lookups, late filters, nested loop concerns, and more.

30 analysis rules run against every plan — covering memory, estimates, indexes, parallelism, joins, filters,
functions, parameters, compilation issues, and common T-SQL anti-patterns.
Each rule can be disabled or have its severity overridden via config file.

Four Ways to Use It

Performance Studio meets you where you are — whether that’s a terminal, a GUI,
SSMS, or a conversation with an AI.

Command Line

CLI Tool

Analyze saved .sqlplan files, capture plans from live servers, or batch-process entire folders of queries. JSON output for automation, text output for humans.

# Analyze a saved plan
planview analyze my_query.sqlplan --output text

# Capture from a live server
planview analyze --server sql2022 --database AdventureWorks \
  --query "SELECT * FROM Sales.SalesOrderHeader"

# Batch-process a folder
planview analyze ./queries/ --output-dir ./results/

Desktop App

Graphical Plan Viewer

SSMS-style operator icons, cost percentages, row counts, and warning badges. Click any operator for full properties. Zoom and pan. Dark theme, because you have taste.

  • Plan Insights — runtime summary, missing indexes, wait stats at a glance
  • Plan Comparison — side-by-side cost, runtime, I/O, and memory diffs
  • Query Store — fetch top queries by CPU, duration, reads, or memory
  • Copy Repro Script — extracts parameters, SET options, query text into runnable sp_executesql
  • Advice for Humans — one-click text report you can read or paste into a ticket
SSMS Extension

Right-Click → Analyze

Adds “Open in Performance Studio” to the execution plan context menu in SSMS 18–22. Right-click a plan, click the button, and the full analyzer opens with your plan loaded. No exporting to XML. No file dialogs.

Auto-detects SSMS 21 and 22. Installs into both. Path is saved so you only configure once.

AI Integration

MCP Server Built-In

Built-in Model Context Protocol server with 13 tools for plan analysis and Query Store data. Point Claude Code or Cursor at it and ask questions in plain English.

Analyze Plans Compare Plans
Missing Indexes Query Store Top N

Localhost only. Disabled by default. We’re paranoid so you don’t have to be.

What the Output Looks Like

Real output from a query against StackOverflow2013 on SQL Server 2022.
Every warning tells you what’s wrong, where it is, and how bad it is.

The CLI produces text for humans and JSON for automation.
The GUI shows the same data in a graphical plan tree with clickable operators.
Both give you missing index CREATE statements you can run immediately.

Batch mode generates three files per query: the raw .sqlplan XML, a structured JSON analysis, and a human-readable text report.

Plan: 04_comment_heavy_posts.sqlplan
Runtime: 4551ms elapsed, 15049ms CPU
Memory: 8,022,664 KB granted, 2,514,944 KB used

Warnings:
[Critical] Large Memory Grant: 7835 MB granted

Operator warnings:
[Critical] Parallelism (Node 0): Est 1, actual 2,889
[Critical] Sort (Node 1): Est 1, actual 2,889
[Warning] Sort: Thread 1 processed 100% of rows
[Warning] Filter (Node 2): Late filter

Missing indexes:
dbo.Posts (PostTypeId) INCLUDE (Score, Title)
Impact: 74%
dbo.Comments (PostId)
Impact: 19%

Summary: 8 warnings (4 critical), 2 missing indexes

See It In Action

Query Editor with syntax highlighting

Query Editor — syntax highlighting, SQL completion, one-click plan capture

Side-by-side plan comparison

Plan Comparison — side-by-side cost, runtime, I/O, memory, and wait stat diffs

Advice for Humans text report

Advice for Humans — one-click text report ready to read or share

Query Store integration

Query Store — fetch top queries by CPU, duration, reads, or memory

30 Analysis Rules

Every rule includes severity (Info, Warning, Critical), the operator node ID,
and enough context to act on immediately.

Memory & Spills

  • Large memory grants
  • Grant vs. used ratio
  • Hash & sort spills
  • Exchange spills

Estimates & Stats

  • Row estimate mismatches (10x+)
  • Zero-row actuals
  • Row goals
  • Compilation issues

Indexes & Lookups

  • Missing index suggestions
  • Key & RID lookups
  • Scan with residual predicates
  • Implicit conversions

Patterns & Smells

  • Parameter sniffing
  • Scalar UDFs (T-SQL & CLR)
  • OPTIMIZE FOR UNKNOWN
  • Leading wildcards, late filters

Platform Support

Downloads are self-contained. No .NET SDK required. Extract the zip and run.

| Feature | Windows | macOS | Linux |
| --- | --- | --- | --- |
| Desktop GUI | ✓ | ✓ | ✓ |
| CLI (planview) | ✓ | ✓ | ✓ |
| SSMS Extension | ✓ (SSMS 18–22) | | |
| Credential Store | ✓ Credential Manager | ✓ Apple Keychain | Coming soon |
| MCP Server | ✓ | ✓ | ✓ |
| Query Store Integration | ✓ | ✓ | ✓ |

Getting Started

Download & Run

  1. Grab the zip for your platform from GitHub Releases
  2. Extract the zip. No installer. No .NET SDK. Self-contained.
  3. Open a .sqlplan file or connect to a server
  4. Read the warnings. Fix the queries. Look like a hero.

CLI Quick Start

  1. planview analyze my_plan.sqlplan
  2. Add `--output text` for human-readable output
  3. Use `--server` and `--query` to capture live plans
  4. Store creds securely: `planview credential add my-server --user sa`

GitHub Sponsors

If this tool saves you time, consider sponsoring development on GitHub.

Become a Sponsor

Need Expert Help?

Darling Data offers SQL Server performance consulting and training.

Learn More

Ready to Stop Guessing?

Download Performance Studio and start understanding your execution plans today.
Free, open source, and built by someone who actually tunes queries for a living.

Download Performance Studio

Get AI-Ready With Erik: Vector Index Intricacies


Summary

In this video, I delve into the fascinating world of vector indexes in SQL Server, a topic that might seem a bit dry but is incredibly powerful for certain types of data analysis and search operations. Vector indexes are fundamentally different from traditional B-tree indexes; they create a graph where each vector (a series of floating point numbers) becomes a node, and edges connect similar vectors. This unique structure allows for faster searches by navigating the graph rather than scanning every single vector, making it particularly useful for tasks like content similarity search. I walk you through how vector search works using SQL Server’s preview function, demonstrating its efficiency compared to exact vector distance calculations. Along the way, we explore concepts like recall and see firsthand the trade-offs between exact and approximate searches. To give you a practical example, I run a demo comparing the results of an exact search with those from vector search on a sample dataset, highlighting both their similarities and differences. If you’re curious about diving deeper into AI and SQL Server 2025, be sure to check out my course “Get AI Ready with Erik,” where you can learn more advanced techniques and get hands-on experience.

Topics

`SQL Server`, `Vector Indexes`, `B-Tree Indexes`, `Columnstore Indexes`, `Graph Search`, `Greedy Search Algorithm`, `Vector Distance`, `Recall (Information Retrieval)`, `AI in SQL Server`, `Post Embeddings`, `Docker Container Networking`, `Exact Search`, `Approximate Search`, `Vector Search Function`, `SQL Server 2025`

Chapters

  • *00:00:00* – Introduction
  • *00:00:31* – Vector Indexes vs Columnstore
  • *00:01:02* – Graph-Based Structure of Vector Indexes
  • *00:01:50* – Searching the Graph
  • *00:02:19* – Greedy Search Algorithm
  • *00:03:08* – Life Metaphor for Search Strategy
  • *00:03:47* – Vector Search Function Overview
  • *00:04:32* – Efficiency of Vector Search
  • *00:05:12* – Non-Sargable Predicate Comparison
  • *00:06:03* – Recall in Approximate Searches
  • *00:07:28* – Critique of Microsoft’s Efforts
  • *00:08:45* – Exact vs Vector Search Example

Full Transcript

Erik Darling here with Darling Data, here to talk to you about some boring stuff about vector indexes, because they’re not like regular indexes, right? It’s not like a B-tree index at all. It’s a completely different structure, which is, you know, why, I believe, much in the way that, you know, columnstore, well, actually, no, I’m lying. Columnstore is much closer to a normal index than a vector index is, but the only reason that I bring that up is obviously because when Microsoft first released columnstore indexes, they struggled mightily with the columnstore index making the table that it was created on read-only. We’re going to talk more about this other stuff, but currently, vector indexes do that, but vector indexes are not like B-tree indexes in that the way that data is searched and sort of, you know, written out is a lot different, right? It’s not a B-tree where you have pages just sort of linking to each other and you can seek and, you know, do all this other stuff within it. It’s a completely different sort of structure, which is also probably why Microsoft is struggling so mightily with getting the creation of them to be fast, because…

Anyway, let’s get on with things here. DiskANN indexes build a graph, basically, where each vector, right, each vector that you have, and those are, you know, a series of floating point numbers, is a node in the graph, and then edges connect sort of similar vectors. So you can sort of seek within that, or, like, seek around that graph, but it’s not, it’s not really the same thing.

Whenever you search, you navigate the graph instead of scanning everything. So like that, like, like using the vector search function, which is in preview, you know, like, you can, you can seek within that rather than like, you know, like, like a vector distance, you have to basically like scan everything, measure the distance and then, like spit out whatever, you know, like the distance there and like any filtering that you apply or any ordering that you do as a result of that sort of runtime calculation. It’s not stored anywhere.
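That scan-everything approach — run the distance function against every stored vector, sort, take the closest — is easy to sketch outside the database. A rough Python illustration of the idea (the vectors, names, and cosine-distance helper here are all made up for the example; this is not SQL Server’s implementation):

```python
import math

def cosine_distance(a, b):
    # 1 minus cosine similarity; lower means more similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def exact_top_n(query, vectors, n):
    # Scan every vector, measure its distance, keep the n closest.
    # Cost grows linearly with the number of stored vectors.
    scored = sorted(vectors.items(), key=lambda kv: cosine_distance(query, kv[1]))
    return [key for key, _ in scored[:n]]

vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
}
print(exact_top_n([1.0, 0.05], vectors, 2))  # ['a', 'b']
```

The important part is the `sorted(...)` over everything: exact search cannot skip any row, which is exactly why it slows down as the table grows.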

But vector, vector search uses, or vector indexes use sort of a greedy search, which is a problem solving strategy that sort of, like, I guess like the premise of it is, like, if you make the best local choice, every time you connect, like similar vectors, like if you make the best local choice at each step, then the hope is that it will lead to the best sort of like global solution. So it’s sort of like, if you do everything right in life, I mean, look, you’re still going to die, but maybe some good stuff will happen to you along the way.

You know, more, more likely, you know, you’re going to watch like crappy musicians get rich and famous and like terrible actors make millions and millions of dollars and you’re just going to, you know, work your butt off and have to watch training videos about AI and SQL Server. But the search algorithm that gets used is called the greedy search, right? I’m always searching for new ways to be greedier because I hear that the greedier you are, the more money you make.

So I’m always just trying to figure out how can I be greedier? So far, it hasn’t worked, right? But you start at an entry point and you look at like, like you get to a node and you’re like, well, like, like this is like, like, like fan out and look at, look at all the neighbors of that node.

And then you move to whatever neighbor is the closest to your query, right? So like, if you have like, like an 0.5 here, you’d be like, well, what are the closest to this? Like 0.6, 0.7, 0.8, 0.9, you’re like, ah, 0.6 is the closest I’ll go to you.

And then you kind of repeat until you don’t find a closer neighbor. At which point you might backtrack and try alternative paths to see if you turn something, if you find something better, find a better path through the graph. And then you return the best candidates that showed up in there.
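That loop — start at an entry point, fan out to the current node’s neighbors, hop to whichever one is closest to the query, stop when nothing improves — can be sketched like so. This is a toy illustration of greedy search, not DiskANN itself; the graph, vectors, and distance function are invented for the example, and real implementations also backtrack, which this sketch skips:

```python
def greedy_search(graph, vectors, distance, query, entry):
    # graph: node -> list of neighbor nodes; vectors: node -> stored vector
    current = entry
    best = distance(query, vectors[current])
    while True:
        improved = False
        # Fan out: check every neighbor of the current node.
        for neighbor in graph[current]:
            d = distance(query, vectors[neighbor])
            if d < best:
                best, current, improved = d, neighbor, True
        if not improved:  # no neighbor is closer to the query: stop here
            return current, best

# Toy 1-D example: "distance" is just absolute difference.
vectors = {"a": 0.5, "b": 0.6, "c": 0.9, "d": 0.95}
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
dist = lambda q, v: abs(q - v)
print(greedy_search(graph, vectors, dist, 0.93, "a"))  # walks a -> b -> c -> d
```

Making the best local choice at each hop is what lets the search skip most of the graph, and also why the answer is only approximately the nearest neighbor.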

Why is it faster? Because you don’t examine every single vector, right? Like when you use vector search, like the function vector search, SQL Server is able to sort of like look at a vector and figure out what is closest to it and like move to that rather than just like, you know, like running the vector distance on everything, figuring out what that distance is and then going on with it. You can think of the vector searching as always being like a non-sargable predicate, where, like, if you were to say, like, DATEDIFF, like, the difference in days between two columns, like, is greater than four.

SQL Server doesn’t know any of that ahead of time. It has to run that function for every row that you want to compare, like figure out what the difference in days between two date columns is. And then it can figure out like if it meets that, if those rows meet that predicate, you can’t do it.

Like it doesn’t know any of that ahead of time unless you create a, create a computed column and do all the other stuff. So vector search is faster because the graph will guide the search to other relevant things that might, that are similar to it. Search time with vector distance, like I showed you in another video, gets slower as your data gets bigger because you have more things to compare and figure out the distance between.

In the AI world, there’s a concept of recall. And recall is, like, what fraction of true neighbors did an approximate search find. Because vector search is an approximate search, whereas vector distance is like an exact, like an exact search. So you can think of that as sort of like if an exact search, like, went through all of the neighbors and it found like ABCDEFGHIJ.

Like a, like a DiskANN, like a vector index search, might find like ABCDEFGHXY. Recall is like how much of the sort of approximate search, like, matches what an exact search would find. So in that case up above, where only the last two are different, IJ versus XY, the recall would be 80%.
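That recall arithmetic is simple enough to check in code. A minimal sketch (generic Python, nothing SQL Server specific; the helper name is invented here):

```python
def recall(exact_results, approx_results):
    # Fraction of the exact search's results that the
    # approximate search also managed to find.
    exact, approx = set(exact_results), set(approx_results)
    return len(exact & approx) / len(exact)

# The example above: both searches agree on A-H, differ on the last two.
print(recall("ABCDEFGHIJ", "ABCDEFGHXY"))  # 0.8
```

The same arithmetic applies to the demo later in the video: 16 overlapping rows out of a top 20 is also 80% recall.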

Microsoft research reports 95%-plus recall on billion-point benchmarks using their DiskANN indexes. Which, you know, I mean, it’s good, right? 95% plus.

Great. You know, it’s just, you, you wish that they were generally available. You wish they didn’t make the tables you create them on read only. You would, might even wish that creating them didn’t take the gargantuan effort that it does. You might even say, I don’t know, maybe, maybe put the fabric down and dedicate some engineers to this thing that seems important.

I don’t know. Right? Stop, stop fussing about with these gag gifts to the world.

Like, no one needs fabric. We have Databricks, we have Snowflake, we have other things that already do this job. Right? Showing up late to the party with your pants off.

Anyway, uh, let’s give ourselves a single query vector. And that single query vector is going to represent the search phrase, Docker container networking. Right?

So this is the vector that we care about. Uh, using exact search, right? With this vector distance function. We’re going to find the top 20 rows, uh, that, um, that have the, the closest distance, right? So low, again, lower number better.

We’re gonna find the top 20 closest matches to, uh, to, uh, Docker, Docker container networking in the post embeddings table. And then we’re going to use vector search down here. And I’m going to talk more about vector search, but we’re going to use vector search here to, um, see how close or see how much we get on that.

Cause from this one, right, we’ve got the top 20 by exact search here. We’re going to get the top 20. That’s this thing here.

Top n equals 20. Uh, so we’re going to get the top 20 rows from this, uh, that come out of this function. Right? So this is only going to return 20 rows. And since we’re dumping it into a temp table, we don’t need to worry about that one. This one, we were saying, give us the top 20 ordered by like, which ones are the closest neighbors.

So using vector search, we can do sort of the same thing, right? Where we hit the post embeddings table. We look at the embedding, uh, we tell, or rather we tell it which column to use here.

Uh, for some reason you can’t alias this thing. Um, we’re going to, we’re going to say, we want it to be similar to the vector embedding that we found before using the cosine metric and give us the top 20 rows from that. And if we look at, uh, what came back from those, what we’re going to see is of course, the exact, well, I mean, I say of course, but I say of course, because I’ve done this demo before, but, um, you know, both of these things found 20 rows, right?

But only 16 of those rows overlapped, meaning that, you know, there is a difference in the search results between the exact search and what vector search found. If you want to find out what that difference is, I would highly recommend you buy my course, Get AI Ready with Erik, which, if you use this coupon, will get you a hundred dollars off the price of admission. That link is down in the video description.

You can click on this fully assembled pre pre-made link for you and you can, and you can buy it and you can, you can learn all sorts of additional things about AI and SQL Server 2025. All right. Thank you for watching.

I hope you enjoyed yourselves. I hope you learned something and I will see you in tomorrow’s video where we will do, oh, I don’t know, something equally vector-y and search-y. All right. Goodbye.

 

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

What Do You Call A Join That Isn’t A Semi Join?


There’s a gap in SQL Server database terminology that’s been bugging me as long as I’ve been teaching people about databases.

You’ve got semi joins, and you’ve got anti-semi joins.

Everyone (loosely, very loosely, everyone) knows what those do: Find a match, or confirm there isn’t one. Lemon-squeezey.

The engine doesn’t need to fully articulate the relationship if you have a one-to-many or many-to-many relationship between two tables.

Proof of existence (or non-existence) is enough. Short-circuit, move on. Once and one time only, to biblically know me, in the back of my go-cart.

Then you’ve got INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN. Further, there’s SQL Server’s APPLY, and the ANSI-friendly LATERAL join.

These are different from the semi-variety.

They don’t just check for a match, they enumerate every single matching row combination (and preserve non-matching, varying with the degree of left/right/full outer-ness, of course).

For the extra-pedantic reader, I’m aiming for somewhere in the “punch” and “children” region of the taxonomy mnemonic, here.

  • One-to-many? You get every matching row from the many side, hooray!
  • Many-to-many? This is where you start reaching for DISTINCT and wondering what happened to your data model life.
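If it helps to see the distinction outside the database: a semi join only proves existence, while the enumerate-everything variety fans out every matching combination. A toy Python sketch (the helper names and sample rows are invented for illustration):

```python
def semi_join(left, right, key):
    # Existence check only: each left row appears at most once,
    # no matter how many right rows match it.
    right_keys = {key(r) for r in right}
    return [l for l in left if key(l) in right_keys]

def omni_join(left, right, key):
    # Enumerate every matching row combination (an inner join).
    # One-to-many fans out; many-to-many multiplies.
    return [(l, r) for l in left for r in right if key(l) == key(r)]

users = [("u1", "Erik"), ("u2", "Kendra")]
posts = [("u1", "post A"), ("u1", "post B")]

print(semi_join(users, posts, key=lambda row: row[0]))   # [('u1', 'Erik')]
print(len(omni_join(users, posts, key=lambda row: row[0])))  # 2 - u1 fans out
```

Duplicate the matching key on both sides and the omni version multiplies into four pairs, while the semi version still returns each left row once — which is the whole reason the category needs its own name.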

So what do you call that category?

Not “full join”!

That’s a specific join type and using it as a category name is asking for a bad, confusing time. Like being a teenager, complete with a room full of adults being puzzled by your very presence in the world.

What I’ve come to discover in some research: Nobody ever really settled on a term.

The database theory world assumed inner/outer joins were the default, named the exception (semi joins), and called it a day. It’s sort of like if weekdays were just called days and weekends were called end-days (speaking of end-days, does anyone know when we can look forward to those?).

Brainstorming Gone Bad


That’s not good enough for me, so I started thinking through options.

  • Fanout Join
  • Enumerating Join
  • Multiplicative Join
  • Exhaustive Join (four ideas in, I was exhausted, clearly)

They all capture something useful, but they’re all longer and less clear than “semi join.”

Two or three syllables too many, usually. Syllables are also exhausting.

Worse, they’re clunky. #@^%ing clunky, Jimmy.

“Semi” works because it’s a snappy Latin prefix that means exactly what it says, and leaves plenty of room for immaturity amongst friends.

You need something equally snappy that means the opposite.

Tight, snappy, punchy, catchy. That’s what we’re known for here at Darling Data. Hello, sp_HumanEventsBlockViewer.

The problem is there’s no natural single-word English antonym for “semi” that prefixes onto “join” without sounding weird.

Yes, I consulted a thesaurus, that’s also where I got “antonym” from. Thanks for noticing.

Already overloaded by academia:

  • Whole Join
  • Complete Join

Embarrassing:

  • All Join

These sound like breakfast cereals for people having trouble getting one out the door.

This is unfortunate and undesirable.

At Darling Data, we try to avoid unfortunate and undesirable, that’s why we don’t work with MySQL.

Back To Latin 101


If “semi” came from Latin, then the answer probably lives there too.

Of course I had to look these all up, which was a tremendous mental strain. You’re welcome.

  • omni — all, every
  • pleni — full
  • toti — whole
  • holo (Greek) — whole

There’s a clear winner: Omni.

  • Same syllable count as semi (confirmed by clapping)
  • A prefix (again, loosely) people already know
  • Semantically accurate (enough for me, anyway)

You’re finding all matches, not just proof of life. Or death. Or afterlife.

And! It doesn’t conflict with any existing SQL Server database terminology.

Semi join and Omni join


Those belong together.

  • Same Latin family
  • Definite antonyms
  • You can put them on a slide and the audience gets the contrast

The database field never really bothered to name the category, because it was always the default.

But the default still needs a name when you’re explaining optimizer behavior, execution plan operators, or why your query just multiplied itself into sweet, sweet oblivion.

Enumerating joins, multiplying joins, set-based joins, Cartesian-derived joins, and result joins have all been used somewhat randomly in various database literature and documentation.

But those are unfortunate and undesirable too. Clunky, Jimmy.

Omni join!

You heard it here first.

Spread the word, far and wide. Like a monorail.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.