DiskANN Vector Index Improvements
Chapters
- *00:00:00* – Introduction
- *00:00:30* – Problems Solved
- *00:01:00* – Read-Only Issue Fixed
- *00:01:30* – Testing and Availability
- *00:02:00* – Index Creation Speed
- *00:02:52* – Migrating Existing Indexes
- *00:03:23* – Query Syntax Improvements
- *00:03:44* – Top Syntax Enhancements
- *00:04:07* – Approximate Top Function
- *00:04:31* – Optimizer Choices
- *00:05:00* – Exciting Optimizations
- *00:05:14* – Future Outlook
- *00:05:29* – On-Premise Availability
- *00:05:43* – Conclusion
Full Transcript
Erik Darling here, Darling Data. And I finally have something to be excited about in the vector area. It would figure that I just finished wrapping up my YouTube exposé on vector search in SQL Server 2025, saying that if Microsoft doesn’t make some fixes here, I don’t know where this story’s going. But lo and behold, at SQLCon last week in Atlanta, it was announced that a lot of the problems I had with DiskANN indexes are gone now. So congratulations, and a round of applause to everyone who worked on that. This is wonderful news, because now Microsoft actually has a pretty good story around vector search in SQL Server that it just didn’t have before.
There were two main things. One, when you added a vector index to a table, the whole table became read-only. That has been fixed now. People can write to your tables and do normal stuff, and the vector index doesn’t stop that. That is fantastic news. This feature finally has a strong pair of legs under it.
They’ve also done some other stuff along the way, which I’ll get to. I haven’t had a chance to test any of this out. It’s rolling out pretty slowly to some of the Azure regions, and I’m using my robot friends to probe them, but I haven’t found one where this is available yet. Maybe it’s still a little too soon, or maybe I just missed it. I don’t know. You can never trust those robots. They’re kind of lazy sometimes. They’re like, “Yeah, I checked all that. Sorry, nothing there.” And you’re like, “But I see it.” And they’re like, “Oh, sorry. I missed that one.”
But anyway, some of the other cool stuff that they did was speed up the creation of vector indexes. If you remember some of my videos where I showed you how slow it was, and the insane amount of code that ran behind the scenes, apparently that’s all gone. Again, I have not tested it yet, so I don’t know how big the improvement is, or whether that weird code still runs and just runs faster now. We’re going to wait and see. But it seems like, fundamentally, the way vector indexes get created now is totally different in the storage engine and behind the scenes, and there aren’t 3,000 lines of strange code with bizarre USE HINTs running. So this is very good news for us here in vector land.
I guess there’s an important note about migrating existing indexes, so if you were crazy enough to use a preview feature and create indexes, read the warning there. Of course, as soon as I start recording this, it becomes the noisiest day in the world. I had a plane fly by, there’s ambulances going. I can’t win sometimes.
But the other thing they did that I think is really cool is, let me scroll down to this part, the query syntax and filtering bits. If ZoomIt will cooperate. Somebody give me Mark Russinovich’s number so I can file a complaint about ZoomIt. It used to be that when you wrote a query like the one on, well, I guess further right, you sometimes had to ask for a much higher top N number, because you didn’t know how many things it would find. So if you wanted the top 20 from the outer query, but you asked for the top N in the inner query, you might not get as many rows back from the inner query as you asked for.
And so your outer top 20 would not be 20. So you sometimes had to ask for a top N of 100 or 200 in order to make sure you got 20 back. But all that has apparently been improved. The TOP syntax has apparently been extended, so TOP WITH APPROXIMATE, that’s going to be fun to mess with. I can’t wait to get my hands on that one and see what I can see. I wonder if it’s only applicable with vector searches, or if TOP WITH APPROXIMATE is usable in other, non-vector searches, but we’ll see.
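To make the old over-fetching problem concrete, here is a minimal Python sketch rather than T-SQL, since the new syntax is not confirmed in this transcript. It assumes a toy table of 1,000 rows, a one-dimensional integer "embedding", and a category filter applied after the inner search. The names `inner_search` and `outer_query` are hypothetical and are not SQL Server's implementation.

```python
# Toy model: each row has a 1-D integer "embedding" and a category.
# All names here are hypothetical; this is not SQL Server's implementation.
rows = [{"id": i, "embedding": i, "category": "ABCDE"[i % 5]} for i in range(1000)]

def inner_search(query, top_n):
    """Stand-in for the inner vector search: the top_n rows nearest to query."""
    return sorted(rows, key=lambda r: abs(r["embedding"] - query))[:top_n]

def outer_query(query, inner_top_n, wanted, category):
    """Filter the inner results by category, then keep at most `wanted` rows."""
    hits = [r for r in inner_search(query, inner_top_n) if r["category"] == category]
    return hits[:wanted]

# Asking the inner search for exactly 20 leaves too few rows after filtering.
short = outer_query(500, inner_top_n=20, wanted=20, category="A")
# Over-fetching 100 rows is what it takes to get 20 matching rows back.
padded = outer_query(500, inner_top_n=100, wanted=20, category="A")
print(len(short), len(padded))  # prints "4 20"
```

In this toy data one row in five matches the filter, so the inner search has to return about five times as many rows as the outer query wants. That is the guesswork the extended TOP syntax is said to remove.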
Maybe that’s said in the post. I don’t know. I haven’t read all of it too closely. I just got so excited. But anyway, if ZoomIt will unzoom now that I’m done with you. Thank you. Apparently there’s also some cool optimizer stuff in here, where the optimizer will choose, hello, ZoomIt, between when to do a vector search and when to do an exact search, based on, I guess, various factors. So again, very good job, everyone who worked on this. This is very exciting stuff for those of us who have an interest in vector search in SQL Server 2025. Well, I guess not just 2025.
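For readers unsure what "exact search" means here, the following Python sketch shows a brute-force nearest-neighbor search next to a toy rule for picking a strategy. The transcript does not say which factors the optimizer weighs, so the row-count cutoff `EXACT_SCAN_ROW_LIMIT` and the function `choose_strategy` are assumptions invented for illustration, not Microsoft's logic.

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def exact_search(candidates, query, k):
    """Exact k-nearest-neighbor: score every candidate, sort, and keep k."""
    return sorted(candidates, key=lambda v: cosine_distance(v, query))[:k]

# Hypothetical cutoff, purely an assumption made for this sketch.
EXACT_SCAN_ROW_LIMIT = 10_000

def choose_strategy(candidate_rows):
    """Toy rule: a brute-force scan is cheap when few rows need scoring."""
    return "exact" if candidate_rows <= EXACT_SCAN_ROW_LIMIT else "approximate"

vectors = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (-1.0, 0.0)]
nearest_two = exact_search(vectors, (1.0, 0.1), k=2)
print(nearest_two)  # prints "[(1.0, 0.0), (0.7, 0.7)]"
print(choose_strategy(len(vectors)), choose_strategy(5_000_000))  # prints "exact approximate"
```

An exact search always returns the true nearest rows but has to score every candidate. An approximate index search skips most rows and can miss a few. That trade-off is why a choice between the two could be worth making per query.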
I suppose it’s all in Azure as well. It’s not being a huge, disappointing stink bomb. So this all looks great to me. It all sounds great to me. As soon as I get my hands on it and get to start messing with it, I will of course report back. And, what was the other thing? I tried to ask about when this might make its way to us earthly denizens who still use on-prem SQL Server, and what cumulative update it might land in, but I’m not sure on that yet. So, anyway, exciting news. Very happy about this. Again, good job to all involved, and I cannot wait to get my hands on it.
Alright. Thank you for watching.