Get AI-Ready With Erik: Vector Index Preview Feature Limitations

Get AI-Ready With Erik: Vector Index Preview Feature Limitations


Summary

In this video, I delve into the limitations of vector index preview in SQL Server 2025, sharing insights from my course “Get AI Ready with Erik.” I highlight key constraints such as read-only tables, restrictions on inserts and updates, and the necessity for an integer primary key. Additionally, I discuss practical strategies for managing data across multiple tables to accommodate these limitations, emphasizing the importance of planning ahead to ensure smooth operations when using vector indexes in your database environment.

Topics

`SQL Server 2025`, `Vector Indexes`, `Preview Feature`, `Limitations`, `AI in SQL Server`, `Data Modification`, `Read-Only Tables`, `Partitioning`, `Replication`, `Integer Primary Key`, `Clustered Index`, `Embeddings`, `Azure SQL Database`, `Stale Vector Index`, `Disabling Vector Indexes`, `Table Management`, `Vector Search`, `Vector Distance`, `Roadmap`, `SQL Server 2025 Features`

Chapters

  • *00:00:00* – Introduction
  • *00:00:30* – Vector Index Preview Limitations
  • *00:01:00* – No Inserts, Updates or Deletes Allowed
  • *00:01:30* – Table Read Only
  • *00:02:00* – Partitioning Not Supported
  • *00:02:46* – Replication to Subscribers Not Supported
  • *00:03:08* – Integer Primary Key Required
  • *00:03:48* – Azure SQL Database Regions
  • *00:04:15* – Disabling Vector Indexes
  • *00:04:49* – Managing Data with Vector Indexes
  • *00:05:22* – Combining Searches
  • *00:07:17* – Stone Age of Vector Indexes
  • *00:08:33* – Microsoft’s Priorities

Full Transcript

Erik Darling here with Darling Data. Here to try to continue to educate you about AI and the vector stuff in SQL Server 2025. This is all, you know, small snippets of material from my course, Get AI Ready with Erik. You can buy that for a hundred bucks off with the coupon code up here. And when you do, you support me doing menial things in life, paying rent, cable bills, cell phone bills, nothing fun, right? No joy, no joy, just AI. But we’re going to talk about vector index preview feature limitations in this one. And because they are significant, and it’s hard to imagine, like, it’s hard to imagine releasing them in the current state as generally available. Like, these are just, this is just the stuff that we know about. This is not all the stuff behind the scenes that, you know, probably still needs tidying up, fixing all the other stuff. But just to sort of zoom in on some other things. When you create a vector index on a table, there are no inserts, updates, or deletes allowed. This is probably the number one reason that vector indexes are preview only. You know, like, we didn’t have the preview feature switch when columnstore indexes came out and like, you know, they were like, oh, these make the table read only two. And we didn’t have that. So if we did, maybe that would have been something. But if you try to do anything, you will get this funny error data modification statement failed, because table post embeddings has a vector index on it. So no inserts, updates or deletes from you. So just from the Microsoft docs, you know, obviously, tables read only, so that stinks. You can’t, like you can’t partition tables at vector indexes on. I don’t know if I care so much about this, right? Like, partitioning is a data management feature that everyone seems to confuse for a performance feature. And yeah, I just I just don’t care. You know, if it honestly, if that kept happening, and people were just stopped asking questions about partitioning their tables, I’d be I’d be thrilled. Another one that I kind of don’t care about no replication to subscribers, replication. This is like capital R replication, not like availability group, like, you know, log shipping, mirroring, stuff. This is like, you know, merge or transactional replication, you can’t replicate vector indexes to subscribers. Again, if this gets people away from replication, all in favor of it, right? Yeah, what a nightmare.

An integer primary key is required. So this is this is like for us like a single column. So like I think this one is probably the funniest one to me. Because Microsoft is almost like self limiting you in this way. Like, like, like, like, you can’t use like a big int, right? Like anyone who’s table like tables like are they’re like, wow, this table is going to get big, we need a big int for this. Microsoft’s almost like 2 billion rule limit, pass this, I can’t make any promises, right? Like that’s, that’s, that’s a funny one, right? So it has to be an integer has to that has to have a cluster primary key on just the integer. You can’t have any composite keys. You can’t like, you know, have your cluster primary key on a GUID or something like that.

Or something like that has to be the integer. That’s an amusing one. I don’t know the story behind that. But you know, maybe maybe they’re trying to save you from yourself. And of course, if you drop the index and you insert data into the table, you have to rebuild it if you want to start using it again, that one’s fairly straightforward. Azure SQL database, at least in some regions, where the vector, the disk and vector indexes have rolled out to they, as of this, as of this date, not all Azure regions support this.

They do have a setting called allow stale vector index that allows rights, but the index will be stale until you rebuild it. This is not available in SQL Server 2025. You can’t currently disable a vector index, right? You have to drop it, right? If you try to disable the vector index, you will get this error that says one or more of the specified alter index options is unsupported for a vector index. So no disabling vector indexes. So if you have a system where you need to add data regularly, where like, you know, to a table, you’ll probably want to keep embeddings in a separate table from like, you know, your transactional stuff where people need to do things so that you may continue to make money off them.

You know, you might even set it up. So like, you have two tables. One of them is for, you know, like rows that you are actively, you know, like have like an active vector index on them. And another table that gets like new records that you have to like, you know, that you haven’t put into the vector search table yet, because you have to sort of like batch that, right? You would have to set up a process where, you know, like it’s some cadence that makes sense to you, you would have to drop the vector index.

You know, like either, you know, you’ve already either generate embeddings for new records or already have them like in like a second table in like put the new put the like newer embeddings into the vector into the table that you want to have the vector index on and then rebuild the vector index. Then you could, you know, you could combine searches like, like one, like, you know, like, so like a union all query where one of them does the vector search, the other one does the vector distance search on the unindexed data. So the downside for that is if you are like, if you have like sort of like hot vector data, you know, like new records wouldn’t be searchable with vector search until you, until you like batch them into the new table, you would have to continue to use like the vector distance to do that.

But the general pattern that you would want to follow is combine them. Right. So almost like, you know, like some people like some people will pretty commonly have like, like, like, like, not really like partitioned with capital P partitioned, but they’ll have like sort of like an archival table that has like much older stuff in it. And they’ll have a newer table for like newer data, maybe even like three tables where it’s like, you know, you have like archived and then like, you know, like lukewarm data and then like hot data.

And you’ll just like union all those together with like constraints on each so that like if there’s a date based search, SQL Server knows which one to go to. But you could you could do something sort of similar with your like, you know, not archived, but like indexed vector data and your new vector data where you just sort of union all two things. And just like, well, this I can vector search this I need to vector distance because I don’t have an index on that yet.

And I can’t use vector search without a vector index. And I can’t just put the data directly into the table with the vector index on it because the vector index makes a table read only. You would just have to sort of find a way to combine both of those, which, you know, right, right now is not a great story.

And that’s why vector indexes are in preview. They’re not, you know, they’re not in the generally available category because they’re sort of a nightmare, right? Like it makes a lot of processes not fun to use.

There’s a lot of feature interoperability that just isn’t great with stuff. So, like, I get it. I get why Microsoft made the choice.

But, you know, you know, vector indexes will probably be nice someday. But for now, like we’re looking at like the stone age of vector indexes in SQL Server. And, you know, I think what’s what’s really depressing about it, you know, from like someone who cares about the database product quite a bit is there’s like no roadmap.

There’s no timeline. There’s no like, you know, it’s just sort of like, oh, we’ll get to it when we get to it. Like, stay away.

It’s, you know, you can’t get an answer from anyone on anything. And, you know, you can’t help but feel that Microsoft’s priorities in other areas are taken away from what what was trotted out by all sorts of folks at the company to be like the flagship feature of SQL Server 2025. Whenever everything that we saw, like the word SQL Server was barely in like the marketing header.

It was like it’s like like, yeah, it’s like SQL Server 2025. But it was like fabric, AI, ground to cloud to whatever. And you’re like, OK, well, where’s where’s the database?

And then didn’t we? Isn’t that what we care about? Isn’t that what we’re here for? So I don’t know. It would be cool if, you know, we had people were a little bit more transparent about when these things might stop sucking. But for now, we must embrace the suck for the suck has embraced the warm embrace of suckiness is what we have.

Anyway, that’s probably enough here. Thank you for watching. I hope you enjoyed yourselves.

I hope you learned something and I will see you. Oh, I don’t know when I when I see you. Who knows? Maybe I’ll just quit. Maybe that’s enough of this. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.