Get AI-Ready With Erik: A Little About Generating Embeddings
Video Summary
In this video, I dive into the world of vector columns and embeddings in SQL Server 2025, Azure SQL Database, and Azure SQL Managed Instance. I explore the challenges and considerations involved when adding a vector column to an existing table, such as the need for indexing and potential performance impacts. To illustrate these concepts, I walk through a demo table called `post_embeddings_demo` and demonstrate how to use Ollama, a free and user-friendly tool, to generate embeddings for text data. By the end of this video, you’ll understand the process of backfilling your tables with vector data and appreciate the significant storage requirements involved. Whether you’re looking to enhance your SQL Server skills or are simply curious about integrating AI into your database operations, there’s plenty here to learn. Don’t forget to check out my training site for the full course, Get AI Ready with Erik, where you can get a $100 discount using the coupon code AIREADY.
Full Transcript
Erik Darling here with Darling Data, returning once more into the vector void to try to sell this course. Well, to try to help the community learn more about vectors in SQL Server 2025, Azure SQL Database, and Azure SQL Managed Instance. And to sell this course, which you can buy from me now.
At the low, low price of $100 off with the coupon code AIREADY. The course is, of course, called Get AI Ready with Erik. It is available over on my training site.
This is just a small snippet of teaser material from the aforementioned course that will hopefully spur you into buying it. So you can show this to your boss and say, this man is trustworthy and will teach me all about AI and SQL Server. AI, right?
Anyway, got stuff to do, don’t we? Now, I’m going to be completely honest with you. You most likely, in your heart of hearts, in your brain of brains, your guts of guts, know that adding a vector column to an existing table is probably not a great idea.
Because not only do you have to add this column to the table, which is usually not a big deal since there are all sorts of metadata-only things that make that cheap, but now you have to fill that column up. And if the table that you are attaching this column to is of any importance, you’re going to have a couple of issues with it. One, the indexing for this table may need to start taking this column into account in some way.
Of course, you cannot have a vector column in the key of an index, but the includes might come into play at some point. Two, your table is about to get a lot bigger. Not only that, but you have to batch fill your table with this vector data.
If it’s a large table, you certainly don’t want to try to fill it all in one shot. So most likely what you will want to do is create a sort of lookup table for this data, where it can remain unattached from the rest of the data that you actually care about, but that you can still access fairly easily.
So in our case, what I have done is create a sort of side piece table called `post_embeddings_demo`. There’s an actual full post embeddings table in the database that I distribute for this course, but I use the one ending in `_demo` so I can show you what backfilling embeddings looks like.
So, this table is pretty simple. We’ve got an ID column. I mostly have this here because in later modules, I guess we’ll call them (they’re still YouTube videos, but in the real course they’re modules, so very, very fancy), we do explore vector indexes.
And one requirement for vector indexes is to have an integer ID column as the clustered primary key for the table. Otherwise, you cannot create a vector index. You get all sorts of errors, and you can’t have a bigint.
Can you imagine? It’s damn 2026 and you can’t have a bigint. Who designed this? Oh yeah.
God. Anyway, we’ve got some other stuff going on here. You know, just sort of normal things, as well as a fabulous foreign key that points back to the post table, for referential integrity and all that good stuff.
So here is our vector column up here, right? Here is our embedding column, and it is going to be a `vector(1024)` of floats. The reason it’s going to be a `vector(1024)` of floats is the embedding model that I am currently using to generate embeddings for the stuff that we’re generating embeddings for, right?
So what we’re going to do is look at what generating embeddings looks like, right? We’re going to use this demo table to do that.
And what we’re going to start with is embedding the titles, right? So we’re going to start taking the text in these titles and turning it into those crazy arrays of floating point numbers, enclosed in square brackets, that mean things to computers, right? That mean things to the AIs, that allow them to say, is this similar?
Hmm. Hmm. Let me see which direction this arrow points in. So that’s what we’re going to be doing, right? And to do that, we are going to use the wonderful and fabulous Ollama, because it is free and it is easy.
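To make the "which direction does this arrow point in" idea a bit more concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three-number vectors and their names are made up for illustration; real embeddings from the model used in this video have 1024 numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 3-dimension "embeddings" (assumption: purely illustrative values).
center_a_div = [0.9, 0.1, 0.0]
align_a_div = [0.8, 0.2, 0.1]
bake_a_cake = [0.0, 0.1, 0.9]

print(cosine_similarity(center_a_div, align_a_div))  # close to 1: similar direction
print(cosine_similarity(center_a_div, bake_a_cake))  # close to 0: different direction
```

Arrows pointing nearly the same way score near 1, and unrelated ones score near 0, which is the sense in which two pieces of text are "similar."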
And one reason why it is particularly good for this course material is because, A, it’s free and easy, but also, if you were to use something else like OpenAI, like if you’re in Azure or something and you’re like, I want to use OpenAI. Let’s line Sam’s pockets with more crap. He needs four more Lamborghinis.
I don’t know. You could use that, but then you would have to pay per API call, right? You don’t have to do that with the local thing. So what we’re going to do is look at running a Python script to do the embedding.
Now, there are ways to generate embeddings locally in SQL Server, which we’ll get to, where you can say, hey, I’m going to use an embedding model, and I’m going to call `AI_GENERATE_EMBEDDINGS` with that model. And you could do that here, but it’s real slow.
All right. It’s absurdly slow to do that within SQL Server. So I’ve got a little Python script. I don’t know, year 2026, Erik Darling with a Python script.
Who would have guessed it? At least it’s not PowerShell. You can’t get me on that one. I never said I hated Python.
It is sort of like coding with crayons, but you know, at least it’s not PowerShell. So I’ve got my VM over here and, did I install ZoomIt on the VM? No, I didn’t.
That’s okay. It doesn’t matter much. What we have over here is the Ollama server, right? You can see right there, it says `ollama serve`. So this is where Ollama is serving stuff from. And I think Ollama is sending me some subliminal messages, but I’m just too stupid to understand them.
What does it all mean? I don’t know, but this window is not so important, right? This is not the important window.
The important window is kind of over here. So this is the Ollama version that I’m currently running, and this is the model that I pulled down, `mxbai-embed-large`. The `mxbai-embed-large` model is a pretty good model, right?
It’s a stunning model. Great on the runway. And it generates 1024-dimension embeddings. So that is why we have chosen `vector(1024)` for our vector column in SQL Server.
So if we run this and we say `ollama run mxbai-embed-large`, hello world, and we let that hang out for a second.
You’ll see that all sorts of stuff happened over in our Ollama server window. I don’t know what any of this means. This is, I mean, lunacy, right? We’ve got, I don’t know, Bert. I don’t know who Bert is.
We generate 1024 things. And you can see that our context length is 512 tokens right there. So that’s part of the reason why I was griping about the Stack Overflow bodies, right? It’s because they’re long, and you run out of those tokens kind of quick, and we’ve got to think about chunking our bodies.
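Since a 512-token context means long bodies won’t fit in one embedding call, here is a rough sketch of one way to chunk text into overlapping windows. It counts words, which are only a stand-in for tokens, and the `max_words` and `overlap` defaults are guesses for illustration, not values from the course.

```python
def chunk_words(text, max_words=200, overlap=20):
    """Split text into overlapping word windows.

    Words only approximate tokens; max_words is an assumed budget meant
    to stay under a 512-token context, not a measured value.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each chunk would then get its own embedding row. The overlap keeps a sentence that straddles a boundary from being cut off in both chunks.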
I’ve had enough body chunks, man. Anyway. When all this stuff runs and does its, you know, fancy AI things to hello world, hello world turns into this, right?
Look at all these numbers. I guarantee you, if you pause the video and you count as I scroll through, there will be 1024 of them. And these 1024 floating point numbers apparently describe hello world to computers in a way where they can understand if something else is similar to hello world.
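For anyone who wants to do the hello world step from code rather than the command line, here is a minimal sketch that calls a local Ollama server over HTTP and checks that 1024 numbers come back. It assumes Ollama is listening on its default local port and uses the `/api/embed` endpoint; the function names are made up for this example and are not from the course script.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # assumption: default local address
MODEL = "mxbai-embed-large"


def build_request(text, model=MODEL, url=OLLAMA_URL):
    """Build the POST request that asks Ollama to embed one piece of text."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


def parse_embedding(response_body, expected_dimensions=1024):
    """Pull the first embedding out of the JSON response and check its length."""
    embedding = json.loads(response_body)["embeddings"][0]
    if len(embedding) != expected_dimensions:
        raise ValueError(
            f"expected {expected_dimensions} dimensions, got {len(embedding)}"
        )
    return embedding


def embed(text):
    """Call the local Ollama server; requires `ollama serve` to be running."""
    with urllib.request.urlopen(build_request(text), timeout=60) as response:
        return parse_embedding(response.read())
```

With the server running and the model pulled, `embed("hello world")` should return a list of 1024 floats. The length check is there so a model swap doesn’t quietly produce vectors that won’t fit the column.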
Magic, right? Absolute magic. So we’ve got our Python script over here, and our Python script is going to call Ollama, much like we called Ollama for a single thing there.
And it is going to fill 1,000 rows in the `post_embeddings_demo` table with embeddings just like that. Right. Just like this, they’re all going to end up in there. So let’s run this thing and let’s see.
We are fetching up to a thousand questions without embeddings. And now Ollama is real busy over here. Right.
Yeah, “embeddings required but some input tokens were not marked as outputs.” Sure. Look at it go, racing along. Well, you see, even Ollama is not incredibly fast with this. Right.
You know, doing it in SQL Server, I wouldn’t do it for more than maybe five or ten rows in a batch. With Ollama, we can bump things up a little bit, and it moves along a lot faster than SQL Server does.
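The batching idea can be sketched in a few lines. This is not the course script, just an illustration with made-up helper names: one function slices any row source into batches, and another formats an embedding as a JSON array string, the bracketed text form shown earlier in the video.

```python
import json
from itertools import islice


def batches(rows, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def to_vector_literal(embedding):
    """Format floats as a JSON array string, e.g. '[0.5, -1.0, 2.0]'."""
    return json.dumps([float(x) for x in embedding])
```

A backfill loop would walk `batches(rows, n)`, embed each title, and write `to_vector_literal(...)` back as a parameter, committing per batch so a failure only loses one batch of work. The right `n` depends on the hardware, so treat any number as something to test.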
So you can see our thing has finished. We have embedded 1,000 rows. We have no errors, and it took about 40 seconds for a thousand rows. And, you know, this is maybe not the grandest VM in the world, but just kind of keep these speeds in mind when you’re thinking about backfilling your data with stuff.
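As a back-of-the-envelope check on what that speed means, here is a small sketch that scales the 1,000 rows in about 40 seconds up to a bigger table. It assumes throughput stays flat, which it may not on other hardware or with longer text, and the function name is made up for this example.

```python
def estimate_backfill_seconds(total_rows, sample_rows=1000, sample_seconds=40):
    """Scale a measured sample rate up to a full table, assuming a constant rate."""
    rows_per_second = sample_rows / sample_seconds
    return total_rows / rows_per_second


# The full post embeddings table has a little over a million rows.
seconds = estimate_backfill_seconds(1_000_000)
print(f"{seconds / 3600:.1f} hours")  # 11.1 hours
```

At 25 rows per second, a million titles is roughly an overnight job on this VM, which is why the backfill wants to run in batches rather than as one giant statement.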
You know, if you were an actual purchasing person of the course, you would get the Python script that does this. So I don’t know, maybe that would be useful to you. Then again, with AI being what it is, you can probably just tell it to make a Python script to do that for you.
But yeah, mine’s pretty good. I’ve added some stuff to mine. I’ve zhuzhed mine up a bit.
So now if we go look at our embeddings table, right? We have, let’s see, yeah, a thousand rows in there, right? A thousand rows of embeddings.
And when we look at what we’ve got in here, we have stuff from the post table and stuff from the post embeddings table. This is just a small preview of what ended up in there, right? So, all those floating point numbers.
So if we want to know how to horizontally center a div in another div, then this is what some of the floating point numbers that describe how to div a div in a div would look like to a computer. And we can see some information about our embeddings via `sys.columns`, right? We have a `vector_dimensions` column and a `vector_base_type_desc` column, which tell us how many dimensions our column has (1024, which we knew when we created the table) and that the base type is float32.
This might be more useful to you if you were dealing with a system where you were unfamiliar with their current vectorization of things. Anyway, we also have a couple of new functions in SQL Server that SSMS is pretty damn lazy on. Here I am using SSMS 22.
Granted, it’s not the newest 22.1 yet, because I’ve been busy and I can’t upgrade SSMS every two minutes, but this `VECTORPROPERTY` function is completely unrecognized by SSMS. We can still get valid results back from it, though. We still get results. I promise.
Now, if we look a little bit at what the size of things looks like, and hopefully I highlighted that from the correct point. We have 1,000 rows, or rather we have 1 million rows in our table. Well, sorry, let me back up a little bit here.
I’m looking at the real post embeddings table now. I’m not looking at the demo table with just a thousand rows in it. Okay. This is the actual table that I’ve already gone through and embedded everything in, all million embeddings. Right.
Let me clarify that before I make SQL Server look worse than it is. So in our actual post embeddings table, we have a little over a million rows, and the size of our embeddings is about 4.2 gigs. Right. And this is just for the titles and stuff. Right.
So this is, you know, pretty big. You have to do a lot of interesting size forecasting if you want to add this to a database, because it’s a lot going on.
Anyway, we’ll talk more about that later, but that gives us about 4 megs per 1,000 rows with the float32 vector type. So we do have to think a little bit about what data is really meaningful to embed because, you know, it takes up a lot of space.
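The 4 megs per 1,000 rows figure lines up with simple arithmetic: 1024 values at 4 bytes each. Here is a minimal sketch of that sizing math. It counts raw vector values only and ignores any per-row or per-page overhead, so real tables will come out somewhat larger; the function names are made up for this example.

```python
BYTES_PER_FLOAT32 = 4


def vector_bytes(dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of one vector's values, ignoring storage overhead."""
    return dimensions * bytes_per_value


def estimate_megabytes(rows, dimensions=1024):
    """Raw vector data for a number of rows, in decimal megabytes."""
    return rows * vector_bytes(dimensions) / 1_000_000


print(vector_bytes(1024))             # 4096
print(estimate_megabytes(1_000))      # 4.096
print(estimate_megabytes(1_000_000))  # 4096.0
```

That works out to roughly 4.1 GB of raw values for a million rows, in the same neighborhood as the 4.2 gigs measured on a table with a little over a million rows.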
Well, I don’t know, it’s a lot of space. No wonder storage vendors are so excited about vector data types, because they get to sell a lot of diskettes. Anyway, that’s probably good for this one.
Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And again, the entirety of this course is available for sale and purchase at my site here. You will find the full link down in the video description.
If you would like to get a hundred bucks off, you can do that. And you can Get AI Ready with Erik. Alright, thank you for watching.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.




