Sphinx For The Mahogany
All the helper objects for the below demos are available on my GitHub repo.
Video Summary
In this video, I delve into how better indexes can significantly enhance the memory efficiency of SQL Server operations. I walk through two custom views that provide detailed insights into index usage and buffer pool activity. By running various queries and analyzing their impact on memory, I demonstrate how using specific indexes for certain operations can reduce the amount of data read into memory, leading to more efficient use of resources. Whether you’re dealing with a full table scan or filtering by a particular column, understanding these nuances can help optimize your SQL Server environment and improve overall performance.
Full Transcript
Howdy folks, Erik Darling here with Erik Darling Data, now complete with stickers. So I at least can’t close down shop until I’ve given all these stickers out. So you got me around for a little while. And today I wanted to talk to you about something that I think is really interesting in SQL Server and is how better indexes can help you make the most of your SQL Server’s memory. Now I have a couple views that I’ve written. I’m not going to go over the full text of them in the video, but they will be available in the blog post. One of them just kind of gives you a basic view of what indexes you have for, I mean, you can do it for a bunch of tables, but I have it filtered down to just the users table here. So you can see it’s a schema, the table name, the name of any indexes we have on there, how many rows are in the table, how many pages, some information about how big it is, how big the log data is. So good basic information about an index. And that is going to work alongside this other view that I’ve written that will tell you what’s going on in the buffer pool. So like what pages ended up in the buffer pool when you did a particular thing.
Right now we have the entire primary key clustered index for the users table up in the buffer pool. We’ll come back to that. The way this script is generally going to work is I’m going to select everything from the view so that we can see exactly how big and how many pages are in the index. So it’s right at the top. So we always see it. Then I’m going to drop the ever loving crap out of clean buffers and I’m going to clear out the proc cache just to make sure there’s nothing in there.
And to verify that I’m going to run a first select from the memory view. Right. So drop everything out, make sure nothing’s in there. And then I’m going to run some queries in different ways. And after after we’re done with that, we’ll check, we’ll do another check to see what ended up in the buffer pool. So we’re going to go from clean buffer pool, verify that, run a query, see what ended up in the buffer pool afterwards. Good stuff.
The first thing I’m going to run is a select count from the users table. So let’s unquote that. And I’m just going to go hit F5 up here and we’re going to be off to the races. So we have our initial set of information about the index. We have verified that the buffer pool is empty right here. We have our count query that we ran. We have a query plan for it. The, I don’t know why I stuck in there just in case anything happens.
And then we have our select query, right? We can select query from the buffer pool. So now we see that when we did, we ran that count query, we ended up with the, nearly the entire clustered index up in memory. Right. So when we got a count from the clustered index, we needed absolutely everything.
Right. We have 8,505 pages here, 8,505 pages here. We didn’t go and get all that lob data. So one really cool thing about SQL Server is that if you have lob data, but you don’t specifically ask for it, SQL Server doesn’t go run out and find it. They’re not going to read a bunch of lob data into memory if you don’t really need it.
But we can see that we did read everything else up into memory just to get that count. So we know now that if we need to count a whole table and we end up using that clustered index, we need the whole clustered index. But we can do it a little bit differently. Let’s quote this out and let’s run a count for just a single user.
Now the ID column here is the primary key clustered index. So there’s only going to be one user with this ID. And when we go looking to find this user, we’re not going to need to read the entire index in because SQL Server is just going to go get the pages that we need to find this particular user. So now we only end up with 15 pages in the buffer pool and we don’t end up with all 60 some 66 just megs in there.
Right. So there’s 8,505 pages in the index, but we only read 15 in to get that data. So that’s pretty nice. Right. So like SQL Server is also not going to read all of the index in when we don’t need the entire thing. Let’s run a query slightly differently. So I’m going to quote this out. And the first thing I’m going to do is run is I’m going to start using this query now.
So we have a count from the users table where the reputation is greater than or equal to 100,000, but we don’t have an index that helps this query. So when we run this and we get our information back, we’re going to see that once again, SQL Server had to read the entire clustered index up into memory to get the pages that we want to get a count there. This changes when we add an index on the reputation column. So let’s go up here and let’s create this index just on reputation to make our lives easier.
And we’ll rerun that query batch again. And now we can see that for the index information, we have this entry for the nonclustered index that we just created, which is far smaller than the clustered index. Right. It’s only 523 pages versus 8,505 because we can fit a whole lot more of just a single integer on data pages than if you have all the other columns that are in the users table. Now, when we look at what’s in the buffer pool, we only have about 10 total pages from our nonclustered index sitting in there.
So SQL Server is able to say, ah, we have this nonclustered index and I can find this data very easily. I’m only going to get 10 pages from here in order to give you this count. Now, this helps regular count queries too. And I’m going to show you how.
If we quote this back out and we rerun our initial count query, quote that out, run this. We’re going to get our index information again and we will eventually get the remainder of stuff. So there’s our, what’s in the buffer pool? I don’t know.
Somehow that one page stuck around in there. It’s a sneaky clustered index page. But now when we go, when we look at what got read into SQL Server, rather than using the clustered index, used our narrow nonclustered index on reputation.
Right. So we use a smaller index to get a count and we read less into memory. Right. So we rather than using all 8,505 pages of the clustered index, SQL Server was able to get an accurate row count from this nonclustered index, which is only 523 pages.
So smaller indexes can help there. And the right indexes can help SQL Server use memory much more efficiently. We didn’t have, rather than reading an entire index in and saying, I don’t know what we’re going to need. Let’s just get this all up here.
So you will serve. We can go find the right pages and say, hey, you come into memory. The rest of you guys hang out, sit on disk. You do something else for a little while. So pretty cool. I think. Anyway, that was it. I think. Was I anything else? No, I don’t think I had anything else. Cool. All right.
Well, thanks for watching. I hope you learned something. I hope you got some fun stuff out of this. Again, keep an eye out for the blog post. It’ll be out, I don’t know, sometime soon. I’m going to schedule everything today.
And that’s when the full scripts will be available for you to take a look at. Anyway, thanks for watching. And I’ll see you next time. Bye.
Video Summary
In this video, I delve into how better indexes can significantly enhance the memory efficiency of SQL Server operations. I walk through two custom views that provide detailed insights into index usage and buffer pool activity. By running various queries and analyzing their impact on memory, I demonstrate how using specific indexes for certain operations can reduce the amount of data read into memory, leading to more efficient use of resources. Whether you’re dealing with a full table scan or filtering by a particular column, understanding these nuances can help optimize your SQL Server environment and improve overall performance.
Full Transcript
Howdy folks, Erik Darling here with Erik Darling Data, now complete with stickers. So I at least can’t close down shop until I’ve given all these stickers out. So you got me around for a little while. And today I wanted to talk to you about something that I think is really interesting in SQL Server and is how better indexes can help you make the most of your SQL Server’s memory. Now I have a couple views that I’ve written. I’m not going to go over the full text of them in the video, but they will be available in the blog post. One of them just kind of gives you a basic view of what indexes you have for, I mean, you can do it for a bunch of tables, but I have it filtered down to just the users table here. So you can see it’s a schema, the table name, the name of any indexes we have on there, how many rows are in the table, how many pages, some information about how big it is, how big the log data is. So good basic information about an index. And that is going to work alongside this other view that I’ve written that will tell you what’s going on in the buffer pool. So like what pages ended up in the buffer pool when you did a particular thing.
Right now we have the entire primary key clustered index for the users table up in the buffer pool. We’ll come back to that. The way this script is generally going to work is I’m going to select everything from the view so that we can see exactly how big and how many pages are in the index. So it’s right at the top. So we always see it. Then I’m going to drop the ever loving crap out of clean buffers and I’m going to clear out the proc cache just to make sure there’s nothing in there.
And to verify that I’m going to run a first select from the memory view. Right. So drop everything out, make sure nothing’s in there. And then I’m going to run some queries in different ways. And after after we’re done with that, we’ll check, we’ll do another check to see what ended up in the buffer pool. So we’re going to go from clean buffer pool, verify that, run a query, see what ended up in the buffer pool afterwards. Good stuff.
The first thing I’m going to run is a select count from the users table. So let’s unquote that. And I’m just going to go hit F5 up here and we’re going to be off to the races. So we have our initial set of information about the index. We have verified that the buffer pool is empty right here. We have our count query that we ran. We have a query plan for it. The, I don’t know why I stuck in there just in case anything happens.
And then we have our select query, right? We can select query from the buffer pool. So now we see that when we did, we ran that count query, we ended up with the, nearly the entire clustered index up in memory. Right. So when we got a count from the clustered index, we needed absolutely everything.
Right. We have 8,505 pages here, 8,505 pages here. We didn’t go and get all that lob data. So one really cool thing about SQL Server is that if you have lob data, but you don’t specifically ask for it, SQL Server doesn’t go run out and find it. They’re not going to read a bunch of lob data into memory if you don’t really need it.
But we can see that we did read everything else up into memory just to get that count. So we know now that if we need to count a whole table and we end up using that clustered index, we need the whole clustered index. But we can do it a little bit differently. Let’s quote this out and let’s run a count for just a single user.
Now the ID column here is the primary key clustered index. So there’s only going to be one user with this ID. And when we go looking to find this user, we’re not going to need to read the entire index in because SQL Server is just going to go get the pages that we need to find this particular user. So now we only end up with 15 pages in the buffer pool and we don’t end up with all 60 some 66 just megs in there.
Right. So there’s 8,505 pages in the index, but we only read 15 in to get that data. So that’s pretty nice. Right. So like SQL Server is also not going to read all of the index in when we don’t need the entire thing. Let’s run a query slightly differently. So I’m going to quote this out. And the first thing I’m going to do is run is I’m going to start using this query now.
So we have a count from the users table where the reputation is greater than or equal to 100,000, but we don’t have an index that helps this query. So when we run this and we get our information back, we’re going to see that once again, SQL Server had to read the entire clustered index up into memory to get the pages that we want to get a count there. This changes when we add an index on the reputation column. So let’s go up here and let’s create this index just on reputation to make our lives easier.
And we’ll rerun that query batch again. And now we can see that for the index information, we have this entry for the nonclustered index that we just created, which is far smaller than the clustered index. Right. It’s only 523 pages versus 8,505 because we can fit a whole lot more of just a single integer on data pages than if you have all the other columns that are in the users table. Now, when we look at what’s in the buffer pool, we only have about 10 total pages from our nonclustered index sitting in there.
So SQL Server is able to say, ah, we have this nonclustered index and I can find this data very easily. I’m only going to get 10 pages from here in order to give you this count. Now, this helps regular count queries too. And I’m going to show you how.
If we quote this back out and we rerun our initial count query, quote that out, run this. We’re going to get our index information again and we will eventually get the remainder of stuff. So there’s our, what’s in the buffer pool? I don’t know.
Somehow that one page stuck around in there. It’s a sneaky clustered index page. But now when we go, when we look at what got read into SQL Server, rather than using the clustered index, used our narrow nonclustered index on reputation.
Right. So we use a smaller index to get a count and we read less into memory. Right. So we rather than using all 8,505 pages of the clustered index, SQL Server was able to get an accurate row count from this nonclustered index, which is only 523 pages.
So smaller indexes can help there. And the right indexes can help SQL Server use memory much more efficiently. We didn’t have, rather than reading an entire index in and saying, I don’t know what we’re going to need. Let’s just get this all up here.
So you will serve. We can go find the right pages and say, hey, you come into memory. The rest of you guys hang out, sit on disk. You do something else for a little while. So pretty cool. I think. Anyway, that was it. I think. Was I anything else? No, I don’t think I had anything else. Cool. All right.
Well, thanks for watching. I hope you learned something. I hope you got some fun stuff out of this. Again, keep an eye out for the blog post. It’ll be out, I don’t know, sometime soon. I’m going to schedule everything today.
And that’s when the full scripts will be available for you to take a look at. Anyway, thanks for watching. And I’ll see you next time. Bye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
Demoman!
USE StackOverflow2010;
SET NOCOUNT ON;
GO
/*
--Basic index info
SELECT *
FROM dbo.WhatsUpIndexes AS wui
WHERE wui.table_name = 'Users'
OPTION ( RECOMPILE );
GO
--What's in the buffer pool?
SELECT *
FROM dbo.WhatsUpMemory AS wum
WHERE wum.object_name = 'Users'
OPTION ( RECOMPILE );
GO
*/
--CREATE INDEX ix_whatever ON dbo.Users (Reputation);
GO
SELECT *
FROM dbo.WhatsUpIndexes AS wui
WHERE wui.table_name = 'Users'
OPTION ( RECOMPILE );
GO
DBCC DROPCLEANBUFFERS;
CHECKPOINT;
DBCC FREEPROCCACHE;
CHECKPOINT;
GO 5
SELECT *
FROM dbo.WhatsUpMemory AS wum
WHERE wum.object_name = 'Users'
OPTION ( RECOMPILE );
GO
SET STATISTICS TIME, IO, XML ON;
SET NOCOUNT OFF;
--/*Select a count of everything*/
SELECT COUNT(*) AS records
FROM dbo.Users AS u
WHERE 1 = (SELECT 1);
--/*Select a count of one user*/
-- SELECT COUNT(*) AS records
-- FROM dbo.Users AS u
-- WHERE u.Id = 22656
--AND 1 = (SELECT 1);
--/*Select a count of rep > 100k*/
--SELECT COUNT(*) AS records
--FROM dbo.Users AS u
--WHERE u.Reputation >= 100000
--AND 1 = (SELECT 1);
SET NOCOUNT ON;
SET STATISTICS TIME, IO, XML OFF;
SELECT *
FROM dbo.WhatsUpMemory AS wum
WHERE wum.object_name = 'Users'
OPTION ( RECOMPILE );
GO
Greatness, and thanks for including the WhatsUp code 🙂