Moving Indexes To A New Filegroup: Microsoft Still Hates You
At some point you’re going to want to move some indexes to a new filegroup. Maybe you’re separating data across storage, maybe you’re cleaning up after someone who put everything on PRIMARY and walked away, maybe you’ve got your reasons and they’re none of my business.
Whatever the cause, you’d think this would be a solved problem in a database that’s been around since the Clinton administration.
It is not.
How bad it gets depends on what you’re moving. Let’s go from least painful to most painful, because the pain here is instructive.
Normal Indexes
We’ll define normal as an index that isn’t carrying gobs of LOB data around with it. If that’s what you’ve got, life is easier. Not simple, but easier.
The part that surprises people: you cannot just rebuild the index onto the new filegroup. There is no ALTER INDEX REBUILD WITH (MOVE_THIS_SOMEWHERE_USEFUL = ON). That would be too goddamned easy, and we don’t do easy here (unless it’s a RECOMPILE hint).
What you have to do is fully script the index out. Keys, includes, uniqueness, filters, and any particular settings it was created with.
Then you recreate it on the new filegroup with DROP_EXISTING turned on.
CREATE UNIQUE NONCLUSTERED INDEX
whatever
ON dbo.SomeTable
(
column_one,
column_two
)
INCLUDE
(
column_three
)
WHERE column_one > 0
WITH
(
DROP_EXISTING = ON,
ONLINE = ON,
(and all the other stuff you can or might want do, like PAGE compression)
)
ON [NewFileGroup];
Yes, you can make the new FG the default so that you don’t have to worry much about including it on every script, but who knows?
Maybe you created more than one new FG. You’re weird out there. I know you.
Miss a column in the include list, fumble the filter predicate, forget it was unique, and you’ve now changed the index instead of just moving it.
The work isn’t hard, exactly. Plenty of stored procedures and code examples exist to script out all your indexes.
It’s just tedious and unforgiving, which is its own kind of hard. Like a Cormac McCarthy book.
Heaps
If you’ve got heaps, your life is about to get worse.
What sucks is that I typed that and then realized it sounds like something an LLM would say.
Ah, screw it.
You can’t rebuild a heap onto a new filegroup, because there’s no index to rebuild. The data is just sitting there in a pile.
To move it, you have to put a clustered index on the table, which physically relocates the rows to wherever that clustered index lives.
If you’ve been meaning to fix those heaps anyway, congratulations, you get a small hit of satisfaction here. Build the clustered index, leave it, move on with a slightly better schema than you started with.
But if the table is supposed to be a heap, you’ve now got to script out dropping the clustered index you just created. Which turns it back into a heap on the new filegroup. So the move costs you a create and a drop for something that was never supposed to have an index in the first place.
LOB Data
Now we get to the part where I want someone at Microsoft to do this process.
Once. Just once.
There are products that should be experienced by the people who make them, and this is one of them. I get the sense that it often isn’t.
This applies to clustered tables with LOB columns, and it applies to your heaps with LOB data too, because LOB makes everything worse uniformly. Oh, and if you’ve got nonclustered indexes with LOB data in them, well… you, too.
When you do the create-with-DROP_EXISTING dance to move a table, the in-row data moves. The LOB data does not. It just stays where it was, staring at you, refusing to relocate. You can verify this yourself by checking allocation units before and after and watching the LOB_DATA unit sit exactly where it started.
The fix comes from a Kimberly Tripp post that has saved a lot of people a lot of grief over the years (Understanding LOB data). The trick relies on a quirk of how SQL Server handles partitioning: LOB data physically moves when the object transitions from non-partitioned to partitioned, or from one partition scheme to another. So you make the table partitioned, which forces the LOB data to move, even if you have no actual interest in partitioning anything.
The sequence goes like this:
- Create a partition function and a partition scheme.
- Apply the scheme to the table by creating the index on it with DROP_EXISTING. That moves the data onto the scheme.
- Then create the index AGAIN, this time onto the plain filegroup, with DROP_EXISTING once more, which makes the table non-partitioned again and moves everything, LOB included, on your target filegroup.
You read that correctly. It takes two index creates with DROP_EXISTING to move LOB data. The table briefly becomes partitioned for no reason other than to trick the engine into picking up the LOB allocation unit and carrying it along.
CREATE PARTITION FUNCTION pf_temp_move (bigint)
AS RANGE RIGHT
FOR VALUES (9223372036854775807);
CREATE PARTITION SCHEME ps_temp_move
AS PARTITION pf_temp_move
ALL TO ([NewFileGroup]);
/* Move onto the scheme. LOB comes with it. */
CREATE UNIQUE CLUSTERED INDEX
whatever
ON dbo.SomeTable
(some_bigint_column)
WITH
(
DROP_EXISTING = ON
)
ON ps_temp_move (some_bigint_column);
/* Move back onto a plain filegroup. Table is no longer partitioned. */
CREATE UNIQUE CLUSTERED INDEX
whatever
ON dbo.SomeTable
(some_bigint_column)
WITH
(
DROP_EXISTING = ON
)
ON [NewFileGroup];
And here’s the kicker: If the table has nonclustered indexes on it, both of those moves rebuild every one of them. Onto the scheme, then off the scheme.
You are reading that correctly too. Every nonclustered index gets rebuilt twice.
Say, have you been meaning to clean up some indexes for a while?
Picking A Boundary Value
The partition function needs a boundary point. You want a single boundary that sits higher than any value that exists or will plausibly ever exist in your clustering key, so that everything lands in one partition and nothing actually gets split up. You’re not partitioning for real. You just need the engine to think you are.
If you’re clustered on something with a sane data type, this is easy. Use the maximum value for the type:
- int: use the int max,
2147483647 - bigint: use the bigint max,
9223372036854775807 - uniqueidentifier: use
FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF, but read the note below before you trust that - date, datetime, datetime2:
99991231is the standard pick
On the GUID one, be careful. SQL Server does not sort uniqueidentifier values by reading the bytes left to right the way you read them on screen. It sorts on the last group of six bytes first, then works backward through the groups. It’s a genuinely strange ordering and it trips people up constantly. The good news is that the all-F’s GUID still sorts highest no matter how you slice it, because every byte is already maxed out, so FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF is a safe upper boundary. Just don’t go assuming the rest of GUID ordering matches what your eyes tell you.
If you came in from a heap, you’re free to pick whatever column has the most reasonable data type to build your temporary clustered index on. You’re dropping it afterward anyway (maybe), so choose the one that makes the boundary value easy.
And Then There Are The String Clusterers
I see you out there. Clustered on a string because at the time it seemed like a simple thing to do and nobody was around to stop you.
Now you’ve got extra thinking ahead of you. You need a boundary value that sorts higher than every string already in the column, which means working out how many bytes the column holds and then building a value out of enough z’s, or whatever sorts highest under your collation, to clear the top of your data. REPLICATE is your friend here, padding a character out to the column’s length so your boundary outranks everything.
/* A boundary higher than any value in a varchar(50) clustering key */
DECLARE @boundary varchar(50) = REPLICATE('z', 50);
And even that depends on your collation deciding that ‘z’ sorts above whatever garbage is actually in there. Mixed case, accented characters, and case sensitivity all get a vote. So you don’t just get to pick a max value off a chart like the rest of us. You get to go think about collation sort order.
Go to hell, all of you.
Your Online Operation Isn’t As Online As You Think
You set ONLINE = ON, you tell yourself you’re being a responsible adult, and you expect the move to glide along without anybody noticing. Then you watch shoving all that LOB data around generate a shitload of tempdb contention, and suddenly your nice online operation is causing blocking anyway, just through a side door.
So your “online” rebuild is online in the narrow sense that it isn’t holding a long schema lock on the table itself, but it’s lighting up tempdb badly enough that everything else fighting for tempdb pages gets to wait in line behind you. The blocking didn’t go away. It just moved somewhere you weren’t looking. Watch your tempdb allocation page contention while this runs, because that’s where the pain shows up, not on the table you’re moving.
ONLINE Is A Suggestion, Not A Promise
It gets better, by which I mean worse. ONLINE = ON only loosely guarantees that your operation won’t block anything. It is not the iron contract people treat it as.
Kendra Little wrote up a great example of an online rebuild that ran offline and took exclusive locks the whole way through, with no warning and no error (Ugly Bug: SQL Server Online Index Rebuild Sometimes Happens Offline Without Warning).
Her repro used ALTER INDEX REBUILD WITH (ONLINE = ON) on a table that had previously had a LOB column dropped, which leaves the table in a state where the engine falls back to an offline operation and holds X locks the whole way through.
Joe Obbish also wrote a post about something similar on my site, here. Joe’s post is about column store indexes, though. That may not apply to you.
WAIT_AT_LOW_PRIORITY, the thing that’s supposed to be your lord and savior from the schema lock at the end, offered her no protection against those locks. This is the same LOB ghost haunting you from a different room. Maybe under your bed. Maybe in your closet. Maybe in your fridge.
The broader point is the one that matters even when you’re not stepping on that specific bug. Your online index build still has to take its locks, minimal as they’re supposed to be. And if something is already in the way when it goes to take them, your online operation gets blocked. Now it’s sitting there waiting, and everything that shows up behind it gets blocked too, because it’s holding its place in the lock queue while it waits for the lock it needs. One stuck online rebuild turns into a blocking chain, and that chain can sit there for a long, long time while you wonder why a “no downtime” operation took your application down. This also makes you look like an asshole for saying that “I can do this fully online and not cause any blocking, boss”.
There’s also some version voodoo waiting for you here. Because moving an index to a new filegroup forces you into CREATE INDEX with DROP_EXISTING rather than ALTER INDEX REBUILD, your access to WAIT_AT_LOW_PRIORITY depends on your version.
For ALTER INDEX, that option has been around since SQL Server 2014. For CREATE INDEX, the WAIT_AT_LOW_PRIORITY syntax only showed up in SQL Server 2022, along with Azure SQL Database and Managed Instance.
If you’re on 2019 or earlier and doing a filegroup move, the one saving grace you’d reach for to manage the Sch-M lock at the switch-in isn’t available to you, even though the people doing plain in-place rebuilds have had it for years.
This is exactly the situation I wrote ProtectSession for. If you’re going to kick off one of these moves on a busy server, set yourself up to watch for and deal with the blocking it can cause, rather than finding out from the people whose queries are piling up behind it.
The Short Version
Normal indexes cost you a careful script and one create. Heaps cost you a create and a drop (maybe). LOB data costs you a partition function, a partition scheme, two creates, and a rebuild of every nonclustered index twice. And if you clustered on a string, you also get to do collation homework before you can even write the boundary value down.
None of this needed to be this way. But here we are, and the data isn’t going to move itself.
An additional thing to consider: SQL Server has many different build strategies for indexes. It may choose to build indexes on giant tables single-threaded. It may even choose to build all 23 nonclustered indexes on a huge table single threaded while you’re partitioning on/partitioning off to move LOB data.
The story gets even more tawdry and sordid if you’re using an Availability Group in Synchronous mode. You might see a lot of really nasty pile ups on HADR_SYNC_COMMIT. You do have the option of switching to manual failover and asynchronous commit for a bit, but that’s between you and your RPO goals. If you’re moving a significant amount of data, it may be a long wait.
Get into sports, dummy, as a wise man once wrote on a bathroom wall.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.