Make It Or Not
I’m gonna be honest with you, dear reader, because without honesty we’ve got nothing.
Except lies — which you know — those can be comforting sometimes. Hm. I’ll have to think about that one for a bit.
While digging through SQL Server 2022 to find new stuff, this stored procedure caught my eye.
If you try to get the text of it, you get told off. It’s All Internal© as they say on the tubes.
EXEC sp_helptext 'sp_copy_data_in_batches';
Well, okay. But we can try to get it working on our own. Usually I use this method to figure out what parameters a new thing requires to run.
Not this time!
Trial And Error
What I usually do is stick NULL or DEFAULT after the EXEC to see what comes back. Sometimes using a number or something makes sense too, but whatever.
This at least helps you figure out:
- Number of parameters
- Expected data types
- Parameter NULLability
- Etc. and whenceforth
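To make that concrete, here's the kind of probing I mean — feed NULLs in, one more each time, and read whatever errors come back (a sketch, not the exact sequence I ran):

```sql
-- Probe the parameter count and expected data types by passing NULLs
-- and reading the error messages each attempt produces
EXEC sp_copy_data_in_batches NULL;
EXEC sp_copy_data_in_batches NULL, NULL;
```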
Eventually, I figured out that sp_copy_data_in_batches
requires two strings, and that it expects those strings to exist as tables.
The final command that ended up working was this. Note that there is no third parameter at present to specify a batch size.
sp_copy_data_in_batches N'dbo.art', N'dbo.fart';
Path To Existence
This, of course, depends on two tables existing that match those names.
CREATE TABLE dbo.art (id int NOT NULL PRIMARY KEY);
CREATE TABLE dbo.fart (id int NOT NULL PRIMARY KEY);
One thing to note here is that you don’t need a primary key to do this, but the table definitions do need to match exactly or else you’ll get this error:
Msg 37486, Level 16, State 2, Procedure sp_copy_data_in_batches, Line 1 [Batch Start Line 63] 'sp_copy_data_in_batches' failed because column 'id' does not have the same collation, nullability, sparse, ANSI_PADDING, vardecimal, identity or generated always attribute, CLR type or schema collection in tables '[dbo].[art]' and '[dbo].[fart]'.
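Going by what that error message lists, even a nullability difference should be enough to trip the check. Here's a sketch — the table name dbo.smart is made up for illustration:

```sql
-- Hypothetical mismatched target: id is NULLable here,
-- so the definition no longer matches dbo.art exactly
CREATE TABLE dbo.smart (id int NULL);

-- Based on the error text above, this should fail with Msg 37486
EXEC sp_copy_data_in_batches N'dbo.art', N'dbo.smart';
```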
Because GENERATE_SERIES is still a bit rough around the edges, I’m gonna do this the old-fashioned way, which turns out a bit faster.
INSERT dbo.art WITH (TABLOCK)
    (id)
SELECT TOP (10000000)
    id = ROW_NUMBER() OVER (ORDER BY 1/0)
FROM sys.messages AS m
CROSS JOIN sys.messages AS m2;
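For reference, the GENERATE_SERIES version this replaces would look something like the below — same ten million rows, I’m just not vouching for its speed:

```sql
-- The GENERATE_SERIES equivalent (new in SQL Server 2022), shown for comparison
INSERT dbo.art WITH (TABLOCK)
    (id)
SELECT
    gs.value
FROM GENERATE_SERIES(1, 10000000) AS gs;
```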
Behind The Scenes
I sort of expected to run some before and after stuff, and see the count slowly increment, but the query plan for sp_copy_data_in_batches
just showed this:
I’m not really sure what the batching is here.
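One way to at least confirm the rows land — if not how they’re batched — is a plain count on the target afterwards:

```sql
-- Confirm the rows made it over; the batching details stay hidden
SELECT c = COUNT_BIG(*) FROM dbo.fart;
```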
Also, this is an online index operation, so perhaps it won’t work in Standard Edition. If there even is a Standard Edition anymore?
Has anyone heard from Standard Edition lately?
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
You can’t expect to see the magic if you only work with 10M rows at a time
True, not a very impressive table.
Maybe the batch size depends on the column widths?
Looks like this is a helper procedure for converting regular tables into those new Ledger tables that everybody is totally using (right!?). Your results seem to suggest it doesn’t do anything special when the source and target are both regular tables.
https://docs.microsoft.com/en-us/sql/relational-databases/security/ledger/ledger-how-to-migrate-data-to-ledger-tables
Ah, yeah, that page looks like it was published several days after I wrote this. Thanks for showing me!
Append-only tables could end up being really interesting if they unlock new view indexing capabilities (e.g. an indexed max/min should be allowed when updates/deletes aren’t allowed).
I’m still looking for those snapshot views!