Everyone Loves A SQL
Video Summary
In this video, I delve into the intricacies of the `OPTIMIZE FOR SEQUENTIAL KEY` feature in SQL Server, providing a detailed demonstration that showcases its potential benefits. After making some adjustments to my demo script—such as passing parameters instead of literals and increasing the number of threads from 400 to 500—I ran the test with both `OPTIMIZE FOR SEQUENTIAL KEY` off and on. The results were quite telling: when turned on, the operation completed in just 24 seconds compared to around 35 seconds without it. Additionally, I observed significant improvements in metrics like page latch EX and write log, which dropped dramatically. While these metrics are interesting, they may not fully capture the overall impact on performance for more complex scenarios. This video aims to provide insights into when this feature might be beneficial, especially for specific workload patterns, while acknowledging that there’s still much to learn about its full implications in production environments.
Full Transcript
Howdy folks, Erik Darling with Erik Darling Data. Again, to talk a little bit more about that optimize for sequential key thing, because I made some changes and I think I have a better example of how it works. Now, I’m going to run the demo script while I talk about some of the changes. The changes that I made to the demo script itself are, well, I should point out a couple things. One is that I changed my procedure a little bit. Before, I was passing in some literals, and I noticed that sometimes, not every time, but sometimes, I was getting weird, very high compilations. It was literally one for one: every time this thing ran, it compiled. And I wanted to control for that a little bit. So rather than passing in literals, I’m going to pass in some parameters from up here. Those will get set in the script down here. The other thing that I did was boost the threads way, way up. That should actually be 500, not 400. So now I have 500 threads running for 1000 iterations against the table. The table itself hasn’t changed. But let me kick this off. And while this runs: the other change that I made was that I put my VM back on my SSD bank. In my desktop at home, I have an NVMe card and I have an SSD bank.
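As a rough sketch of the literals-to-parameters change described here (the procedure, table, and parameter names below are invented for illustration, not the actual demo script), the idea looks something like this:

```sql
-- Hypothetical sketch; object and parameter names are assumptions.
-- Before: literals baked into every call, which can invite a
-- compile on each execution in some setups.
-- EXEC dbo.insert_row 1, N'some literal';

-- After: values get set as parameters up top in the calling script,
-- then passed through, so one cached plan gets reused.
DECLARE
    @id    integer       = 1,
    @stuff nvarchar(100) = N'some value';

EXEC dbo.insert_row
    @id    = @id,
    @stuff = @stuff;
```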
For some reason, when this VM is on the NVMe card, my write log waits are bizarre. I don’t know if it’s something weird with the way that my CPU talks to the NVMe card. I’m sure Glenn Berry could tell me in about three seconds what’s wrong with my setup. But for some reason, this behaves a little bit better when it’s sitting on the SSDs. I don’t know if that’s a point of interest for anyone. Another thing that I did was change my VM setup a little bit. What I have in my desktop is actually six physical cores, 12 logical with hyperthreading. But I wanted the cores in the VM to match up only to the number of physical cores in the server, so I changed that to six. Now, when I run 500 threads for 1000 iterations against the table over here with optimize for sequential key off, that takes right around 35 seconds down there.
If we come over here and look at what sp_BlitzFirst measured during that time, we have page latch EX at about 7200 seconds of waits, page latch SH at about 3300 seconds, and write log at about 2500 seconds. That’s all cool. I don’t know if that’s good, bad, or ugly for whatever 500 times 1000 singleton inserts is. I have no idea. But I’m going to show you now what happens when we turn optimize for sequential key on, so we hopefully get all the benefit of that feature. I’m going to make sure that table is extra dead and buried, and get this ready to rerun. And to not fully spoil the surprise, I’m not going to change the time over here. We’re just going to hit execute, then come over here and let this run, with 500 threads going on this thing.
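For reference, the switch being flipped here is a SQL Server 2019 index option, which you can toggle on an existing index without rebuilding it. The table and index names below are placeholders, not the demo’s actual objects:

```sql
-- Placeholder names; substitute your own table and index.
-- Turn the feature on for an existing index:
ALTER INDEX pk_the_table
ON dbo.the_table
SET (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON);

-- And turn it back off:
ALTER INDEX pk_the_table
ON dbo.the_table
SET (OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF);
```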
Now, I’m not going to lie. I’ve recorded this several times, and several times the demo gods have bit me in the butt with the timing. But I’m confident this time, because I’ve told the demo gods that I really would just like to go have a glass of rosé on this hot day. And there we go. The gods listened, and they had that thing finished in 24 seconds. So, a full 11 seconds better than with optimize for sequential key off. If we look at our wait stats now, page latch EX is dramatically lower, at around 100 seconds, down from the thousands of seconds we saw before.
Write log is also reduced: it was at about 2500 seconds before, and it’s down to 1900 seconds now. Page latch SH is almost non-existent. So a lot of metrics improved alongside turning that on. I don’t know that those are all metrics I would rely on to tell me if I needed to turn this on, though, or metrics I would keep my eyes peeled for after turning it on in production. There, I might just really want to make sure that I’m getting better throughput on things.
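If you want to eyeball the same wait types without sp_BlitzFirst, here’s a minimal sketch against the server-level wait stats DMV. Keep in mind the numbers are cumulative since startup, so you’d capture a snapshot before and after the test and compare the deltas yourself:

```sql
-- The wait types discussed in the video, from sys.dm_os_wait_stats.
-- Cumulative since instance startup: sample before and after the
-- workload and diff the results.
SELECT
    dows.wait_type,
    wait_time_seconds = dows.wait_time_ms / 1000.,
    dows.waiting_tasks_count
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type IN (N'PAGELATCH_EX', N'PAGELATCH_SH', N'WRITELOG')
ORDER BY dows.wait_time_ms DESC;
```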
Now, that’s tough to measure. It’s easy for me here, because I’m running ostress on my desktop: I can look at the number of threads and the number of iterations and see exactly how long they ran for. In real life, you have to go figure out if you should turn this on, or if turning it on might do something awful to your server, because we still don’t fully know how it works, for various reasons that Microsoft hasn’t told us yet. They probably have some stuff that they want to think about patiently before they write it down.
I imagine it’s going to be a very thoughtful exercise in blogging on their part. But anyway, where was I? I forget. Rosé, glass of, that’s right. Anyway, thank you for watching. I’m going to keep my eye on this feature because I think it holds a lot of promise for some fairly specific workload patterns, but still a pretty cool thing to see Microsoft working on.
Anyway, thanks for watching, and I will see you in another video after several glasses of rosé. Goodbye.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
Will this feature only improve the inserts if the (first column of a) clustered index is a tiny/small/large integer?
It should also work for variations of dates and times.
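To make that concrete, here’s a hedged sketch (table and index names invented) of two clustered keys that both generate ever-increasing values, one an integer identity and one a datetime default, either of which fits the hot-spot insert pattern this feature targets:

```sql
-- Hypothetical tables; the point is that the leading key column
-- increases monotonically, whether it's an integer or a date/time.
CREATE TABLE dbo.ints
(
    id bigint IDENTITY PRIMARY KEY CLUSTERED
        WITH (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON),
    stuff nvarchar(100) NULL
);

CREATE TABLE dbo.dates
(
    inserted_at datetime2 NOT NULL DEFAULT SYSDATETIME(),
    stuff nvarchar(100) NULL
);

CREATE CLUSTERED INDEX c_dates
ON dbo.dates (inserted_at)
WITH (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON);
```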