Quick SQL Server CPU Comparison Tests

Sometimes I have a need to run a quick CPU comparison test between two different SQL Server instances. For example, I might be switching from old hardware to new hardware and I want to immediately see a faster query to know that I got my money’s worth. Sometimes I get a spider sense while working with virtualized SQL Server instances and want to check for problems. Yesterday, I was doing a sort of basic health check on a few servers that I hadn’t worked with much and I wanted to verify that they got the same performance for a very simple query.

The Test Code

To get a single CPU core to 100% within SQL Server with minimal setup, the best method known to me uses a temporary procedure that does a simple operation within a WHILE loop:

CREATE OR ALTER PROCEDURE #p AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @i BIGINT = 0, @time DATETIME2 = SYSUTCDATETIME();
    
    WHILE @i < 1000000
    BEGIN
        SET @i = @i + 1;
    END;
    
    SELECT cpu_time, DATEDIFF(MILLISECOND, @time, SYSUTCDATETIME()) elapsed_time
    FROM sys.dm_exec_requests
    WHERE session_id = @@SPID;
END;

GO

EXEC #p;

The code is extremely efficient (from a certain point of view) and CPU time will closely match elapsed time as as long as there’s no other processes on the lucky CPU that runs my beautiful code. It’s designed to spend as little time on waits and latches as possible. A temporary stored procedure is used to avoid ASYNC_NETWORK_IO waits. Paul White explains why the ASYNC_NETWORK_IO waits appear when the same code is run as part of a batch in this stack exchange answer.

CPU Problems

Going back to the intro, you can probably guess that I did not see identical CPU times when comparing the two servers. Otherwise, there would be no reason to write a blog post. The first server that I tested took about 2400 CPU ms to execute the code. The second server took about 300 CPU ms. There are of course a lot of factors in play here, but I would expect a healthy machine to take somewhere around 200-800 ms of CPU work. I definitely wouldn’t expect an 8X difference for two servers that were supposed to have identical performance!

To be clear, the only thing that you can safely conclude in this kind of situation is that there’s some kind of unknown configuration difference between the two servers. It does not necessarily mean that there’s some kind of severe issue with the server that takes more CPU time to perform the same work. With that said, it’s definitely suspicious and not something that you want to unexpectedly see. Running down the root cause of the issue can be difficult and time consuming because there are a lot of possible causes. Here is an incomplete list:

Hardware differences
Power plan and other OS configuration settings
OS patching, especially side channel vulnerability fixes
Enabling hyperthreading (CPU reporting within SQL server doesn’t have the same meaning)
Disabling Intel Turbo Boost
VM CPU oversubscription
VMware’s Enhanced vMotion Compatibility (EVC) – force a CPU to run on an older instruction set for compatibility reasons
Other VM issues
SQL Server version
Expensive monitoring configured within SQL Server including certain extended events, trace flags, profiling, or traces

In my case, the culprit ended up being a popular SQL Server third party monitoring solution (I will decline to name the vendor). Again, we can’t conclude that the vendor is doing something wrong. Different types of monitoring will have different overheads for different types of work within SQL Server. In some cases, simply asking for a SQL Server actual plan can more than double the execution time of a query. That doesn’t mean that we shouldn’t use actual plans as performance diagnosis tool!

Digging In

There was a fair amount of circumstantial evidence that the monitoring tool was responsible. It was running on the server with the slower code and it wasn’t running on the server with the faster code. Disabling the monitoring tool made both CPU times match. However, how can we really prove that the monitoring tool is the culprit? Where exactly is that extra CPU time going?

ETW tracing can provide a summary of call stacks during a sampled time period. Common choices are Windows Performance Recorder or PerfView. As usual, I will use PerfView. I was able to temporarily add third party monitoring to the server with the fast code. Fortunately, there wasn’t any other SQL Server work occurring during my testing so PerfView was quite effective at showing the difference between the two servers. In the image below, call stacks for slightly modified code without the monitoring tool are on the left. Call stacks for the same code with the monitoring tool present on the left are on the right:

a49 perfview side by side

It doesn’t take a SQL Server expert to see that there are some notable differences here. I’m not an expert in SQL Server call stacks, but a lot of the stuff on the left is about what I would expect for extremely efficient code in a loop. It’s a similar feeling to those people who can identify photoshopped pictures by looking at the pixels. A lot of the stuff on the right is what I would expect if SQL Server is spending a relatively large percentage of CPU time doing monitoring, such as ntdll!RtlQueryPerformanceCounter and sqllang!TraceUtil::GetStatementInfoBase. Here’s a diff view if that makes it easier to see some of the differences:

a49 perfview diff

The “exc” column is the difference of the number of CPU ms spent by the method itself (not including called methods). A positive number means that the slow code spent more CPU time on that method and a negative number (not shown in this picture) means that the fast code spend more CPU time on that method.

I would describe the PerfView results as fairly conclusive evidence: the third party SQL Server monitoring tool is responsible for the 8X difference in CPU time for the WHILE loop between the two servers.

Final Thoughts

The real production workload does something more important (and complicated) than simply incrementing a variable. Seeing an 8X performance difference in that simple code wasn’t a good or welcome sign, but I expect to see a dramatic reduction in monitoring overhead when benchmarking the real workload after disabling the monitoring tool. There may not be anything to fix here, but it’s always helpful to be reminded of everything that’s running on your servers. Thanks for reading!

9 thoughts on “Quick SQL Server CPU Comparison Tests”

Brent Ozar says:

September 22, 2021 at 2:35 am

Great post, Erik
1. Stephen Peter Morris says:
  
  September 22, 2021 at 3:11 am
  
  Great post Joe !
  1. Beardy McBeardFace says:
    
    September 22, 2021 at 12:13 pm
    
    Great Post
2. Jeff Moden says:
  
  December 2, 2021 at 5:34 pm
  
  Heh… it was about 15 minutes ago that I realized that I’d made the identical mistake on a much more recent post.
Chris says:

September 27, 2021 at 10:15 am

Interesting – We recently (finally) migrated off of SQL2008R2 to 2019 – this query, runs nearly 2x as slow on 2019 for me. Both installs are fairly standard…

Obviously I haven’t dug into this as I *Just* tried this – but that’s an unexpected result for sure. Any off the top of the head guesses? At the very least it is actually using a full core (live CPU running this in a loop shows a full core pegged).
1. Chris says:
  
  September 27, 2021 at 10:21 am
  
  FWIW – the estimated number of rows returned on 2008R2 is half what it is on 2019 and the estimated operator cost is also halved.
mbourgon says:

September 27, 2021 at 3:11 pm

If that we-dare-not-speak-its-name is SentryOne, then I have a solution – disable the statement-level trace it does. There’s a setting for its normal trace, but another for statement-level. I can confirm that if you’re doing a lot of dynamic SQL (and possibly xquery) , then it absolutely is a performance-sinker. Note that we also had the same problem with QueryStore originally (I don’t think I’ve revisited it in the past couple of years), so it’s not just them.
Vesa says:

October 27, 2021 at 6:44 am

I have tried this on server where I suspect there is heavy VM CPU oversubscription.
My results are usually so, that elapsed time is about double of cpu time.

Can I make any conclusions?
My guess is that CPU is not available and elapsed time is double because sos_Scheduler_yield waits.
Pingback: Trying out the new premium-series Azure SQL Managed Instances – Erik Darling Data

Comments are closed.

The Test Code

CPU Problems

Digging In

Final Thoughts

Related Posts

9 thoughts on “Quick SQL Server CPU Comparison Tests”