Join me at DataTune in Nashville, March 8-9 2024

Spring Training


This March, I’ll be presenting my full day training session The Foundations Of SQL Server Performance Tuning.

All attendees will get free access for life to my SQL Server performance tuning training. That’s about 25 hours of great content.

Get your tickets here for this event, taking place Friday and Saturday, March 8th-9th, 2024 at Belmont University – Massey Center, 1900 Belmont Blvd, Nashville, TN 37212

Here’s what I’ll be presenting:

The Foundations Of SQL Server Performance Tuning

Session Abstract:

Whether you want to be the next great query tuning wizard, or you just need to learn how to start solving tough business problems at work, you need a solid understanding of not only what makes things fast, but also what makes them slow.

I work with consulting clients worldwide fixing complex SQL Server performance problems. I want to teach you how to do the same thing using the same troubleshooting tools and techniques I do.

I’m going to crack open my bag of tricks and show you exactly how I find which queries to tune, indexes to add, and changes to make. In this day long session, you’re going to learn about hardware, query rewrites that work, effective index design patterns, and more.

Before you get to the cutting edge, you need to have a good foundation. I’m going to teach you how to find and fix performance problems with confidence.

Event Details:

Get your tickets here for this event!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that, and need to solve database performance problems quickly. You can also get a quick, low cost health check with no phone time required.

Indexing SQL Server Queries For Performance: Fixing Aggregates With An Indexed View

Lead Up


I have a couple posts scheduled next week about aggregates, so now is a good time to continue my series on how you, my dear, sweet, developers, can use indexes in various ways to fix performance issues.

In this one, we’re going to get into a weird cross-section, because I run into a lot of developers who think that modularizing queries, abstracting them away into various objects, is somehow better for performance.

I’ll hear that about functions a lot, but by far, it is more common to hear it about views.

Everything from results getting cached or materialized to metadata lookups being faster has been thrown at me in their defense.

By far the strangest was someone telling me that SQL Server creates views automatically for frequently used queries.

Hearts were blessed that day.

A Query!


So, here’s where indexed views are tough: THERE ARE SO MANY RESTRICTIONS ON THEM IT’S ABSURD.

They are borderline unusable. Seriously, read through the list of “things you can’t do” I linked to up there, and note the utter scarcity of “possible workarounds”.

How a $200 billion a year company has an indexed view feature that, in the year 2024, doesn’t support MIN, MAX, AVG, subqueries (including EXISTS and NOT EXISTS), windowing functions, UNION, UNION ALL, EXCEPT, INTERSECT, or HAVING is beyond me. Things like this are why many newer databases will continue to eat SQL Server’s lunch, and why existing databases (Oracle, Postgres), which have much richer features available for indexed (materialized) views, point and laugh at us.
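If you want to watch one of those restrictions bite, here’s a sketch (the view name and the aggregate choice are mine, made up for illustration): the schemabound view itself creates just fine, and it’s only the index creation that gets refused, because MAX is on the forbidden list.

CREATE OR ALTER VIEW
    dbo.NiceTry
WITH SCHEMABINDING
AS
SELECT
    p.OwnerUserId,
    TopScore =
        MAX(p.Score), /*Not allowed in an indexed view*/
    WellOkayThen =
        COUNT_BIG(*)
FROM dbo.Posts AS p
GROUP BY
    p.OwnerUserId;
GO

/*This is where SQL Server tells you no*/
CREATE UNIQUE CLUSTERED INDEX
    nt
ON dbo.NiceTry
    (OwnerUserId);
GO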

Anyway, here’s the query.

SELECT
    u.Id,
    u.DisplayName,
    QuestionUpScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 1 AND v.VoteTypeId = 2 THEN  1 ELSE 0 END)),
    QuestionDownScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 1 AND v.VoteTypeId = 3 THEN -1 ELSE 0 END)),
    AnswerAcceptedScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 1 THEN  1 ELSE 0 END)),
    AnswerUpScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 2 THEN  1 ELSE 0 END)),
    AnswerDownScore = 
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 3 THEN -1 ELSE 0 END)),
    CommentScore = 
        SUM(CONVERT(bigint, c.Score))
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
  ON c.UserId = u.Id
JOIN dbo.Votes AS v
  ON v.PostId = p.Id
WHERE u.Reputation > 500000
AND   p.Score > 0
AND   c.Score > 0
AND   p.PostTypeId IN (1, 2)
AND   v.VoteTypeId IN (1, 2, 3)
GROUP BY
    u.Id,
    u.DisplayName;

It just barely qualifies for indexed view-ness, but only if I add a COUNT_BIG(*) to the select list.

Now, here’s the thing: we gotta look at a query plan first, because it’s going to drive a lot of what I tell you later in the post.

A Query Plan!


Right now when I execute this query, I get a nice, happy, Batch Mode on Row Store query plan.

It’s not the fastest, which is why I still want to create an indexed view. But stick with me. It runs for about a minute:

sql server query plan
ONE MORE MINUTE

What’s particularly interesting is a Batch Mode Compute Scalar operator running for 22 seconds on its own. Fascinating.

Okay, but the important thing here: It takes one minute to run this query.

In Batch Mode.

An Indexed View Definition!


Let’s take a stab at creating an indexed view out of this thing, named after the Republica song that was playing at the time.

CREATE OR ALTER VIEW
    dbo.ReadyToGo
WITH SCHEMABINDING
AS 
SELECT
    u.Id,
    u.DisplayName,
    QuestionUpScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 1 AND v.VoteTypeId = 2 THEN  1 ELSE 0 END)),
    QuestionDownScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 1 AND v.VoteTypeId = 3 THEN -1 ELSE 0 END)),
    AnswerAcceptedScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 1 THEN  1 ELSE 0 END)),
    AnswerUpScore =
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 2 THEN  1 ELSE 0 END)),
    AnswerDownScore = 
        SUM(CONVERT(bigint, CASE WHEN p.PostTypeId = 2 AND v.VoteTypeId = 3 THEN -1 ELSE 0 END)),
    CommentScore = 
        SUM(CONVERT(bigint, c.Score)),
    WellOkayThen =
        COUNT_BIG(*)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON p.OwnerUserId = u.Id
JOIN dbo.Comments AS c
  ON c.UserId = u.Id
JOIN dbo.Votes AS v
  ON v.PostId = p.Id
WHERE u.Reputation > 500000
AND   p.Score > 0
AND   c.Score > 0
AND   p.PostTypeId IN (1, 2)
AND   v.VoteTypeId IN (1, 2, 3)
GROUP BY
    u.Id,
    u.DisplayName;
GO

CREATE UNIQUE CLUSTERED INDEX
    RTG
ON dbo.ReadyToGo
    (Id)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO

This will create successfully but…

An Indexed View Creation Query Plan!


Indexed views can’t be created using Batch Mode. They must be created in Row Mode.

This one takes almost two hours.

sql server query plan
EXCUSE ME WHAT

Yes, please do spend an hour and twenty minutes in a Nested Loops Join. That’s just what I wanted.

Didn’t have anything else to do with my day.

Of course, with that in place, the query finishes instantly, so that’s nice.

Perils!


So yeah, probably not great that creating the indexed view takes that long. Imagine what that will do to any queries that modify the base data in these tables.

Hellish. And all that to produce a 30 row indexed view. Boy to the Howdy.

This is a bit of a cautionary tale about creating indexed views that span multiple tables. It is probably not the greatest idea, because maintaining them becomes difficult as data is inserted, updated, or deleted. I’m leaving the M(erge) word out of this, because screw that thing anyway.

If we, and by we I mean me, wanted to be smarter about this, we would have taken a better look at the query and taken stock of a couple things and considered some different options.

  • Maybe the Comments aggregation should be in its own indexed view
  • Maybe the Posts aggregation should be in its own indexed view (optionally joined to Votes)
  • Maybe the Comments and Posts aggregations being done in a single indexed view would have been good enough

Of course, doing any of those things differently would change our query a bit. Right now, we’re using PostTypeId to identify questions and answers, but it’s not in the select list otherwise. We’d need to add that, and group by it, too, and we still need to join to Votes to get the VoteTypeId, so we know if something was an upvote, downvote, or answer acceptance.
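As a sketch of the first option (same tables, but made-up view and index names), a Comments-only indexed view is a much smaller ask, because it only has to be maintained when Comments changes:

CREATE OR ALTER VIEW
    dbo.CommentScores
WITH SCHEMABINDING
AS
SELECT
    c.UserId,
    CommentScore =
        SUM(CONVERT(bigint, c.Score)),
    WellOkayThen =
        COUNT_BIG(*) /*Still required for the clustered index*/
FROM dbo.Comments AS c
WHERE c.Score > 0
GROUP BY
    c.UserId;
GO

CREATE UNIQUE CLUSTERED INDEX
    cs
ON dbo.CommentScores
    (UserId)
WITH
    (SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);
GO

The Posts and Votes side could get the same treatment, and the outer query would join the two views and do the Reputation filtering on Users itself.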

We could also just live with a query taking a minute to run. If you’re going to sally forth with indexed views, consider what you’re asking them to do, and what you’re asking SQL Server to maintain when you add them across more than one table.

They can be quite powerful tools, but they’re incredibly limited, and creating them is not always fast or straightforward.

Thanks for reading!


What Happens To Queries With Recompile Hints In Query Store

Worst Behavior


For many SQL Server developers, using statement-level OPTION(RECOMPILE) hints is the path of least resistance for tuning parameter sensitive queries.

And I’m okay with that, for the most part. Figuring out which parameters a stored procedure was compiled with, and which it was later executed with, is a hard task for someone busy trying to bring new features to a product.

But let’s say one day you have performance issues regardless of all the recompiling in the world. No matter what set of parameters get passed in, your procedure is just sleepy-dog slow.

Things get so bad that you hire a young, handsome consultant with reasonable rates like myself to help you figure out why.

The plan cache will be pretty useless for troubleshooting the recompile-hinted queries, but we can still use Query Store.

Example


Here’s a simple test procedure to show what I mean, named after a random set of cursor options.

CREATE OR ALTER PROCEDURE
    dbo.LocalDynamic
(
    @OwnerUserId integer
)
AS 
BEGIN
    SET NOCOUNT, XACT_ABORT ON;
        
    SELECT
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId;
    

    SELECT
        c = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @OwnerUserId
    OPTION(RECOMPILE);
END;
GO

EXEC dbo.LocalDynamic
    @OwnerUserId = 22656;

EXEC sp_QuickieStore
    @procedure_name = 'LocalDynamic';

The end result in Query Store (as observed with the lovely and talented sp_QuickieStore) will show two execution plans.

  1. The query without the recompile hint will show a compiled parameter value of 22656
  2. The query with the recompile hint will show the literal values used by parameters as predicates

Here’s what I mean. This query has a Parameter List attribute.

sql server query plan
cool, cool

This query won’t have the Parameter List attribute, but that’s okay. We can see what got used as a literal value.

sql server query plan
don’t despair

Plumbing


This all comes down to the way statement-level recompile hints work. They tell the optimizer to compile a plan based on the literal values that get passed in, one that doesn’t have to consider safety issues for other parameter values.

Consider the case of a filtered index to capture only “active” or “not deleted” rows in a table.

Using a parameter or variable to search for those won’t use your filtered index (without a recompile hint), because the cached plan would have to be safe for searching for 1, 0, or NULL.
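Here’s a sketch of what I mean, using a hypothetical IsDeleted bit column (it’s not really in the Stack Overflow schema):

/*Filtered index covering only "live" rows*/
CREATE INDEX
    not_deleted
ON dbo.Posts
    (OwnerUserId)
WHERE IsDeleted = 0;
GO

DECLARE
    @IsDeleted bit = 0;

/*The cached plan must also be safe for 1 and NULL,
  so the filtered index gets skipped here*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.IsDeleted = @IsDeleted;

/*The recompile hint lets the optimizer treat @IsDeleted = 0
  as a literal, which makes the filtered index usable*/
SELECT
    c = COUNT_BIG(*)
FROM dbo.Posts AS p
WHERE p.IsDeleted = @IsDeleted
OPTION(RECOMPILE);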

If you’re troubleshooting performance problems using Query Store, and you’re dealing with queries with statement-level recompile hints, you just need to look somewhere else for the parameter values.

What this can make tough, though, is re-executing the stored procedure when it contains multiple queries that each use an incomplete set of the required parameters. You’ll have to track down the other query plans to piece the full set together.

But quite often, if there’s one problem query in your procedure, the parameter values it requires will be enough to go on.

Thanks for reading!


A Follow Up On HT Waits, Row Mode, Batch Mode, and SQL Server Error 666


Performance Tuning Batch Mode HTDELETE and HTBUILD Waits In SQL Server




Thanks for watching!

Here’s the demo query, in case you’re interested in following along.

SELECT
    u.Id,
    u.DisplayName,
    u.Reputation,
    TopQuestionScore = 
        MAX(CASE WHEN p.PostTypeId = 1 THEN p.Score ELSE 0 END),
    TopAnswerScore = 
        MAX(CASE WHEN p.PostTypeId = 2 THEN p.Score ELSE 0 END),
    TopCommentScore = 
        MAX(c.Score)
FROM dbo.Users AS u
JOIN dbo.Posts AS p
  ON u.Id = p.OwnerUserId
JOIN dbo.Comments AS c
  ON u.Id = c.UserId
WHERE EXISTS
(
    SELECT
        1/0
    FROM dbo.Votes AS v
    WHERE v.PostId = p.Id
    AND   v.VoteTypeId IN (1, 2, 3)
)
AND  u.Reputation > 10000
AND  p.Score > 0
AND  c.Score > 0
GROUP BY 
    u.Id,
    u.DisplayName,
    u.Reputation
ORDER BY
    TopAnswerScore DESC
OPTION
(
    RECOMPILE, 
    USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'), 
    USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140')
);

 



Comment Contest: To Celebrate Groundhog Day, Let’s Celebrate The Groundhog DBA (Or Whatever)

Facing Up


When you’ve been working with the same technology for long enough, especially if you’ve found your niche, you’re bound to find yourself having to solve the same problems over and over again with it.

Even if you move to a new job, they probably hired you because of your track record with, and skillset in, whatever you were doing before.

One thing I do for clients is review resumes and interview applicants for developer/DBA roles. There are some laughable ones out there, where someone claims to be an expert in every technology they’ve ever touched, and there are some ridiculous ones out there that are obviously copy/paste jobs. I even came across one resume where the job description and qualifications were posted under experience.

As a performance tuning consultant, even I end up fixing a lot of the same issues day to day. There are, of course, weird and cool problems I get to solve, but most folks struggle with grasping the fundamentals so badly that I have to ~do the needful~ and take care of really basic stuff.

There’s something comforting in that, because I know someone out there will always need my help, and I love seeing a client happy with my work, even if it wasn’t the most challenging work I’ve ever done.

Anyway, to celebrate the times where we’ve gotten to come out of our groundhog holes (I’m sure they have a real name, but I’m not a groundhogologist), leave a comment below with a cool problem you’ve gotten to solve recently.

The best comment (judged by me), will win free access to my SQL Server Performance Tuning Training Library.

In the case of a tie, there will be a drinking contest to determine the winner.

Thanks for reading!


bit Obscene Episode 1: What Developers Need To Know About Transactions




In the inaugural episode, Joe Obbish and Erik Darling talk about how developers use and abuse transactions, and some surprises you might run into for them.

Links:

  • https://erikdarling.com/the-art-of-the-sql-server-stored-procedure-transactions/
  • https://erikdarling.com/batching-modification-queries-is-great-until/
  • https://www.youtube.com/watch?v=CuZSoZb8ziE
  • https://michaeljswart.com/2017/07/sql-server-upsert-patterns-and-antipatterns/


Indexing SQL Server Queries For Performance: Fixing Blocking and Deadlocking

Okay, Look


Before I go on about how adding an index can fix your blocking and deadlocking problems, let me warn you about how indexes can cause you problems.

Without fail, whenever I suggest adding indexes to clients, someone gets real worried about write speeds.

Like, the server can be burning down, spurting blood everywhere, and cursing our mothers in ancient Aramaic, but no, let’s stop and worry about write speeds going up by a few milliseconds.

You have to add quite a few indexes (20), and be modifying quite a few rows (millions) before the number of indexes really starts to hurt “write speeds”.

I haven’t seen a server whose biggest problem was write speeds (outside of Managed Instance and SQLDB) since spinning disks were measured in RPMs. The real problem I see many servers face from having “too many” indexes is increased locking.

The more indexes you add, the more you have to lock when modifications happen, even when you’re changing the same number of rows as you were before. You also increase your chances of lock escalation attempts.

Having a handsome young professional with reasonable rates (me) review your indexing is a good idea.

But you can end up with locking and deadlocking problems when you err in the opposite direction of “too many” indexes, especially if your modification queries don’t have good supporting indexes to help them find the data they wanna change.

It Started With A Scan


I never spent a summer at camp Scans-Are-Bad, but scans can tell us something important about modification queries.

In a very general sense, if the operators in your modification queries are acquiring data from the tables they need to modify via a scan, they’ll start by locking pages. If they start by seeking to rows in the table they need to modify, they’ll start by locking rows.

Once lock escalation thresholds are hit, they may attempt to lock the whole table. If the optimizer thinks your query needs to do a lot of work, it may use a parallel plan, which may increase the likelihood of lock escalation attempts.
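If you want to watch this happening, one rough way is to count locks by type for the modifying session while it runs. A sketch (the session id of 62 here is made up; use whatever sp_whoisactive or sys.dm_exec_sessions tells you):

/*Count locks held by a modifying session, grouped by type and mode*/
SELECT
    dtl.resource_type,
    dtl.request_mode,
    total_locks =
        COUNT_BIG(*)
FROM sys.dm_tran_locks AS dtl
WHERE dtl.request_session_id = 62 /*Replace with the real session id*/
GROUP BY
    dtl.resource_type,
    dtl.request_mode
ORDER BY
    total_locks DESC;

Scans tend to show up as piles of PAGE locks; seeks tend to show up as KEY locks, until escalation turns everything into an OBJECT lock.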

Let’s say we have this query:

UPDATE
    p
SET
    p.Score += 1000
FROM dbo.Posts AS p
JOIN dbo.Users AS u
  ON u.Id = p.OwnerUserId
WHERE u.Reputation >= 800000
AND   p.Score < 1000;

And this is the query plan for it:

sql server query plan
we care a lot

We would care very much about the Posts table being scanned to acquire data, because the storage engine doesn’t have a very granular way to identify rows it’s going to modify.

We would care so very much, that we might add an index like this:

CREATE INDEX
    p
ON dbo.Posts
    (OwnerUserId, Score)
WITH
    (MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

To get an execution plan that looks like this:

sql server query plan
line up

Now, this isn’t awesome, because we’re updating the Score column, and we need the Score column in our index to find the rows we care about, but if the query goes from taking 5 seconds to taking a couple hundred milliseconds, we’re going in the right direction.

There is reasonably sound advice to avoid indexing columns that change a lot, but if you need to find rows to modify in columns that change a lot, you may need to follow the also quite reasonably sound advice to make your queries faster so there’s less chance for them to become entangled.

The longer modification queries run for, the longer they have to interfere with other queries.

It Started With A Tran(saction)


I often see transactions (ab)used by developers. Here’s a lazy example:

BEGIN TRAN;
    UPDATE 
        b
    SET 
        b.UserId = 138
    FROM dbo.Badges AS b
    WHERE b.Date >= '2010-12-25'
    AND   b.Date <  '2010-12-26';

    /*Go do some other stuff for a while, I hear ACID is good.*/
COMMIT; /*No error handling or anything, just screw it all.*/

The query plan for this update will look simple, harmless, never did nothin’ to nobody. But while we’re doing all that ~other stuff~ it’s holding onto locks.

sql server query plan
waterfall

This isn’t always the end of the world.

Right now, the only index is the clustered primary key. Some queries will be able to finish immediately, as long as the rows they need are located in the clustered primary key prior to the rows that are being locked.

For example, this query can finish immediately because the date value for its row is a touch before the pages we’re updating.

SELECT
    b.*
FROM dbo.Badges AS b
WHERE b.Id = 1305713;
sql server query plan
take the cannoli

But this query, and any queries that search for an Id value on a locked page, will be blocked.

SELECT
    b.*
FROM dbo.Badges AS b
WHERE b.Id = 1306701;

That Id value is just outside the range of dates we’re modifying, but because the storage engine is locking pages and not rows, it has to protect those pages with locks.

sql server query plan
step on

With this index, both queries would be able to finish immediately, because the storage engine would know precisely which rows to go after, and a more granular locking arrangement (rows instead of pages) would be available.

CREATE INDEX
    woah_mama
ON dbo.Badges
    (Date)
WITH
    (MAXDOP = 8, SORT_IN_TEMPDB = ON, DATA_COMPRESSION = PAGE);

In this example, we’re not updating the Date column, so we don’t have to worry about the sky falling on write speeds, or write speeds crying wolf, or a thousand other cautionary tales about overreaction.
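And since this post started by poking at that lazy transaction: if you do need to hold one open across several statements, at least give it error handling, and keep the unrelated work outside of it. A sketch, not a complete pattern:

BEGIN TRY
    BEGIN TRANSACTION;
        UPDATE
            b
        SET
            b.UserId = 138
        FROM dbo.Badges AS b
        WHERE b.Date >= '2010-12-25'
        AND   b.Date <  '2010-12-26';
    COMMIT TRANSACTION; /*Release locks before any "other stuff" happens*/
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW; /*Let the caller know things went sideways*/
END CATCH;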

NOLOCK, And Other Drugs


Let’s say we’re idiots unaware of how bad uncommitted reads are, and we decide to use NOLOCK hints to avoid our select queries being blocked.

SELECT
    b.*
FROM dbo.Badges AS b WITH(NOLOCK)
WHERE b.Id = 1306700;

While that transaction is open, and we’re still doing all those other highly ACIDic things to our database, this query will return the following results:

sql server query plan
jailbird

We can see the in-flight, uncommitted, change from the update.

Maybe we return this data to an end user, who goes and makes a really important, life-altering decision based on it.

And then maybe something in that transaction fails for some reason, and everything needs to roll back.

That really important, life-altering decision is now totally invalidated, and the end user’s life doom-spirals into sure oblivion.

And it’s all your fault.
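If what you actually want is readers that neither block nor see uncommitted data, an optimistic isolation level is a saner fix than sprinkling NOLOCK hints around. A sketch, assuming you’ve tested the change and have the tempdb headroom for the version store:

/*Readers get the last committed version of rows instead of
  blocking on writers or reading dirty data*/
ALTER DATABASE CURRENT
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE; /*Kicks out in-flight transactions; schedule carefully*/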

Thanks for reading!


How Database Vendors Can Make It Easy For New Users To Get Started On Their Platform

New And Improved


There are a lot of databases out there. All of them, of course, better than their competitors.

Faster, easier, cloudier, smarter, shardier. Blah blah blah. Sounds great! But how do you get new users to try your product and make any reasonable assessment of it?

Not everyone comes to the table with a bunch of data they can jam right in and get partying with.

Worse, they might be on a platform where exporting their data to use on your platform (because, let’s face it, no two products have the same backup format) is a real and complete pain.

The number of hurdles that users face just getting to the point where they can make an assessment of your product is pretty huge.

Let’s talk about how you can make that easier.

Working Data Sets


Stack Overflow provides public access to their data sets, both as XML dumps, and a web interface.

I’m not saying that their relational data makes sense for every system out there, but it’s a big data set that’s good for analysis.

If you own a database product, you could:

  • Grab the dumps and turn them into a format that makes sense for your platform
  • Have a hosted playground for new users to run queries against interactively

This helps potential new customers get comfortable with the inevitable proprietary syntax, and gauge query efficiency.

Microsoft, for all its silliness, gives SQL Server users a couple different sample databases to work off of. They even update them for new versions to show off all the new memes features they’ve added.

They even have a free developer edition of the product that you can install and run with pretty quickly. You don’t need this if your product is all cloud-based, but you get the idea.

Hands-down, the most annoying part of testing any database platform, is getting reasonable data to test against in there.

Portability


If you are an installer-based life form, and your database has a lot of settings that might matter for performance and reliability, or uses a specific OS, you should consider having a few different VM images available for download.

This lets you easily distribute a golden copy of an ideal environment for your product, with the OS, database, and data all packed together.

Oracle does this, and for the short time I had to experiment with some stuff on their platform, it was incredibly handy.

If you don’t want to go this route, because you don’t quite have Oracle money, being a fledgling database product, have a dedicated install and config page:

  • Recommended hardware
  • OS version
  • Database install steps
  • Any additional dependencies
  • Recommended database configurations
  • Where to get ample sample data to play with

While we’re talking about sample data, why not have a few different sizes of data? Not everyone wants to set up a 64 core, 2TB of RAM virtual machine just to mess with a petabyte set of time series data.

Have some small, medium, large, and extra large sets available for testing.

Sure, prospective clients might opt for small and medium, but the folks you want to evangelize your product are going to love you for having bigger data sets to show more complex problems and solutions.

If part of the sell for your product is how great data ingestion is, have data ready to ingest based on whatever format you excel at, even if it’s Excel files.

More likely it’s csv, parquet, json, or uh… something.

Visibility


A lot of folks are used to having more than a command line to interact with their database.

Postgres has pgAdmin, Oracle has SQL Developer, Microsoft has SQL Server Management Studio and Azure Data Studio, and there are many third party tools that can connect to a variety of platforms, too.

Writing large, complex queries in a CLI is a drag. It might be great for administration, and simple stuff, but correcting syntax errors in them is like threading needles blindfolded.

You may not want to build a whole bunch of tooling up front for developers to work in, but a lightweight browser-based tool with a “run” button can go a long way.

Take db<>fiddle as an example. You can’t do any database management with it, but you can pretty much fully interact with the database by sending queries in, the way a developer would write and test queries.

Nerdery


I love playing with other databases, but I do not love all the foreplay it takes just to get moving with one.

The more things are different about your platform — and those differences may be spectacular — the harder it is to lure away folks who are inexperienced with the stack you’ve chosen.

You might even have an amazing sales tech team who will come in and do a lot of the heavy lifting for prospective clients, but some companies out there want to do a test run before investing a lot of time and getting hourly check-ins from sales reps about how things are going and when the contract will get signed.

That also ignores one of the most powerful segments of any database community: the developers who will build content around your product, and go out in the world to blog, present, and develop training content.

Thanks for reading!
