You Probably Don’t
When I work with clients, nearly every single one has this burning question about partitioning.
“We’ve got this huge table, should we partition it?”
“Do you need to insert or delete data in big chunks?”
“No, it’s all transactional.”
“Do you have last page contention problems?”
“No, but won’t it help performance?”
“No, not unless you’re using clustered column store.”
“…”
Burial
Buried deep in that conversation are the two reasons you might want to partition a table:
- Data management features
- Clustered columnstore indexes
Included in “data management” is the ability to put different partitions in different files/file groups, which you can make read only, and/or allow you do do piecemeal backups and restores.
Outside of those, you’re only introducing a whole mess of complexity to a whole bunch of different places, not to mention some pretty severe pains rebuilding or shadow-copying and switching your current “huge table” as a partitioned on.
Complexity
If you partition a table and plan on using it for data management, all of your other indexes need to be aligned to the partitioning scheme as well.
That adds complexity to index tuning, because if you mess up and don’t do that, your partition swapping will error out. A lot of people add DDL triggers to reject index definitions without the partition alignment.
Add to that, query plans and query writing both get more complicated, too. Not only that, but the query optimizer has to work harder to figure out if partitions can be eliminated or not.
Problems that might have been less problematic against non-partitioned tables and indexes (like some implicit conversions), can be a death spiral against partitioned tables.
There have been a lot of bugs and performance regressions over the years with partitioning, too. Some have been fixed, and others are only “fixed” in higher compatibility levels.
Lonely Road
I’ve seen a lot of people partition tables, expecting performance fireworks and afternoon delights, only to find no joy. In some cases, I’ve had to help people undo partitioning because critical queries slowed down.
It seems like a “free” thing you can do, especially if you’re clueless about how to find and fix real problems in your workload. Like most marquee features in SQL Server, there’s an air of mystery and misunderstanding about partitioning excels at and is best used for.
In most cases, SQL Server users seem to skip basic index tuning and go right to WE NEED TO PARTITION THIS THING.
Thanks for reading!
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. I’m offering a 75% discount to my blog readers if you click from here. I’m also available for consulting if you just don’t have time for that and need to solve performance problems quickly.
Great article. Partitioning definitely presents some tricky scenarios.
However, consider a scenario where you have a 500 million row table and you need to keep a rolling 12 months of data. Knowing that trying to delete 42M rows each month could be a painful operation. Praying of course someone doesn’t also try to do that in one big transaction. What would you propose as an alternative to table partitioning?
Today, we just do a partition swap each month then truncate the destination table.
Uh… isn’t that *exactly* what I said partitioning is a good use case for?
Thanks for confirming that as a suitable use case for partitioning.
I can see where you alluded to that being a “possible” use case scenario in the very short mention on Data Management Features. But with no direct mention on rolling windows, I was still curious of your thoughts.
Partitioned views are one other method for that, but I don’t have any material specific to using them for that.
Thanks!
Oooh yeah, had this talk many times myself too. These end quickly when people realize they don’t usually query by the partitioning column and will need to read dozens of indexes instead of just one.
Heh yeah, no one really understands what they’re in for, like it’s just flipping a switch.