What Do You Call A Join That Isn’t A Semi Join?
There’s a gap in SQL Server database terminology that’s been bugging me as long as I’ve been teaching people about databases.
You’ve got semi joins, and you’ve got anti-semi joins.
Everyone (loosely, very loosely, everyone) knows what those do: Find a match, or confirm there isn’t one. Lemon-squeezey.
The engine doesn’t need to fully articulate the relationship if you have a one-to-many or many-to-many relationship between two tables.
Proof of existence (or non-existence) is enough. Short-circuit, move on. Once and one time only, to biblically know me, in the back of my go-cart.
Then you’ve got INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN. Further, there’s SQL Server’s APPLY, and the ANSI-friendly LATERAL join.
These are different from the semi-variety.
They don’t just check for a match, they enumerate every single matching row combination (and preserve non-matching, varying with the degree of left/right/full outer-ness, of course).
For the extra-pedantic reader, I’m aiming for somewhere in the “punch” and “children” region of the taxonomy mnemonic, here.
- One-to-many? You get every matching row from the many side, hooray!
- Many-to-many? This is where you start reaching for
DISTINCTand wondering what happened to yourdata modellife.
So what do you call that category?
Not “full join”!
That’s a specific join type and using it as a category name is asking for a bad, confusing time. Like being a teenager, complete with a room full of adults being puzzled by your very presence in the world.
What I’ve come to discover in some research: Nobody ever really settled on a term.
The database theory world assumed inner/outer joins were the default, named the exception (semi joins), and called it a day. It’s sort of like if weekdays were just called days and weekends were called end-days (speaking of end-days, does anyone know when we can look forward to those?).
Brainstorming Gone Bad
That’s not good enough for me, so I started thinking through options.
- Fanout Join
- Enumerating Join
- Multiplicative Join
- Exhaustive Join (four ideas in, I was exhausted, clearly)
They all capture something useful, but they’re all longer and less clear than “semi join.”
Two or three syllables too many, usually. Syllables are also exhausting.
Worse, they’re clunky. #@^%ing clunky, Jimmy.
“Semi” works because it’s a snappy Latin prefix that means exactly what it says, and leaves plenty of room for immaturity amongst friends.
You need something equally snappy that means the opposite.
Tight, snappy, punchy, catchy. That’s what we’re known for here at Darling Data. Hello, sp_HumanEventsBlockViewer.
The problem is there’s no natural single-word English antonym for “semi” that prefixes onto “join” without sounding weird.
Yes, I consulted a thesaurus, that’s also where I got “antonym” from. Thanks for noticing.
Already overloaded by academia:
- Whole Join
- Complete Join
Embarrassing:
- All Join
These sound like breakfast cereals for people having trouble getting one out the door.
This is unfortunate and undesirable.
At Darling Data, we try to avoid unfortunate and undesirable, that’s why we don’t work with MySQL.
Back To Latin 101
If “semi” came from Latin, then the answer probably lives there too.
Of course I had to look these all up, which was a tremendous mental strain. You’re welcome.
- omni — all, every
- pleni — full
- toti — whole
- holo (Greek) — whole
There’s a clear winner: Omni.
- Same syllable count as semi (confirmed by clapping)
- A prefix (again, loosely) people already know
- Semantically accurate (enough for me, anyway)
You’re finding all matches, not just proof of life. Or death. Or afterlife.
And! It doesn’t conflict with any existing SQL Server database terminology.
Semi join and Omni join
Those belong together.
- Same Latin family
- Definite antonyms
- You can put them on a slide and the audience gets the contrast
The database field never really bothered to name the category, because it was always the default.
But the default still needs a name when you’re explaining optimizer behavior, execution plan operators, or why your query just multiplied itself into sweet, sweet oblivion.
Enumerating joins, multiplying joins, set-based join, Cartesian-derived join, and result joins have all been used somewhat randomly in various database literature and documentation.
But those are unfortunate and undesirable too. Clunky, Jimmy.
Omni join!
You heard it here first.
Spread the word, far and wide. Like a monorail.
Going Further
If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.
*patiently waits for Joe Celko to arrive and chime in*
Great article! 🙂
Ha ha, say his name in the mirror three times. I dare you.
“Great join”, Brent!
When does the “Omni Join by Darling Data” t-shirt drop? Another million dollar idea, easily 😉 !