Comment by snidane
4 years ago
It's easier not to mess up table based filters using explicit semi-join operators (eg. in, not in, exists) instead of using regular joins because joins can introduce duplicates.
Give me 'any join' operation - ie. just select the first value instead of all, and I'll happily use joins more. They are actually more intuitive.
It's not that relational algebra is untintuitive. It's because standard SQL sucks.
Indeed, I've taught myself to only use JOIN when I actually need some data from the table I join. For everything else I use EXISTS and friends.
I was thinking SQL could do with a keyword for that, like maybe FILTER, that looks like a JOIN but works like EXISTS.
Clickhouse implements an explicit SEMI join. It can be called semi or any, it doesn't really matter. It's just another join modifier
https://clickhouse.com/docs/en/sql-reference/statements/sele...
My problem with semijoins is that the semantics of "what exactly does a SELECT evaluate to inside an expression" are sometimes murky and might vary across databases.
Could you expand a little?
If I write
what the heck is the result of evaluating the inner query, in the outer expression?
Maybe I am missing something, but the exact meaning to vary a lot across different databases. Some seem to have a standalone "table" data type, while others don't.
1 reply →
“First” doesn’t make sense without an order
It even has its own tag on StackOverflow: https://stackoverflow.com/questions/tagged/greatest-n-per-gr...
People who want it, want it with an order.
Look at
https://stackoverflow.com/questions/121387/fetch-the-row-whi...
and
https://stackoverflow.com/questions/3800551/select-first-row...
and
https://stackoverflow.com/questions/8748986/get-records-with...
and their combined thousands of votes and dozens of answers, all full of awkward workaround or ill-performing or specialised-for-one-database-engine code for this common and desirable thing which would be trivial with a couple of boring loops in Python.
It does make sense for semi-joins. I care about the key, not the value.
Random order is also a valid order.