Comment by etblg
9 months ago
> even distincts were done in memory! I can’t fathom the decision making there…
This is one of those situations where I can't tell if they're operating on some kind of deep insight that is way above my experience and I just don't understand it, or if they just made really bad decisions. I just don't get it, it feels so wrong.
> I can't tell if they're operating on some kind of deep insight that is way above my experience and I just don't understand it
This is answered at the very top of the link in the post you replied to, and in no uncertain terms. Direct link here: https://github.com/prisma/prisma/discussions/19748#discussio...
> I want to elaborate a bit on the tradeoffs of this decision. The reason Prisma uses this strategy is because in a lot of real-world applications with large datasets, DB-level JOINs can become quite expensive...
> The total cost of executing a complex join is often higher than executing multiple simpler queries. This is why the Prisma query engine currently defaults to multiple simple queries in order to optimise overall throughput of the system.
> But Prisma is designed towards generalized best practices, and in the "real world" with huge tables and hundreds of fields, single queries are not the best approach...
> All that being said, there are of course scenarios where JOINs are a lot more performant than sending individual queries. We know this and that's why we are currently working on enabling JOINs in Prisma Client queries as well. You can follow the development on the roadmap.
This still isn't a complete answer, though. Part of it is that Prisma was, at its start, a GraphQL-centric ORM. That comes with its own performance pitfalls, and decomposing joins into separate subqueries with aggregation helped avoid them.
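For anyone who wants the two strategies side by side, here is a rough sketch. It uses Python's built-in sqlite3 rather than Prisma, and the `users`/`posts` schema and function names are made up for illustration, so treat it as a picture of the general idea and not of what the Prisma query engine actually does internally.

```python
import sqlite3
from collections import defaultdict


def make_db():
    # Hypothetical toy schema; not taken from the Prisma discussion.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
        INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
        INSERT INTO posts VALUES (1, 1, 'joins'), (2, 1, 'indexes'), (3, 2, 'orms');
        """
    )
    return conn


def posts_by_user_join(conn):
    # Strategy A: one query, the database performs the join.
    rows = conn.execute(
        "SELECT u.name, p.title FROM users u "
        "JOIN posts p ON p.user_id = u.id ORDER BY p.id"
    ).fetchall()
    result = defaultdict(list)
    for name, title in rows:
        result[name].append(title)
    return dict(result)


def posts_by_user_split(conn):
    # Strategy B: two simple queries, stitched together in the application.
    users = dict(conn.execute("SELECT id, name FROM users").fetchall())
    placeholders = ",".join("?" for _ in users)
    rows = conn.execute(
        f"SELECT user_id, title FROM posts "
        f"WHERE user_id IN ({placeholders}) ORDER BY id",
        list(users),
    ).fetchall()
    result = defaultdict(list)
    for user_id, title in rows:
        result[users[user_id]].append(title)
    return dict(result)


conn = make_db()
print(posts_by_user_join(conn))
print(posts_by_user_split(conn))
```

Both functions return the same mapping. The difference is where the work happens: strategy B costs an extra round trip and does the matching in application memory, which is the tradeoff this thread is arguing about.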
It's a completely ridiculous answer though. They're linking to High Performance MySQL's 2nd edition, which came out in June 2008, and was written for users of MySQL 5.0 running on 2008-era hardware.
My take, as a MySQL expert: that advice is totally irrelevant now, and has been for quite some time. It's just plain wrong in a modern context.
I’m not even sure it was correct for its time? The whole point of an RDBMS is to execute join operations. The only cases where I’d expect an RDBMS to be bad at its one fundamental job, at any point in time, are the N+1 query scenario or multiple left joins with unrelated dependencies, but those are triggered by bad ORM abstractions to begin with.
And, in light of that, the default is changing in a couple of months, now that the JOIN mode has had a significant period of real-world testing.
It really gives me flashbacks to the early days of MongoDB.
Which, frankly, is a good lesson that marketing and docs and hype can make up for any amount of technical failure, and if you live long enough, you can fix the tech issues.
> if they're operating on some kind of deep insight
If one's getting OOM errors from a SELECT DISTINCT, then there's no deep insight behind the choice, it's just a mistake.
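To make the memory point concrete, here is a minimal sketch, again with Python's built-in sqlite3 standing in for a real database. The `events` table and the function names are hypothetical, and this is not Prisma's code. It only contrasts how many rows the client has to hold in each approach.

```python
import sqlite3


def make_events(n_rows=10_000, n_kinds=5):
    # Hypothetical table: many rows, very few distinct values.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")
    conn.executemany(
        "INSERT INTO events (kind) VALUES (?)",
        ((f"kind-{i % n_kinds}",) for i in range(n_rows)),
    )
    return conn


def distinct_in_db(conn):
    # The database deduplicates; the client only receives the distinct rows.
    rows = conn.execute("SELECT DISTINCT kind FROM events").fetchall()
    return len(rows), sorted(kind for (kind,) in rows)


def distinct_in_memory(conn):
    # The client receives every row and deduplicates afterwards.
    rows = conn.execute("SELECT kind FROM events").fetchall()
    return len(rows), sorted({kind for (kind,) in rows})


conn = make_events()
db_rows, db_kinds = distinct_in_db(conn)
mem_rows, mem_kinds = distinct_in_memory(conn)
print(db_rows, mem_rows, db_kinds == mem_kinds)
```

The answers match, but the in-memory version has to materialize every row on the client first. With a large enough table, that is exactly where an OOM would come from.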