Comment by forthy

6 years ago

I doubt the competition (e.g. IBM or Microsoft) has any better code quality. Even PostgreSQL is 1.3M lines of code, so let's get something deliberately written for simplicity. SQLite is just 130k SLoC, so another order of magnitude simpler.

And yet, even SQLite has an awful lot of test cases.

https://www.sqlite.org/testing.html

I'm sure some of the difference (25M vs. 1.3M) can be attributed to code for Oracle features missing in PostgreSQL. But a significant part of it is due to a careful development process that mercilessly eliminates duplicate and unnecessary code as part of the regular PostgreSQL development cycle.

It's a bit heartbreaking at first (you spend hours/days/weeks working on something, and then a fellow hacker comes and cuts off the unnecessary pieces), but in the long run I'm grateful we do that.

  • > It's a bit heartbreaking at first (you spend hours/days/weeks working on something, and then a fellow hacker comes and cuts off the unnecessary pieces), but in the long run I'm grateful we do that.

    The single hardest thing about programming, I'd say.

    • In many (most?) ways the best edits of code are the ones where you can get rid of lines.

PostgreSQL has a lot of code but most parts of the code base have pretty high quality code. The main exceptions are some contrib modules, but those are mostly isolated from the main code base.

It's because software LOC scales linearly with the amount of man-months spent: a testament to the unique ability of our species to create beautiful, abstract designs that will stand the test of time.

  • This is an interesting comment, because I can't decide if you are sarcastic or making a deep insightful comment. Because I don't think the statement is true. LOC can go on forever, but it usually happens in things that aren't beautiful and abstract.

Worked on SQL Server for 10+ years. MS SQL Server is way better than that. The Sybase SQL Server code we started with and then rewrote was as bad as Oracle.

I guess that is just because SQL as a standard is neither coherent nor beautifully designed. SQL is a mashup of vendor-specific features all bashed together into one standard.

  • There's also a lot of essential complexity there. SQL provides, in essence, a generic interface for entering and analyzing data. Imagine the number of ways to structure and analyze data. Now square that number to get the number of tests for how any two basic features of the language interact with each other. And that's not even near full test coverage.

    • Your point about essential complexity is absolutely correct, but your faux mathematical analysis is totally not a legit way to analyze the complexity of something or determine test coverage. I feel like as programmers we should be comfortable making sensible statements without making up shady pseudo-math to sound convincing.
