Why should it be? You can get a long long way by treating the filesystem as a database. The first engineers at Amazon used the same technique a lot, as do I.
Because you always end up building your own database out of flat files and that is always worse than using an existing one.
If it were always worse, then every developer doing this would have to be stupid. Here are some ways in which a filesystem is "better":
- Zero administration
- The only configuration setting is the directory
- Trivial to test
- Trivial to examine, back up, and modify with existing tools
- Works with any operating system, language, platform, or library
- Good performance characteristics and well tuned by the operating system
- Easy for any developer to understand
- No dependencies
- Security model is trivial to understand and is a base part of the operating system
- Data is not externally accessible
Many existing databases have attributes that aren't always desirable. For example, they tend to prioritize data integrity and durability at the expense of other things (e.g. more administration, lower performance). For a use case like HN, losing 1 out of every 1,000 comments wouldn't be that big a deal: it isn't a bank.
Consider the development, deployment and administrative differences between doing "hello world" with a filesystem versus an existing database. Of course, this doesn't mean filesystems should always be used. Developers should be practical and prudent.
TLDR: YAGNI, KISS, DTSTTCPW
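To make the "hello world with a filesystem" comparison concrete, here is a minimal sketch of a directory used as a key-value store. It is not how HN or anyone else in this thread actually does it; `FileStore`, its method names, and the one-JSON-file-per-key layout are all made up for illustration. The only real trick is writing to a temporary file and renaming it into place, so a reader never sees a half-written record.

```python
import json
import os
import tempfile
from pathlib import Path


class FileStore:
    """Hypothetical key-value store: one JSON file per key in one directory."""

    def __init__(self, root):
        # The directory is the only configuration setting.
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Restrict keys to simple names so they cannot escape the directory.
        if not key.replace("_", "").replace("-", "").isalnum():
            raise ValueError(f"unsupported key: {key!r}")
        return self.root / f"{key}.json"

    def put(self, key, value):
        target = self._path(key)
        # Write to a temp file in the same directory, then rename over the
        # target. The rename is atomic on POSIX filesystems.
        fd, tmp = tempfile.mkstemp(dir=self.root, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(value, f)
            os.replace(tmp, target)
        except BaseException:
            os.unlink(tmp)
            raise

    def get(self, key, default=None):
        try:
            with open(self._path(key), encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return default

    def keys(self):
        # Temp files end in .tmp, so they never show up here.
        return sorted(p.stem for p in self.root.glob("*.json"))


if __name__ == "__main__":
    store = FileStore(tempfile.mkdtemp())
    store.put("comment_1", {"by": "alice", "text": "hello world"})
    print(store.get("comment_1"))
```

Note what is deliberately missing: no fsync, no locking, no indexes. That matches the trade-off described above (a lost write now and then is tolerable), and it is also exactly the part that grows into "your own database" if the requirements change.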
And yet, here you are, on a site run off a flat file database.
Just as one example, I think you could make a fairly convincing case that the official version of git uses a "filesystem as a database" system with great success.
I doubt it would be improved by using something more "proper".
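For anyone who hasn't looked inside `.git/objects`, a rough sketch of the idea: each blob is stored in a file named after the SHA-1 of a short header plus its content, zlib-compressed, with the first two hex digits used as a subdirectory. The code below is a simplified illustration of that loose-object layout, not git's implementation. The function names are invented, and it ignores trees, commits, packfiles and everything else.

```python
import hashlib
import os
import tempfile
import zlib
from pathlib import Path


def write_blob(objects_dir, data):
    """Store bytes under a path derived from their hash; return the hex id."""
    # Git hashes "blob <size>\0" followed by the raw content.
    payload = b"blob " + str(len(data)).encode() + b"\0" + data
    oid = hashlib.sha1(payload).hexdigest()
    path = Path(objects_dir) / oid[:2] / oid[2:]
    if not path.exists():  # same content, same name: nothing to do
        path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp = tempfile.mkstemp(dir=path.parent)
        with os.fdopen(fd, "wb") as f:
            f.write(zlib.compress(payload))
        os.replace(tmp, path)  # rename into place so readers never see a partial file
    return oid


def read_blob(objects_dir, oid):
    """Look an object up by id and return its content without the header."""
    path = Path(objects_dir) / oid[:2] / oid[2:]
    payload = zlib.decompress(path.read_bytes())
    _header, _, data = payload.partition(b"\0")
    return data


if __name__ == "__main__":
    objects = tempfile.mkdtemp()
    oid = write_blob(objects, b"hello world\n")
    print(oid, read_blob(objects, oid))
```

The "database" features fall out of the layout: the hash is the primary key, deduplication is free, and lookup is a single path computation that the filesystem's own directory index handles.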
Well, no. There are some exceptions, but most databases add a whole lot of bloat you don't necessarily need. Simple files can be just as fast as a big database, or even faster, and speed is the most important metric to me.
Even if that were true, I'd still tell people to start with flat files for a new project. It's like the advice to do a job yourself before hiring for it: you'll be better equipped to judge how well a database is managing your data if you've already done it yourself.
I find it interesting how easy it is to criticize a functioning system for failing to conform to some hypothetical ideal. The idea that HN doesn't run on a database is no more astounding than the idea that much of Facebook runs on PHP. The design of real systems is often messy and imperfect, and deviates from the ideal because of the need to optimize one factor or another that may not be obvious.
Facebook doesn't run on plain PHP and hasn't for a while... It's PHP compiled to C++ (HipHop).