
Comment by throwaway848

10 years ago

PG was quite proud of having just used the file system as a database with Viaweb, claiming that "The Unix file system is pretty good at not losing your data, especially if you put the files on a Netapp." His entire response is worth reading in light of the above article, if only to get a taste of how simultaneously arrogant and clueless PG can be: http://www.paulgraham.com/vwfaq.html

PG and company still think this is a great idea, because the software this forum runs on apparently also does the file system-as-database thing.

No wonder Viaweb was rewritten in C++.

EDIT: to those downvoting, my characterization of PG's response is correct. Its arrogance is undeniable, with its air of "look how smart I am; I see through the marketing hype everyone else falls for," as is its cluelessness, with PG advocating practices that will cause data loss. Viaweb was probably a buggy, unstable mess that needed a rewrite.

You're spending your Sunday ragging on someone with a throwaway and being hypervigilant about downvotes on your throwaway. Get a grip, mate.

  • being hypervigilant about downvotes on your throwaway

    Well, FWIW, it's working. At this instant his total karma is +10, he only has 2 posts, and the 2nd post is slightly greyed. So that means that his original comment is now in the range of +10.

    Sadly, the fact that I'm replying here probably means that I need to get a life!

It's clear you're ignorant of the capabilities netapp filers had, even back then.

WAFL uses copy-on-write trees uniformly for all data and metadata. No currently live block is ever modified in place. By default a checkpoint/snapshot is generated every 10 seconds. NFS operations since the last checkpoint are written into NVRAM as a logical recovery log. Checkpoints are advanced and reclaimed in a double-buffering pattern, ensuring there's always a complete snapshot available for recovery. The filer hardware also has features dedicated to operating as a high-availability pair with nearly instant failover.
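For anyone who hasn't seen copy-on-write trees before, here is a toy sketch of the general idea in Python. It is not WAFL's on-disk format or anything taken from NetApp; `Node`, `insert`, and `lookup` are names made up for illustration. The point is only that an update builds new nodes along the path to the change and leaves every existing node alone, so the old root stays a complete, consistent snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node (hypothetical; stands in for an on-disk block)."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int, value: str) -> Node:
    """Return a new root. Existing nodes are never modified in place."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        # Copy this node, pointing at a new left subtree; the right one is shared.
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    # Same key: a new node replaces the value, children are shared.
    return Node(key, value, root.left, root.right)


def lookup(root: Optional[Node], key: int) -> Optional[str]:
    """Plain binary-search lookup from whichever root (snapshot) is given."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None
```

Keeping a reference to the old root is all it takes to keep a snapshot: a reader holding it sees the tree exactly as it was, while untouched subtrees are shared between the old and new versions rather than copied.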

The netapp appliances weren't/aren't perfect, but they are far better than you're assuming. They were designed to run 24/7/365 on workloads close to the hardware bandwidth limits. For most of the 2000s, buying a pair of netapps was a simple way to just not have issues hosting files reliably.

Perhaps you should take your own advice and dial back the arrogance a bit.

> No wonder Viaweb was rewritten in C++

Or maybe this massive non-sequitur is reason enough for downvotes. Also note this from the link you provided:

> But when we were getting bought by Yahoo, we found that they also just stored everything in files

Trusting Netapp is way better than 99% of the status quo. It has had ZFS-style integrity for a pretty long time.

I think PG was pretty much right in his judgement. Any file system is going to be pretty good on a reliable block store, such as Netapp.

>clueless

I was curious why pg did it that way. Here's a brief comment from him:

>What did Viaweb use?

>pg 3160 days ago

>Keep everything in memory in the usual sort of data structures (e.g. hash tables). Save changes to disk, but never read from disk except at startup.

So similar to what Redis does today, but a decade before Redis and likely faster than the databases of the day. Could have been important with loads of users editing their stores on slow hardware. Anyway it worked, they beat their 20-odd competitors and got rich. I'm sceptical that it was a poor design choice.
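To make the pattern in pg's comment concrete, here is a minimal Python sketch of "keep everything in memory, save changes to disk, only read at startup". This is my own guess at the shape of such a design, not Viaweb's code (which I haven't seen); `LoggedStore`, `demo`, and the one-JSON-record-per-line log format are all assumptions made for the example.

```python
import json
import os
import tempfile


class LoggedStore:
    """In-memory dict backed by an append-only change log (hypothetical sketch)."""

    def __init__(self, path):
        self.data = {}
        good_end = 0  # byte offset just past the last complete record
        # The only time the disk is read: replay the log once at startup.
        if os.path.exists(path):
            with open(path, "rb") as f:
                for raw in f:
                    if not raw.endswith(b"\n"):
                        break  # torn final record, e.g. from a crash mid-write
                    try:
                        key, value = json.loads(raw)
                    except ValueError:
                        break  # unreadable record; stop replaying here
                    self.data[key] = value
                    good_end += len(raw)
            # Drop any torn tail so new records start on a clean line.
            os.truncate(path, good_end)
        self.log = open(path, "ab")

    def get(self, key, default=None):
        # Reads are served from memory and never touch the disk.
        return self.data.get(key, default)

    def set(self, key, value):
        # Save the change to disk first, then update the in-memory table.
        record = json.dumps([key, value]) + "\n"
        self.log.write(record.encode("utf-8"))
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def close(self):
        self.log.close()


def demo():
    """Write two changes, simulate a restart, and return the recovered value."""
    path = os.path.join(tempfile.mkdtemp(), "store.log")
    store = LoggedStore(path)
    store.set("price", 10)
    store.set("price", 12)
    store.close()
    reopened = LoggedStore(path)  # startup replay rebuilds the dict
    try:
        return reopened.get("price")
    finally:
        reopened.close()
```

The log only ever grows in this sketch; a real system would presumably compact it or write periodic full snapshots, and how durable each `fsync` really is depends on the storage underneath, which is where the Netapp discussion comes in.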

I downvoted you for lowering the tone of the conversation with personal insults. If you had just said something like 'note that PG's advice about using the Unix file system as a database is not now considered best practice,' that would have been fine.

The article implies very little about the aptness of 'using the file system as a database' for a specific application.

Your comment begs the question: Do you fall for the marketing hype? And if not, then do you think you should keep quiet about stuff that works?

At the time, IMHO, PG was indeed smart to be one of the few using FreeBSD as opposed to whatever the majority were using.

But he has admitted they struggled with setting up RAID. They were probably not too experienced with FreeBSD. I am sure they had their fair share of troubles.

PG's essays and his taste in software are great and the software he writes may be elegant, but that does not necessarily mean it is robust.

Best filesystem I have experienced on UNIX is tmpfs. Backing up to permanent storage is still error-prone, even in 2015.

  • > At the time, IMHO, PG was indeed smart to be one of the few using FreeBSD as opposed to whatever the majority were using.

    Why was it a better OS choice at the time than, say, Solaris or IRIX or BSDI?

    • I would have if it served my needs, because those others required special hardware and/or hefty licensing costs.