Comment by ensiferum
5 years ago
Because correctness is important to me. I don't want my programs to silently go about in a buggy state producing incorrect results in a corrupted state.
5 years ago
Because correctness is important to me. I don't want my programs to silently go about in a buggy state producing incorrect results in a corrupted state.
Not all bugs put the whole program in a corrupted state, especially if your program is written to be mostly stateless. A common pattern is that a state change is tried, it fails because of a bug or some other error, it is reverted, and an error is shown to the user. I would call this a robust program. Of course, not every error can be recovered this way, but it is very often possible (assuming we are talking about memory-safe languages; otherwise, the balance of probabilities is entirely the other way around).
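To make the try/revert pattern concrete, here is a minimal sketch in Python. The names `try_change` and `buggy_rename` are made up for illustration, and it assumes the state is a plain dict that is cheap enough to snapshot with a deep copy; a real system would more likely rely on a database transaction or immutable data.

```python
import copy
from typing import Callable, Optional


def try_change(state: dict, change: Callable[[dict], None]) -> Optional[str]:
    """Apply `change` to `state` in place; on any error, restore the old state.

    Returns None on success, or an error message to show the user.
    """
    snapshot = copy.deepcopy(state)  # assumption: state is small enough to copy
    try:
        change(state)
        return None
    except Exception as exc:
        # Revert every partial mutation the failed change made.
        state.clear()
        state.update(snapshot)
        return f"Operation failed: {exc!r}"


def buggy_rename(state: dict) -> None:
    # Hypothetical operation that mutates state and then hits a bug halfway.
    state["title"] = "Draft 2"
    state["tags"].append("renamed")
    raise KeyError("owner")  # simulated programming error
```

Calling `try_change(state, buggy_rename)` returns an error message and leaves `state` exactly as it was before the call, so the program can keep running.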
In practice you always have mutated state in a complex system and not everything moves in transactional steps.
People seem scared to dump core but I find that doing it makes my programs much more robust and also simpler. I have no muddled logic to "deal with" bugs in the program itself. They simply will abort the program.
I replied the same in another thread, but why not dump the core of all the processes on the system, instead of just the one that encountered the error? And why stop on this system, and not dump core on other systems that were communicating with this one over a network?
The process is one boundary of isolation, and you are making a bet that the corruption has not crossed this boundary. You can take the same bet with subcomponents of the process, just as in larger systems you may actually prefer to reboot the whole machine or even kill it and spin up another.
This all depends on the architecture and technology you are working with. If a user has input some bad data that I didn't think to validate (user inputs 11 in a page number field, when there are 5 pages in total), an operation is initiated on that data (user presses the 'go' button), and that operation is known to be stateless, when it encounters an error (ArrayIndexOutOfBounds), we can safely abort the operation, log the stack trace, and signal an 'internal server error' to the user without having to kill the whole process.
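The page-number scenario could look roughly like the sketch below. It is in Python, so the out-of-range access raises `IndexError` rather than Java's `ArrayIndexOutOfBoundsException`. `PAGES`, `render_page`, and `handle_request` are hypothetical names, and the sketch assumes the operation really is stateless, as stated above.

```python
import logging
from typing import Tuple

logger = logging.getLogger("pager")

# Hypothetical data: five pages in total.
PAGES = ["page 1", "page 2", "page 3", "page 4", "page 5"]


def render_page(page_number: int) -> str:
    # Deliberately missing validation: page_number is trusted as-is.
    return PAGES[page_number - 1]


def handle_request(page_number: int) -> Tuple[int, str]:
    """Run one stateless operation; a bug aborts the operation, not the process."""
    try:
        return 200, render_page(page_number)
    except Exception:
        # logger.exception records the message together with the stack trace.
        logger.exception("unhandled error rendering page %r", page_number)
        return 500, "Internal Server Error"
```

With this, `handle_request(11)` returns `(500, "Internal Server Error")`, and a later `handle_request(3)` still returns `(200, "page 3")`, because nothing outside the failed call was mutated.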
Not to mention, in a program with many non-transactional state changes, dumping core could be a source of persistent corruption in itself, if another thread was doing something as simple as, for example, writing a JSON file.