Comment by tsimionescu

5 years ago

Why dump core when you can log the bug and continue? Sure, in development we want things to fail fast and loud, but when deployed with a customer, I don't want my whole program to crash because there is one obscure code path that has a problem.

And even for conditions that the program is expected to handle, 99.9% of the time all it can do is notify the user and ask for guidance, which means that the error must be bubbled up from a networking or storage layer all the way to the presentation layer - a perfect task for exceptions or something like an error monad.

The only problem with exceptions or error monads is that they get tricky in the presence of resources that need to be released, and even that is well handled with patterns like RAII.
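A minimal sketch of that bubbling, using Python's `with` statement as a stand-in for RAII (Python has no deterministic destructors, so this is an analogue rather than RAII proper). The layer names, `StorageError`, and `open_handle` are all hypothetical, chosen only for illustration:

```python
from contextlib import contextmanager

released = []  # records cleanup so the example can be checked


class StorageError(Exception):
    """Hypothetical error raised by the storage layer."""


@contextmanager
def open_handle(name):
    # Stand-in for a file, socket, or lock (illustrative only).
    try:
        yield name
    finally:
        released.append(name)  # runs even when an exception passes through


def storage_write(data):
    with open_handle("disk") as handle:
        raise StorageError(f"cannot write {len(data)} bytes to {handle}")


def business_save(data):
    # The middle layer has no error-handling code; the exception passes through.
    storage_write(data)


def presentation_save(data):
    # Only the top layer decides what the user sees.
    try:
        business_save(data)
        return "Saved."
    except StorageError as exc:
        return f"Save failed: {exc}. Retry?"


message = presentation_save(b"hello")
```

The handle is released on the way up, and only `presentation_save` needs to know how to talk to the user.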

> Why dump core when you can log the bug and continue?

I see from your replies what you're trying to say. If an error occurs, most likely you want the entire operation to abort -- which, depending on the program, doesn't necessarily mean the whole program.

For example, if I have a GUI app and the "save" operation fails, I typically roll that back right to the event loop of the application; the user gets an error and can retry the save.

For other types of applications, killing the whole process is ending the operation.
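A rough sketch of the event-loop case, assuming a toy loop rather than any real GUI toolkit. `run_event_loop` and the `save` handler are made-up names; the point is only that the `try` sits at the boundary of one user-initiated operation:

```python
def run_event_loop(events, handlers):
    """Dispatch each event; a failing handler aborts only that operation."""
    shown_to_user = []
    for event in events:
        try:
            handlers[event]()
            shown_to_user.append(f"{event}: ok")
        except Exception as exc:  # boundary of one user-initiated operation
            shown_to_user.append(f"{event}: failed ({exc}), please retry")
    return shown_to_user


attempts = {"save": 0}


def save():
    # Hypothetical handler: the first attempt fails, the retry succeeds.
    attempts["save"] += 1
    if attempts["save"] == 1:
        raise OSError("disk full")


log = run_event_loop(["save", "save"], {"save": save})
```

The first "save" fails and is reported, the loop keeps running, and the retry goes through.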

  • Yes, exactly!

    And at the other end of the spectrum, there are even systems where it makes sense to go further than killing a single process, and kill the whole container or even VM where a buggy condition was encountered.

> I don't want my whole program to crash because there is one obscure code path that has a problem.

If that one obscure code path corrupted my state, I want to limit the incorrect actions that the software takes based on that state.

This "want" of mine is to be balanced with all the other things I want out of the program, and the relative weights will vary by context... but it is often the case that continuing erroneously risks more harm than simply falling over.

  • That is indeed a very common need in a non-memory-safe language.

    Still, not all bugs have unbounded impact. As long as memory is not corrupted, things like off-by-one errors and null pointer exceptions can often be safely recovered from by simply reverting the operation that hit the error back to some kind of safe point (such as the last user interaction, or the last thread start).

    Edit: spelling.

    • "dime kind of safe point"?

      That's the whole idea of exceptions. Give up on some chunk of what you were doing, and recover to a previous state.

      This works best when you have some notion of a transaction, and can get back to the state before the transaction started. This is what "ROLLBACK" in SQL does.
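To make the transaction idea concrete, here is a small sketch using Python's standard `sqlite3` module with an in-memory database. The `accounts` table, the `transfer` function, and the simulated bug are assumptions made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; undo everything if any step fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        if amount > 50:
            # Simulated bug that strikes after the first UPDATE has run.
            raise ValueError("simulated bug halfway through the transfer")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.commit()
    except Exception:
        conn.rollback()  # back to the state before the transaction started
        raise


try:
    transfer(conn, "alice", "bob", 80)
except ValueError:
    pass  # a real program would report the failure to the user here

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, the half-applied debit is gone and both balances are what they were before the operation began.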


    • So you're basically suggesting writing logic to deal with bugs, i.e. for cases when the program has violated its own logical constraints? For me a bug is like a division by zero: the program violates its own logic and the only logical conclusion is termination. I find that fixing is often much simpler and faster when there is a loud bang rather than some obscure, unwanted, incorrect result.


Because correctness is important to me. I don't want my programs to silently go about in a buggy state producing incorrect results in a corrupted state.

  • Not all bugs put the whole program in a corrupted state, especially if your program is written to be mostly stateless. A common pattern is that a state change is tried, it fails because of a bug or some other error, it is reverted, and an error is shown to the user. I would call this a robust program. Of course, not every error can be recovered from this way, but it is very often possible (assuming we are talking about memory-safe languages; otherwise, the balance of probabilities is entirely the other way around).
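One way to sketch the try, fail, revert pattern for in-memory state is a snapshot taken before the change. `Document`, `apply`, and `buggy_change` are hypothetical names, and the approach assumes that everything the change can touch lives inside the snapshotted state:

```python
import copy


class Document:
    def __init__(self):
        self.state = {"title": "draft", "tags": []}

    def apply(self, change):
        """Try a state change; restore the previous state if it raises."""
        snapshot = copy.deepcopy(self.state)
        try:
            change(self.state)
            return None
        except Exception as exc:
            self.state = snapshot  # revert the partial mutation
            return f"Could not apply change: {exc!r}"


def buggy_change(state):
    state["tags"].append("new")  # partial mutation...
    state["tags"][5]             # ...then an indexing bug raises IndexError


doc = Document()
error = doc.apply(buggy_change)
```

The partially appended tag is discarded along with the failed change. Anything mutated outside `self.state` (files, sockets, globals) would not be covered by this kind of snapshot.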

    • In practice you always have mutated state in a complex system and not everything moves in transactional steps.

      People seem scared to dump core, but I find that doing it makes my programs much more robust and also simpler. I have no muddled logic to "deal with" bugs in the program itself. They simply abort the program.
