Views on Error Handling

5 years ago (dannas.name)

92 comments

dannas

The very concept of "error handling" is absurd.

There are no errors, just unnecessary abstractions and control flow hacks. You try to open a file; either you can or you cannot, and both possibilities are equally likely and must be handled by normal control flow in your program. Forcing an artificial asymmetry in the treatment of both cases (as championed by the error handling people) adds ugly complexity to any language that tries to do so.

gary_0 5 years ago
The problem of error handling arises due to type systems and function call semantics.
data = open("/var/foo")
Most programmers are going to expect 'data', the result returned by 'open', to be a type that allows reading/writing the file. The programmer expects /var/foo to exist, or they might have checked before calling 'open', but even that's not foolproof.
Historically, a failure might just set 'data' to an invalid value (like 0 or null) but that ended up being a bad idea. And we needed some way to return more information about the error. So we started doing this:
error = open(data, "/var/foo")
But this mainly just complicated things. Is 'data' input or output? The function doesn't return its actual output. And it's still possible to ignore 'error', so 'data' is still potentially undefined.
Then exceptions were invented so we could use proper function call styles again, and the program wouldn't go into an undefined state. Instead, the error could be handled with separate logic, or the program would halt if it was ignored. This was far from a perfect solution, though.
Then sum types entered the mainstream, so 'data' had well-defined ways of returning something other than the expected result. But that resulted in a lot of competing conventions and stylistic decisions for what to do when 'data' is an error type, that haven't quite been settled yet.
- jjjensen90 5 years ago
  
  Again referring to Go's imperfect but interesting handling of this problem, the style in Go would be:
  data, err := open("/var/foo") if err != nil....
  In fact, the compiler will make you deal with both values after assignment unless you explicitly ignore one of the return values.
TeMPOraL 5 years ago

This way a combinatorial explosion lies.
Each library/system call you do can result in a set of possible consequences. We usually don't care about them equally, though: in fact, in the file example, 99% of the time we care about whether the file was opened or not - and in the latter case, we don't need to know why. So the asymmetry is already introduced by the intent of the program - there's usually only one path of execution we want; other ones are distractions. Error handling exists to express that asymmetry of caring at tool level.
wvenable 5 years ago
I completely disagree. A program or function is designed to perform an operation. If that operation requires the contents of the file, then the program cannot continue unless it successfully reads the contents of the file. There is already a natural asymmetry. If you cannot open the file, there isn't any more to do.
- Mikhail_Edoshin 5 years ago
  
  An "operation" is not something inherent to the code. If we look at a function that may get what we call an error, we'll see that in either case it completes and returns control to the caller. We label one such result 'success' and another 'failure' because we also have an idea of purpose of the function, but the purpose does not exist at the code level. Maybe this is why we struggle with errors.
  
  6 replies →
- enriquto 5 years ago
  
  > I completely disagree.
  Disagreeing is alright, but here you don't really do, do you? I can translate the paragraph you have written into pseudocode:
  > If that operation requires the contents of the file, then the program cannot continue unless it successfully reads the contents of the file. (...) If you cannot open the file, there isn't any more to do.
  if (file opening fails): stop doing things else: continue with your operations
  This is just a regular "if-else" that can be done with any programming language. The behavior of your program when the file cannot be opened is part of the specification; just as its behavior when it can be opened. I agree with you on that, and I add that the desired behavior can always be implemented using regular control flow constructions. You do not need a specific language construct for "errors", as you have proven by the algorithm that you have described in your text.
  
  1 reply →
renox 5 years ago

Ok, then replace all additions in your program with a function returning either an error or the result. Same with logging statements. And don't ignore the errors.. Is your program still readable?
camgunz 5 years ago

I agree. You don’t open a file, you try to open a file. When you get a handle back, that’s your library skipping steps.
dllthomas 5 years ago
Can you name a language/ecosystem that gets it right?
- dinosaurdynasty 5 years ago
  
  Rust seems pretty close to this (with Result<T, E>), though there is also the panic system.
  
  5 replies →
- jjjensen90 5 years ago
  
  Though I don't think it's perfect, languages like Go that treat errors as simply another type and you check as a return value do get closer to treating errors and success symmetrically versus an exception throwing and typing system like some other languages.
  
  1 reply →
- vvanders 5 years ago
  
  Erlang / OTP.

phoe-krk 5 years ago

There seems to be no mention of the Common Lisp condition system, which allows for handling of exceptional situations actually done right. Is this omission purposeful?

See https://news.ycombinator.com/item?id=23843525 for a recent long discussion about my upcoming book on the topic. (Disclosure: this is a shameless plug.)

chrismorgan 5 years ago

Conditions are certainly technically fascinating. I was introduced to them back when Rust used them for I/O error handling. But Rust ~0.8 dropped conditions, because people found them much more confusing than Result<T, E>-based error handling, and almost no one was actually using any of the power of conditions.
Broadly speaking, conditions can be implemented as a library feature, so you can reintroduce such things in cases where the extra functionality is compelling (though now users won’t be familiar with it, so it’ll be much harder to justify).
Other programming languages have been tending in the direction of implementing generators and async/await, which can be used to more smoothly implement some of the key concepts of conditions. (They’re not the same by any means, but related.)
dannas 5 years ago
I've collected references to error handling but - I have to shamefully admit - have never encountered Common Lisp's condition system.
I'll take the time to read up on it properly, but from a quick glance it seems to me to be in the category of non-local transfer of control with a co-routine flavour.
It looks powerful, but I get the sense that a lot of language designers are on purpose trying to restrict the powers of error handling. So returning sym types or error codes are simpler than throwing exceptions which - again looks to me - to be simpler than allowing transfer of control to be decided at run time as in the condition system.
Again, very interesting. And thank you for making me aware of its existence.
- phoe-krk 5 years ago
  
  > with a co-routine flavour.
  Kind-of-but-not-exactly. There are no coroutines whatsoever; the main technical defining point is that the stack is not unwound when the error happens, but it is wound further. Some code is executed that then searches the dynamic environment for matching error handlers, which are executed sequentially; these are then capable of executing arbitrary code provided from earlier in the stack in form of invoking so-called restarts; both handlers and restarts are also capable of performing transfers of control to any point on the stack that was properly annotated as being available for such.
fmakunbound 5 years ago

I thought the same thing. If you're surveying error handling approaches, you've got to include Common Lisp's condition system with it's out of band signals and restarts and so on.
pjmlp 5 years ago
There is now a book about them.
https://www.apress.com/gp/book/9781484261330
- phoe-krk 5 years ago
  
  The HN link discussing that book is literally what I linked above!
  Signed, the author of that book. :)
  
  1 reply →

gumby 5 years ago

> but without the downsides of the costly C++ memory deallocation on stack unwinding.

I.e. I don’t care about restoring the program to a known state when handling an error (memory deallocation is just one case of processing unwind blocks; locks need releasing,my file handles returned to kernel etc). This really only makes sense when your error “handling” is merely printing a user friendly error message and exiting.

psykotic 5 years ago
(I'm the person he was quoting in the article.)
When I use setjmp/longjmp error handling I almost always want abort semantics but at the library level rather than at the OS process level. [1] Where applicable it's the simplest, most robust model I know. You have a context object that owns all your resources (memory blocks, file handles, etc) which is what lets you do simple and unified clean-up rather than fine-grained scoped clean-up in the manner of RAII or defer. You can see an example in tcc here:
https://github.com/LuaDist/tcc/blob/255ba0e8e34f999ee840407c...
https://github.com/LuaDist/tcc/blob/255ba0e8e34f999ee840407c...
[1] It goes without saying that a well-written library intended for general use is never allowed to kill the process. This presents a conundrum in writing systems-level C libraries. What do you do if something like malloc fails in a deep call stack within the library? Systems-level libraries need to support user-provided allocation functions which often work out of fixed-size buffers so failure isn't a fatal error from the application's point of view. You'd also want to use this kind of thing for non-debug assert failures for your library's internal invariants.
This style of setjmp/longjmp error handling works well for such cases since you can basically write the equivalent of xmalloc but scoped to the library boundary; you don't have to add hand-written error propagation to all your library functions just because a downstream function might have such a failure. I'm not doing this as a work-around for a lack of finally blocks, RAII or defer statements. It's fundamentally about solving the problem at a different granularity by erecting a process-like boundary around a library.
- gumby 5 years ago
  
  See my response to a parallel comment from dannas.
  I can see some minor corner cases where it could be worthwhile but the mental overhead isn't worth it.
  I've written plenty of realtime code but spending a lot of time on the code running in the interrupt handlers is mentally exhausting and error prone; I do that when I have no choice. Likewise I've written a lot of assembly code but it's been decades since I wrote a whole program that way -- I don't have enough fingers to keep track of all the labels and call paths.
  E.g. just because c++ has pointers doesn't mean I use them very often. >90% of the cases can be references instead.
dannas 5 years ago
More context to that quote: >Per Vognsen discusses how to do course-grained error handling in C using setjmp/longjmp. The use case there were for arena allocations and deeply nested recursive parsers. It’s very similar to how C++ does exception handling, but without the downsides of the costly C++ memory deallocation on stack unwinding.
I have never used setjmp/longjmp myself. And I agree with you that my first instinct would be to use it in the similar manner as in many GUI programs: they have have a catch statement in the message loop that shows a dialog box of the thrown exception. You just jump to a point where you print a user friendly error message and exit.
But I still can imagine use cases where you've isolated all other side effects (locks, shared memory, open file handles) and are just dealing with a buffer that you parse. Has anyone used setjmp/longjmp for that around here?
Given your many years in the field and Cygnus background I guess you've used it a few times? Do you happen to have any horror stories related to it? :-)
- gumby 5 years ago
  
  I hate setjmp/longjmp and have never needed it in production code.
  Think about how it works: it copies the CPU state (basically the registers: program counter, stack pointer, etc). When you longjmp back the CPU is set back to the call state, but any side effects in memory etc are unchanged. You go back in time yet the consequences of prior execution are still lying around and need to be cleaned up. It's as if you woke up, drove to work, then longjmped yourself back home -- but your car was still at work, your laptop open etc.
  Sure, if you're super careful you can make sure you handle the side effects of what happened while the code was running, but if you forget one you have a problem. Why not use the language features designed to take care of those problems for you?
  This sort of works in a pool-based memory allocator.
  The failures happen three ways: one is you forget something and so you have a leak. The second is that you haven't registered usage properly so have a dangling pointer. Third is by going back in time you lose access to and the value of prior and/or partial computation.
  If you use this for a library, and between the setjmp and longjmp is entirely in a single invocation you can sometimes get away with it. But in a thing like a memory allocator where the user makes successive calls, unless you force the user to do extra work you can't be sure what dependencies on the memory might exist. If your library uses callbacks you can be in a world of hurt.
  Trying to keep track of all those fiddly details is hard. C++ does it automatically, at the risk of potentially being more careful (e.g. deallocating two blocks individually rather than in one swoop -- oh, but that language has an allocator mechanism precisely to avoid this problem). The point is the programmer doesn't have to remember anything to make it work.

platz 5 years ago

> Composing Errors Codes ... Instead of sprinkling if statements, the error handling can be integrated into the type ... The check for errors is only done once.

That is only a superficial level of composition, if one can call it that at all, that doesn't account for actual composition of errors of different types. The example provided is just encapsulation and therefore orthogonal to the issue of error handling approaches. i.e. in the example, the error handling code is only centralized, not composed.

dannas 5 years ago
Can you make the distinction between "centralization" vs "composition" of errors?
Do you mean the fact that there must be some if-statement within the API that reacts to the different errors and sets a flag used by the Err() method?
Is you opinion that "composition of errors" always requires special syntactic elements such as the match statement?
The code from the blog section:
scanner := bufio.NewScanner(input) for scanner.Scan() { token := scanner.Text() // process token } if err := scanner.Err(); err != nil { // process the error }

nemothekid 5 years ago

The only downside of error codes via Sum types (Rust) seems to be, according to the article, is performance. It then claims that Checked Exceptions are the solution (at least according to Joe Duffy).

Maybe I'm naive to how exceptions are actually implemented, but it seems to me that both a checked exception and Sum Type would incur the same overhead, a single branch to make sure things haven't exploded.

barrkel 5 years ago

If you want to treat your error result as a first class value, and transport it around, then your sum type can't use the same implementation as exceptions, which can use data-driven stack unwinding to have zero cost in the success case, the data being generated by the compiler and consumed by a stack unwinder after it has been invoked by an exception raise.
saurik 5 years ago

As exceptions are an abstraction you can implement them in many ways; one of those is "the same secondary return code error check as you would do manually", but if you assume "errors are extremely rare" (which I assert is fair: people who disagree generally point to a tiny class of things that I would argue aren't errors in the first place, such as "key not found in map" and "file not found on disk") you can use implementations that have literally "zero cost" for success but instead compile all of the exception unwind logic (catch statements and deconstructors) as continuations of the original functions (causing some modest binary code bloat, though the compiler can sometimes avoid this being noticeable) and then do table lookups (which are slow, but not necessarily ridiculous) to find the right on-error unwind target given an on-success function return pointer.
Essentially, I would argue that error signaling is important enough and common enough that it deserves attention by the compiler in the same sense that many of the other things we provide syntax for (such as traits or inherentence) are things which developers can type naive manual implementations of with basic tools (such as switch statements or dictionaries or dragging around lots of function pointers), but if you can abstract it in a way such that the semantics are available to the compiler you can come up with much better ways to handle the problem (such as vtables or polymorphic dispatch caches) for a given set of tradeoffs (such as low memory usage, low construction cost, consistent latency, etc.). If everyone is implementing the feature themselves manually in the code then you have lost any real ability to make great optimizations.
(Note that you don't necessarily have to have it be syntax to do this: you can also have a language such as Haskell--where notably these Either-style errors are usually cited as being from--where they do it in the language but abstracted everything an additional level higher, letting you define a lot of these flow control concepts in terms of a monad, so then downstream users use "do" notation to feel like custom syntax and the monad's bind operator provides a central chokepoint on what was otherwise a bunch of boilerplate. You sometimes--not always--can then do optimizations across the entire program of that shared abstraction. The way languages like Rust and Go are handling this, without support for monads, simply precludes anything other than attempts at reverse engineering semantics from the code, which is ridiculous.)

identity0 5 years ago

The obvious solution (in C++) is not to use exceptions at all, but make your own `error` and `expected<T>` class, and just add [[nodiscard]] to them. All the benefits of Go-style errors, you’ll never forget to check the error, and there is very little runtime overhead. If you pass the error as an out parameter then there is zero runtime overhead on success.

dannas 5 years ago
Speaking of C++ exceptions: Andrei Alexandruesco has investigated the performance impact of replacing exceptions with error codes. Dave Cheney made a summary of Andreis points in https://dave.cheney.net/2012/12/11/andrei-alexandrescu-on-ex...
* The exceptional path is slow (00:10:23). Facebook was using exceptions to signal parsing errors, which turned out to be too slow when dealing with loosely formatted input. Facebook found that using exceptions in this way increased the cost of parsing a file by 50x (00:10:42). No real surprise here, this is also a common pattern in the Java world and clearly the wrong way to do it. Exceptions are for the exceptional. * Exceptions require immediate and exclusive attention (00:11:28). To me, this is a killer argument for errors over exceptions. With exceptions, you can be in your normal control flow, or the exceptional control flow, not both. You have to deal with the exception at the point it occurs, even if that exception is truly exceptional. You cannot easily stash the first exception and do some cleanup if that may itself throw an exception.
- josefx 5 years ago
  
  > You cannot easily stash the first exception and do some cleanup if that may itself throw an exception.
  You can stash/rethrow exceptions since c++11 with an exception pointer if you really need to.
  https://en.cppreference.com/w/cpp/error/exception_ptr
asdfasgasdgasdg 5 years ago

There is still runtime overhead in that you have to check whether you succeeded. The best possible scenario is if the error source knows exactly what code to jump to in the error case, and the calling code can assume that no error occurred if it is running. So in that sense it can be done better. But I'm not sure how material this difference is in light of correct branch prediction in the success path.

ridiculous_fish 5 years ago

> Swift does not AFAICT provide mechanisms for enforcing checks of return types

Swift does this by default! You have to annotate (via @discardableResult) those functions which should not warn.

But of course try/catch is used in Swift more often.

dannas 5 years ago
Oh, that was sloppy of me. I should have read up more on Swift (I've never used it myself).
While I have your attention: A big thank you for Fish shell!
And related to the current subject: How does fish handle errors? A quick skim found some constants that are returned upon failure, such as this case for disown: https://github.com/fish-shell/fish-shell/blob/master/src/bui...
What trade-offs did you face when designing error handling for your shell?
- ridiculous_fish 5 years ago
  
  Thank you for the great article! You ask a good question.
  Shells are rarely CPU bound, so some perf overhead is acceptable. But shells may be used to recover badly broken systems. If fork or pipe fails, most programs are OK to abort, but a shell may be the user's last hope, so has to keep going.
  For example, if pipe() fails, it's probably due to fd exhaustion. If your system is in that state, the best thing to do is immediately unwind whatever is executing, and put the user back at the prompt. fish uses ad-hoc error codes (reflecting its C legacy) instead of exceptions, though it uses RAII for cleanup. Your question made me realize that fish needs a better abstraction here; at least use `nodiscard`.
  The story is different for script errors [1]. If the user forgets to (say) close a quote in a config file, fish will print the line, a caret, and a backtrace to the executing script. A lot of effort has gone into providing good error messages with many special cases detected. The parser also knows how to recover and keep going; I think Fabien would approve.
  1: https://github.com/fish-shell/fish-shell/blob/225470493b3cd1...
  
  1 reply →
saagarjha 5 years ago

> But of course try/catch is used in Swift more often.
FWIW I find actual exception usage rare aside from automatic error out parameter to exception conversion by the Clang importer when bridging to Objective-C code.
TeMPOraL 5 years ago

While we're on the topic - C++ doesn't do that by default, but since C++17 you can enable enforcing it on a case-by-case basis - you can mark functions, or even enums and structures, as [[nodiscard]], and then the compiler will issue a warning if you don't use the return value of that function (or whatever function that returns a class or enum marked as [[nodiscard]]).

ensiferum 5 years ago

There are 3 separate things that each require their different approach.

- errors i.e bugs made by programmer - logical "error" conditions that the program is expected to handle for example network connection failed or user input failed - unexpected error conditions that typically boild down to resource allocation errors, socket could not be allocated or memory allocation failed etc.

In my experience all of these are best handled with a different tool.

For bugs use an assert that dumps the core and leaves a clear stack trace. For conditions that the program needs to handle use error codes. And finally for the truly unexpected case use exceptions .

tsimionescu 5 years ago
Why dump core when you can log the bug and continue? Sure, in development we want things to fail fast and loud, but when deployed with a customer, I don't want my whole program to crash because there is one obscure code path that has a problem.
And even for conditions that the program is expected to handle, 99.9% of the time all it can do is notify the user and ask for guidance, which means that the error must be bubbled up from a networking or storage layer all the way to the presentation layer - a perfect task for exceptions or something like an error monad.
The only problem with exceptions or error monads is that they get tricky in the presence of resources that need to be released, and even that is well handled with patterns like RAII.
- wvenable 5 years ago
  
  > Why dump core when you can log the bug and continue?
  I see from your replies what you're trying to say. If an error occurs, most likely you want the entire operation to abort -- that doesn't necessarily mean the whole program depending on the program.
  For example, if I have a GUI app and the "save" operation fails and I typically roll that back right to the event loop of the application and the user gets an error and they can retry the save.
  For other types of applications, killing the whole process is ending the operation.
  
  1 reply →
- dllthomas 5 years ago
  
  > I don't want my whole program to crash because there is one obscure code path that has a problem.
  If that one obscure code path corrupted my state, I want to limit the incorrect actions that the software takes based on that state.
  This "want" of mine is to be balanced with all the other things I want out of the program, and the relative weights will vary by context... but it is often the case that continuing erroneously risks more harm than simply falling over.
  
  6 replies →
- ensiferum 5 years ago
  
  Because correctness is important to me. I don't want my programs to silently go about in a buggy state producing incorrect results in a corrupted state.
  
  6 replies →
identity0 5 years ago
For the third case it’s better to just abort. Tell the user to get more RAM or something. What are you supposed to do when you’re out of memory? Catch the exception? Then what?
Related, I always find it funny when C programmers write `if (malloced == NULL) return NULL;` Either you’re going to forget that this can happen and dereference null (in which case it’s just better to abort the program immediately) or the caller will check this and then close the program. If it doesn’t, the next malloc will be null anyways, and the problem repeats. Just call abort().
- ensiferum 5 years ago
  
  Well memory failure checking is usually put in the "can't do shit" category which isn't necessarily true. Both in C and in C++ bad_alloc or null from mallon indicate that the memory manager could not find the memory. This may or may not mean that your application has overcommitted memory in the OS level. Completely depends on the actual memory manager. So therefore the failure to me is just a general resource allocation failure. Would you dump core of your program failed to allocate a socket ? Or mutex?

gumby 5 years ago

An important paper on the trade offs: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p070...

bestouff 5 years ago

Rust shines when doing error handling. No way to ignore errors, but properly handling them often adds just a single question mark to your code. Everything stays readable and lightweight.

Of course the error handling story is still perfectible but so far it's already one of the best I know.

throwaway894345 5 years ago
The trouble I’ve had as a beginner is crafting error types for those “Union of multiple existing error types”. E.g., myfunc() can return an IO error or a Parse error. The boilerplate for creating a new error type (absent macros) is significant, and while I’m aware that there are macro crates available which automate this, it’s not clear to me which of these if any are “most standard” or otherwise how to distinguish between them.
- Celeo 5 years ago
  
  There are many ways to do it, like you said. Over time, the most popular options have shifted as new support from the standard library arrived. How you handle the errors can boil down to whether or not you really care about what kind of error it is, or just if an error occurred.
  Two popular crates for handing these situations are thiserror [1] and anyhow [2], for handling errors by type and handling all errors, respectively.
  There are additional ways, like just returning a Box wrapper around the stdlib error type [3], or just unwrapping everything. It depends on what your program needs.
  [1] https://crates.io/crates/thiserror
  [2] https://crates.io/crates/anyhow
  [3] https://play.rust-lang.org/?version=stable&mode=debug&editio...
- miccah 5 years ago
  
  As a beginner to Rust, this blog post has been excellent, and it has really helped me understand the idiomatic way to handle errors.
  https://blog.burntsushi.net/rust-error-handling/
Thaxll 5 years ago
You can "ignore" error is Rust using _ like in Go.
- Groxx 5 years ago
  
  Not really. In Go you can `val, _ := func()` and use the value even if there is an error. AFAIK there is no equivalent in Rust (for Option) outside of unsafe shenaniganry. You can choose to panic / return err / etc, but you can't choose to use the value regardless of the presence of an error.
  
  5 replies →
- ragnese 5 years ago
  
  That's still very explicit. If you don't bind the returned Result to something (`let _ = ...`), the compiler bitches at you.
- masklinn 5 years ago
  
  Go lets you ignore errors by just not binding them at all.

hifly 5 years ago

No discussion is complete without mention of Erlang’s view on this

https://erlang.org/download/armstrong_thesis_2003.pdf

haihaibye 5 years ago

Erlang's error handling is mentioned in the article. Maybe read it before posting?
neogodless 5 years ago

If it helps, the second section of the article is called
> The Erlang Approach - Let it Crash