The Downsides of Go's Goroutines

2 years ago (blog.djha.skin)

31 comments

djha-skin

The coloring of Go functions is actually seen in the FFI. Go functions have a special calling convention, so FFI is annoying.

This is similar to Cilk back in 1995 with their continuation-stealing work-stealing scheduler. The C++ committee made an in-depth review of fibers (stackful coroutines that install their own stack).

Regarding Scheme, the issue is that it has multishot undelimited continuations, which led to several debates. Delimited continuations are enough to express everything but non-determinism (which causes the lifetime issues, i.e. you can enter a room once and exit twice); in practice you allow continuations to be moved (onto different executors such as a state machine, thread pool or actors) but not copied.

Cilk: http://supertech.csail.mit.edu/papers/PPoPP95.pdf

Fibers under magnifying glass: https://open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1364r0...

on Scheme undelimited continuations: https://okmij.org/ftp/continuations/against-callcc.html#dyna...

In general, I've collected experiments and criticisms from many languages on continuations, coroutines, fibers, cancellation here: https://github.com/mratsim/weave-io-research/blob/master/res...

kevingadd 2 years ago

Externally canceling a task at a location other than a known stopping point is used as an example here, but in most environments doing this is a known-bad design decision, since the terminated thread-or-task might have been holding a mutex, and now that mutex is stuck closed forever. .NET has been closing the door on this primitive for years (https://learn.microsoft.com/en-us/dotnet/core/compatibility/...)

mrkeen 2 years ago

Haskell does this.
Threads can cancel other threads.
ResourceT and bracket are two ways for a thread to register clean-up code in the event that they are cancelled.
verdagon 2 years ago
It makes me wonder if there are any language constructs that can make this a reasonable feature. One idea I've been tossing around is having the ability to roll back any changes to mutex-guarded data if an exception drops a mutex guard. It should be possible with the right language constructs and bookkeeping.
Perhaps there are other mechanisms out there too.
I feel like the ability to destroy another thread isn't inherently bad, just... bad with today's languages. Just a feeling though.
- cratermoon 2 years ago
  
  The Go folks will repeat the aphorism, "Do not communicate by sharing memory; instead, share memory by communicating."[1]. The author directly violates the intention of the designers of Go by talking about shared file handles and other data structures, i.e. memory.
  The word "channel" doesn't appear a single time in the article, even though goroutines without channels to communicate with each other should never be sharing data. Channels are the synchronization primitive in Go.
  1. https://go.dev/blog/codelab-share
  
- EdiX 2 years ago
  
  It seems to me that what you are describing is usually called software transactional memory. It has its own set of problems (bad performance with high granularity and livelocks, although you can probably avoid livelocks if you only care about using it for abnormal terminations) but it doesn't fully resolve the problem here. Yes, not leaving memory in an invalid state goes a long way but any form of IPC is potentially problematic: consider what happens if the thread is writing to a socket borrowed from a pool, or to a disk file.
  Not impossible to deal with but everything you do needs to be designed with cancellation-at-any-point in mind, it doesn't seem worth it to me.
- jayd16 2 years ago
  
  SQL comes to mind.
djha-skin 2 years ago
That mutex gets cleaned up because thread interruption is implemented as an exception, so the `finally` block would be able to take care of the mutex.
- kevingadd 2 years ago
  
  this is only true if the mutex is guarded by a finally block. your thread could be calling anything, including user-space code written in C that holds a mutex.
jayd16 2 years ago
It's not so much that exceptions are bad though. Cancellation tokens throwing is not obsolete. Mutex and such can be cleaned up in finally blocks in .NET. It's just that being able to place the exception in a predictable place has benefits.
- bb88 2 years ago
  
  Python/C++ has finally: blocks as well.
  They're not nearly as good as the `with` statement (or context manager) in Python.
  

t8sr 2 years ago

To me, much of this reads like the author wants to do something bad, the language doesn’t let him, and he views that as a pain point. For example, killing a thread at an arbitrary line is a terrible idea, and the fact that Go doesn’t have that feature is probably for the best.

Some other points show a lack of understanding. In Java, you also can’t catch exceptions from another thread. Go panics are mostly equivalent in this sense - you get a defer (finally) block, which is what supposedly matters to you for cleanup, and you must have a recover (catch) within your thread.

Overall I’m not a fan of how this is presented as some kind of deep thinking, when in fact it’s so many surface observations and misunderstandings about how things work.

djha-skin 2 years ago
In Java you can set an uncaught exception handler, so you can make sure exceptions are caught in child threads from the parent.
In Go, defer blocks are only run for the panicking goroutine. Other goroutines do not run defer, the program simply crashes.
- t8sr 2 years ago
  
  Sure, if the program crashes, all bets are off. That’s also true of any other similar exception mechanism.
  The fact that the program crashes on unhandled panic, rather than letting you install a default global handler, is a design choice that has nothing to do with how goroutines are implemented.

EdiX 2 years ago

What a strange article. Goroutines definitely do have their downsides, they aren't hard to find either, but nothing he talks about here is actually a downside of goroutines. Also for some reason almost every reference he has is from 2016.

> Further, goroutines' lack of their stack ancestry means they can't natively give out the nice stacktraces found in other languages, though there are a lot of workarounds, to be sure.

I wish he was more specific here: what's "nice" about stacktraces in other languages that Go lacks? I really don't get it; it seems to me that they are the same, but maybe I'm missing something. There's even a mechanism to record goroutine ancestry [1], but it's from 2018, past this article's knowledge cutoff date maybe?

> Because of this stack disconnection of parents and children, it really becomes impossible to have exceptions

But panics use the exact same mechanism as exceptions? They are not used in the same way as exceptions but it isn't because of any technical limitation of goroutines, it's mostly about what's idiomatic in the language.

As for his whole example about a panic causing a resource leak: as far as I can tell, the difference is that in Java an uncaught exception will only terminate its thread, while in Go the whole program crashes. You can definitely argue for either behavior being better (IMO it really depends on the application), but it doesn't have anything to do with how goroutines are implemented.

> This highlights a key difference between cooperative scheduling and OS-level scheduling: OS-level threads can be stopped at any time, while cooperatively scheduled coroutines cannot

Goroutines have actually been scheduled preemptively since 2020 [2]; again, this is probably beyond the knowledge cutoff date of the article. This isn't surfaced to the user, however: you still can't stop a goroutine at an arbitrary point. But that has nothing to do with the way goroutines are implemented; it is a design decision not to allow users to do that.

[1] https://github.com/golang/go/issues/22289 [2] https://go.dev/doc/go1.14

nunez 2 years ago

> If the programmer clones the callstack, hands the callstack off to a coroutine, and then both callstacks have e.g. a reference to an open file handle in one of their stack frames, it means the programmer cannot safely unwind the stack.

Wouldn't you access file handles like this through a synchronization primitive like a mutex to avoid this exact issue? Or close the file handle once all goroutines have returned?

> Because of this stack disconnection of parents and children, it really becomes impossible to have exceptions. Not just won't, as Rob Pike would have you believe. Can't.

That's one thing I actually really like about Golang. Exceptions have always felt like this heavy thing that you, often times, use, like, 10% of. The error type is so much lighter.

tommiegannert 2 years ago

> Further, goroutines' lack of their stack ancestry means they can't natively give out the nice stacktraces found in other languages,

There's GODEBUG=tracebackancestors=N since 2018.

https://pkg.go.dev/runtime

> 3. Goroutine b panics, crashing the program.
>
> 4. The database connection is then left open as a zombie TCP connection.
>
> [---] Because of this stack disconnection of parents and children, it really becomes impossible to have exceptions.

I don't follow this reasoning. If goroutine b panics, the process dies and the TCP connection is closed by the OS. If you handle the panic in b, there's no zombie connection, since both a and b are still alive to handle the connection.

Exceptions are just return statements with pattern matching on stack unwind. Nothing special. I can't figure out what the author sees in them that I don't. The Go defer statements are perfectly capable of handling lifetimes, and a common situation is that you create a goroutine just to manage a resource's lifetime, e.g. when you have multiple producers and you need to close the channel after all of them are done.
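
The multiple-producers case mentioned above can be sketched roughly as follows. This is only an illustration: `fanIn` and `sumAll` are made-up names, not anything from the article or the standard library, and the pattern assumes producers never send after returning.

```go
package main

import (
	"fmt"
	"sync"
)

// fanIn (hypothetical helper) starts one goroutine per producer and returns
// a channel carrying everything they send. A separate goroutine exists only
// to manage the channel's lifetime: it waits for all producers, then closes.
func fanIn(producers ...func(out chan<- int)) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, p := range producers {
		wg.Add(1)
		go func(produce func(chan<- int)) {
			defer wg.Done() // runs when produce returns, early or not
			produce(out)
		}(p)
	}
	go func() {
		wg.Wait()
		close(out) // closed exactly once, after the last producer is done
	}()
	return out
}

// sumAll drains the channel and returns the total of all values received.
func sumAll(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	evens := func(out chan<- int) { out <- 2; out <- 4 }
	odds := func(out chan<- int) { out <- 1; out <- 3 }
	fmt.Println(sumAll(fanIn(evens, odds)))
}
```

The consumer never has to know how many producers there are; it just ranges over the channel until the lifetime-managing goroutine closes it.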

> In Go, only calls made by the offending goroutine can recover from a panic, while in Java, the parent caller/creator of the new thread can itself set a recovery mechanism.

And in Go, the caller can insert a stub function that handles recovery before calling the main goroutine function. This seems equivalent to me.
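
A minimal sketch of such a stub, assuming the parent wants the panic delivered as an error on a channel. `goSafe` is a hypothetical name chosen for this example, not a standard library function.

```go
package main

import "fmt"

// goSafe (hypothetical helper) starts fn in a new goroutine behind a stub
// whose deferred recover turns a panic into an error on the returned
// channel. The channel receives nil if fn returns normally.
func goSafe(fn func()) <-chan error {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("goroutine panicked: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return done
}

func main() {
	err := <-goSafe(func() { panic("database write failed") })
	fmt.Println("parent observed:", err)
}
```

The recover still runs inside the panicking goroutine, as Go requires; the stub just forwards the outcome, so the creator of the goroutine decides what to do with it.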

> He says the mainframers at his company hate UnixODBC for this reason, it tends to leave zombie connections behind. (Go uses UnixODBC

I don't know how his dad's mainframe handles terminated processes, but this needs clarification on why they're left: the OS really should take care of closing those connections.

> Consider the problem of a task scheduler cancel button. The program must schedule a task to be run on an agent. At any given time, however, the task must be able to be cancelled.

Fair enough, that requires extra adaptation to Go (what the author calls workarounds.) It will never be as nice as simply killing a preemptive thread.

> In Java, it is child's play to interrupt a running thread.

And now the application programmer has to handle far more cases of partially complete tasks instead. The number of strawman arguments in this post is astounding.

bb88 2 years ago

> (If they did, the unwinding would close the file handle and the other coroutine might then return to that stack frame, only to find the file handle already closed.)

The programming language is not supposed to keep state of OS level resources. You are. The gray beard in me would just say "Duh".

To my knowledge, Python/C/C++/Java/Go don't do this automatically. You're responsible for that.

jayd16 2 years ago
Why not?
- bb88 2 years ago
  
  I presume you're the one who downvoted my comment.
  C++ leaves OS level resource handling to the user. It's not C++'s job to figure out if the user wanted to close this unused resource. It's the programmer's job to not leak resources. And that includes closing each file descriptor once, or deleting allocated memory exactly once.
  Go came with a GC model. Yay! But file descriptors aren't pointers to memory. They're representations of physical objects on disk.
  Again, it's the programmer's job to tell the compiler when they're done with the file, and not rely on the compiler to figure out when they're actually done with it.
  Could Go reference-count file descriptors/handles? Yes! But they also chose not to.
  

fithisux 2 years ago

Very good discussion. Still, exceptions are dangerous.

https://pianomanfrazier.com/post/exceptions-considered-harmf...

John_R_S 2 years ago

It seems that some of the information presented is wrong. For example:

Go's goroutines are not coroutines. Coroutines are another form of concurrency, which:
. shares a stack and scope
. uses "split" and "join"
. generally is limited to run on the same thread, not in parallel
. switches cooperatively, not preemptively
NONE of these are true for goroutines.
Also:
. Goroutines do not have a parent-child relationship. This would limit their usefulness. If you need it, there are 3rd party libraries which encapsulate and add that functionality.
. There is currently a proposal to add coroutines to Go. They are lighter than goroutines and are handy in certain cases.
. Note: Initially goroutines could be preempted only when making a function call. As of Go v1.14, about 4 years ago, full preemption was added.

Synchronization is possible many ways:
. "channels"
. "select" statement
. mutexes, semaphores, and all the common atomic operators. Check the "sync" lib.
. Go can certainly use the above techniques based on "real time". Ex: Ticker in the "time" lib.

Go doesn't have "exceptions". It has always had "panics", which are almost the same.
. Go philosophy is to handle errors where they occur.
. Panics are used by the compiler for non-recoverable runtime errors.
. Users can generate panics too, if they choose to.
. Panics can be trapped by any function in the call path by using "recover".
. Recover, besides executing whatever it cares to, can allow the panic to continue upward or cancel it.
. So you can kludge a general exception handler, but it's generally best not to.
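
A rough sketch of the "continue upward or cancel it" point, with invented names (`lookup`, `safeLookup`, `errNotFound`) that exist only for this illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel used only for this illustration.
var errNotFound = errors.New("not found")

// lookup panics in two different ways depending on the key.
func lookup(key string) string {
	switch key {
	case "":
		panic("empty key") // treated here as a programming error
	case "missing":
		panic(errNotFound) // treated here as an expected failure
	}
	return "value-for-" + key
}

// safeLookup sits in the call path and traps panics with recover. It cancels
// the panic for errNotFound by converting it into an error, and lets any
// other panic continue upward by panicking again with the same value.
func safeLookup(key string) (val string, err error) {
	defer func() {
		r := recover()
		if r == nil {
			return // no panic in flight
		}
		if e, ok := r.(error); ok && errors.Is(e, errNotFound) {
			err = e // cancel the panic and report it as an ordinary error
			return
		}
		panic(r) // not ours to handle: let it keep unwinding
	}()
	return lookup(key), nil
}

func main() {
	fmt.Println(safeLookup("a"))
	fmt.Println(safeLookup("missing"))
}
```

Calling `safeLookup("")` would still crash the program unless some caller further up also recovers, which is the "continue upward" half of the behavior.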

Go doesn't use an "event loop" to schedule goroutines.
. It does schedule goroutines, but based on multiple criteria.
. Scheduling is done independently for each process (fiber).

Having to worry about multiple goroutines trying to close the same file is generally a red flag for poorly structured code. Regardless, there are PLENTY of ways to solve it. Here's two:
. Use a synchronized variable to save the open/closed state.
. Use a channel for anyone to submit a close request. The receiver, which runs in its own goroutine, can track the open/closed state and act accordingly.
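
The first option could look something like this. It is a sketch under assumptions: `sharedCloser` and `countingCloser` are hypothetical types, and the counting closer merely stands in for a real file.

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// sharedCloser (hypothetical) records the open/closed state in a
// mutex-guarded variable so the underlying Close runs at most once.
type sharedCloser struct {
	mu     sync.Mutex
	closed bool
	inner  io.Closer
}

// Close closes the underlying resource on the first call and reports whether
// this call was the one that did it. Later calls are no-ops.
func (s *sharedCloser) Close() (closedNow bool, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return false, nil
	}
	s.closed = true
	return true, s.inner.Close()
}

// countingCloser stands in for a file; it counts how often Close is called.
type countingCloser struct{ calls int }

func (c *countingCloser) Close() error { c.calls++; return nil }

// closeFromMany has n goroutines race to close the same resource and returns
// how many times the underlying Close actually ran.
func closeFromMany(n int) int {
	res := &countingCloser{}
	sc := &sharedCloser{inner: res}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sc.Close()
		}()
	}
	wg.Wait()
	return res.calls
}

func main() {
	fmt.Println("underlying Close calls:", closeFromMany(8))
}
```

However many goroutines race to close, the underlying resource sees a single Close, because the state check and the close happen under the same lock.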

UnixODBC: Perhaps I missed it, but I don't see where it is used in Go itself or in the standard library. I see many SQL interfaces, none of which mention UnixODBC.

Cancelling a "task": Indeed, Go itself gives no way for one goroutine to cancel another. But setting that up is very easy. You would use a channel or a shared variable to signal the task you wish to cancel. It would have to look for the signal, of course. Also, there are 3rd party libraries which can do this for you. And yes, you could put this feature in a panic (exception) handler, too.
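
As a minimal sketch of the channel variant, assuming the task can be split into steps with a check between them (`runTask` is a hypothetical function, and the sleep stands in for real work):

```go
package main

import (
	"fmt"
	"time"
)

// runTask (hypothetical) is a long-running task. Before each unit of work it
// checks the cancel channel, its only stopping point, and returns how many
// steps it completed and whether it ran to the end.
func runTask(steps int, cancel <-chan struct{}) (completed int, finished bool) {
	for i := 0; i < steps; i++ {
		select {
		case <-cancel:
			return i, false // cancellation observed at a known stopping point
		default:
		}
		time.Sleep(time.Millisecond) // stand-in for one unit of real work
	}
	return steps, true
}

func main() {
	cancel := make(chan struct{})
	result := make(chan int)
	go func() {
		n, _ := runTask(1000, cancel)
		result <- n
	}()
	time.Sleep(10 * time.Millisecond)
	close(cancel) // closing the channel signals every listener at once
	fmt.Println("steps completed before cancel:", <-result)
}
```

The standard library's `context` package wraps the same idea (a channel that is closed on cancellation) behind `context.WithCancel` and `ctx.Done()`, so a hand-rolled channel is only needed for the simplest cases.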