
Comment by phoboslab

6 years ago

It's surprising to me that you apparently have to fight for memory usage for these cases when using Go.

A while ago I ran a (quite naively written) nodejs application that maxed out at ~700k WebSocket connections per server - using only 4GB of RAM. Here CPU became the bottleneck.

Go's concurrency design trades off memory usage for productivity: instead of red/blue functions, where you have to explicitly mark interrupt/yield points with the async keyword, you can just write sequential code and the runtime handles the rest. The downside of this approach is that the stack often has to be copied during switching, versus the stackless approach preferred by Node.js, Rust, C#, etc.

See the excellent Fibers Under a Magnifying Glass paper by Microsoft Research: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p136...

  • This is not how Go "works" overall; you're talking about the size of the goroutine stack, which starts at 2 KB by default. In a scenario with a lot of connections, yes, that adds up if you use one goroutine per connection, but outside of that, Go uses less memory than Node / C# / Java / Python etc.

    So I wouldn't say "Go trades off memory usage for productivity", since Go is widely used precisely for its low memory footprint.

    Same reason Go makes sense in services like Kubernetes, where each pod is in the double-digit-MB range; that wouldn't be possible with the languages mentioned above.

    Edit: in the context of your edit it makes more sense :)

    • That said, as an operator of a number of fairly large Kubernetes clusters (100-500 nodes), I might prefer Java's memory tunability to Go's garbage collector simply not caring about the max memory a system can offer.

      At scale, we need to heavily oversize the kubernetes masters to deal with the spikiness of go memory consumption. It's possible for a kube apiserver to average 2GB of memory use and then jump to 18GB after suddenly handling more requests than usual and get OOM-killed. I'd much rather it simply slow down a bit than behave so erratically.

      This is a common thing across all Go programs that handle data: since Go's garbage collector doesn't try to keep memory use within some upper bound, if you allocate and throw away a bunch of objects you'll quickly run out of memory on any low-memory system or container even though there's a ton of memory ready to be reclaimed.

      Even a major library like the official AWS SDK's S3 client had this wrong as recently as 2017 (the fix was to use sync.Pool to avoid throwing away buffers): https://github.com/aws/aws-sdk-go/pull/1784

      Python should perform much better from a memory perspective in these situations: its weakness would be the deserialized size that objects blow up to in memory, but its use would be bounded. Java would also handle this pretty effortlessly, though with a high enough rate of garbage production might hit a few short GC pauses that last far less time than a process restart.

      Really wish Go were better about this. A simple Go cron job that uploads a backup to S3 at under 160 MB/sec can risk OOM-killing your server if you don't set cgroup memory caps on all your Go processes.
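      The sync.Pool fix referenced above can be sketched like this (a hypothetical upload helper, not the SDK's actual code): buffers are returned to a pool and reused across calls instead of becoming garbage the GC must race to reclaim.

      ```go
      package main

      import (
      	"bytes"
      	"fmt"
      	"sync"
      )

      // bufPool hands out reusable buffers; without it, every call
      // would allocate a fresh buffer the GC later has to collect.
      var bufPool = sync.Pool{
      	New: func() interface{} { return new(bytes.Buffer) },
      }

      // uploadPart is a hypothetical stand-in for staging one upload part.
      func uploadPart(part []byte) int {
      	buf := bufPool.Get().(*bytes.Buffer)
      	defer func() {
      		buf.Reset()      // drop contents but keep capacity
      		bufPool.Put(buf) // return the buffer for reuse
      	}()
      	buf.Write(part)
      	return buf.Len()
      }

      func main() {
      	fmt.Println(uploadPart(make([]byte, 64)))
      }
      ```

      Pooling doesn't lower the GC's ceiling, but it cuts the allocation rate, which is what produced the spiky heap in the first place.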

    • > where each pod is in the double-digit-MB range; that wouldn't be possible with the languages mentioned above.

      Not sure about NodeJS and Python, but it's certainly doable with Java and C#. It's just that people don't take the time to configure the JVM/CLR correctly.

      There's nothing magical about golang that you can't do in C# (and soon enough, in Java with the addition of value types). Arguably, C# and Java's value type implementations are superior anyway.


  • Productivity in this case being in the eye of the beholder? I'd argue that people experienced in how Node works wouldn't have to think too much, since async is the default. I agree with your general sentiment, though.

You don't have to fight for memory; the usage here comes from the overhead of goroutines / default HTTP connections in a scenario with many connections. By default, Go uses far less memory than Node.

And in Node the only way to get semi-decent performance is to use ultra-optimized external C/C++ libraries.

CPU was probably bottlenecked by the GC. You could likely tune the GC to get better results.