Comment by bbkane
8 days ago
Does this work for Go binaries? My understanding is that Go programs do all the encryption "in the process" so the data is encrypted before eBPF can intercept it. I'd love to be wrong about that!
We have Go support, but it is not open sourced yet. Go is a bit more complicated but we were able to get it after some cave diving in the ELF formats. To give you a little insight on how this works, because Go is statically linked, we need to pull several different offsets of the functions we are going to hook into.
We do this by scanning every version of Go that is released to find offsets in the standard library that won't change. Then when we detect a new Go process, we use an ELF scanner to find some function offsets and hook into those with uprobes. Using both of these, we have all the information we need to see Go pre-encryption content as well as attribute it to connections and processes.
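As a rough illustration of the "ELF scanner" step, here is a minimal sketch in Go using the standard `debug/elf` package. It is not Qtap's code: the `segment` type and the `fileOffset` and `symbolOffset` helpers are hypothetical names, and it assumes the binary still has its symbol table (a stripped binary would need a different lookup). It converts a symbol's virtual address into the file offset that a uprobe is attached at.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// segment holds the parts of an ELF program header needed for the conversion.
// (Hypothetical helper type, not part of any library.)
type segment struct {
	vaddr, off, filesz uint64
}

// fileOffset maps a symbol's virtual address to its offset within the file,
// which is the value a uprobe is attached at.
func fileOffset(segs []segment, addr uint64) (uint64, bool) {
	for _, s := range segs {
		if addr >= s.vaddr && addr < s.vaddr+s.filesz {
			return addr - s.vaddr + s.off, true
		}
	}
	return 0, false
}

// symbolOffset looks up name in the binary's symbol table and returns its
// file offset. It fails on stripped binaries, which carry no symbol table.
func symbolOffset(path, name string) (uint64, error) {
	f, err := elf.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	// Only executable PT_LOAD segments can contain function code.
	var segs []segment
	for _, p := range f.Progs {
		if p.Type == elf.PT_LOAD && p.Flags&elf.PF_X != 0 {
			segs = append(segs, segment{p.Vaddr, p.Off, p.Filesz})
		}
	}

	syms, err := f.Symbols()
	if err != nil {
		return 0, err
	}
	for _, s := range syms {
		if s.Name != name {
			continue
		}
		if off, ok := fileOffset(segs, s.Value); ok {
			return off, nil
		}
		return 0, fmt.Errorf("symbol %q is outside every executable segment", name)
	}
	return 0, fmt.Errorf("symbol %q not found", name)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: symoff <go-binary> <symbol>")
		return
	}
	off, err := symbolOffset(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("%s is at file offset %#x\n", os.Args[2], off)
}
```

You could try it with something like `go run . ./myserver 'crypto/tls.(*Conn).Write'`. That symbol name is only an example; the thread does not say which functions Qtap actually hooks.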
Great approach. I love the choice of practicality over generalization.
Are these offsets consistent across compilation targets, so that they vary only by Go version? Or do you need to do this scan for every architecture?
The short answer is that we only have to calculate the offsets once per Go version; no expensive runtime scanning is required.
The long answer is that the offsets are the byte offsets, after alignment, of the fields in the Go structs that hold the pointers to the file descriptor and buffers. Fortunately, we only have to calculate these for versions where the TLS structs within Go actually change, so not even for every version. For instance, if a field is added, removed, or changes type, then the location in memory where those pointers are found changes. At runtime, once we know the architecture (amd64, arm64, etc.), we can derive the actual offset with a simple calculation. Within the eBPF probe, when the hooked function is called, it uses pointer arithmetic to extract the location of the file descriptor and buffer directly.
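To make the offset calculation concrete, here is a small sketch in Go (1.17+ for `unsafe.Add`). Everything in it is an assumption for illustration: `fakeConn` is a made-up struct, not the real `crypto/tls.Conn` layout, and `field`, `fieldsFor`, `layout`, and `readPointerAt` are hypothetical helpers. It derives field offsets from sizes and alignments for a given pointer width, then follows a pointer at that offset the way a probe would.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fakeConn is a hypothetical stand-in; it is NOT the layout of crypto/tls.Conn.
type fakeConn struct {
	fdPtr    *int32
	isClient bool
	bufPtr   *byte
}

// field describes one struct field by its size and alignment in bytes.
type field struct {
	name        string
	size, align uint64
}

// fieldsFor describes fakeConn for a target with the given pointer size
// (8 on amd64 and arm64, 4 on 32-bit targets).
func fieldsFor(ptrSize uint64) []field {
	return []field{
		{"fdPtr", ptrSize, ptrSize},
		{"isClient", 1, 1},
		{"bufPtr", ptrSize, ptrSize},
	}
}

// layout places each field at the next offset that satisfies its alignment.
func layout(fields []field) map[string]uint64 {
	offsets := make(map[string]uint64, len(fields))
	var next uint64
	for _, f := range fields {
		next = (next + f.align - 1) / f.align * f.align // round up to alignment
		offsets[f.name] = next
		next += f.size
	}
	return offsets
}

// readPointerAt loads the pointer stored off bytes past base, which mirrors
// the pointer arithmetic a probe performs on the struct it is handed.
func readPointerAt(base unsafe.Pointer, off uint64) unsafe.Pointer {
	return *(*unsafe.Pointer)(unsafe.Add(base, off))
}

func main() {
	fd := int32(7)
	payload := []byte("GET / HTTP/1.1")
	conn := &fakeConn{fdPtr: &fd, bufPtr: &payload[0]}

	// Compute the layout for the machine this program runs on.
	offs := layout(fieldsFor(uint64(unsafe.Sizeof(uintptr(0)))))
	fmt.Println("computed bufPtr offset:", offs["bufPtr"])
	fmt.Println("compiler bufPtr offset:", unsafe.Offsetof(conn.bufPtr))

	// Follow the pointers using only the base address and computed offsets.
	gotFD := *(*int32)(readPointerAt(unsafe.Pointer(conn), offs["fdPtr"]))
	gotByte := *(*byte)(readPointerAt(unsafe.Pointer(conn), offs["bufPtr"]))
	fmt.Printf("fd=%d first byte=%q\n", gotFD, gotByte)
}
```

Adding or removing a field before `bufPtr` shifts its offset, which is why a struct change in a new Go release would force a recalculation while an unchanged struct would not.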
I think you only need to use the eBPF approach for statically linked programs.
ISTR, at some point in the far past, using LD_PRELOAD with my own shims to capture TLS traffic before encryption/after decryption. I might have it lying around somewhere here.
Ok, that's exciting, and thanks for the insight!
Most programs do encryption without syscalls! eBPF can intercept userspace execution, which they do as mentioned in the post:
> The key idea is to hook into common TLS libraries (like OpenSSL) before encryption and after decryption
I saw that, but Go doesn't use dynamically linked libraries for encryption, so I don't think it helps in this particular case.
If I want to do something similar, do you know where the relevant parts of the eBPF docs are?
Qtap scans the binaries of processes, as well as known locations for OpenSSL, on startup. It then passes the offsets to eBPF, where it hooks into SSL_read and SSL_write to get the content before it is encrypted or after it is decrypted.
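If you want to experiment with the discovery step yourself, one simple way to find out which OpenSSL a process has loaded is to read its `/proc/<pid>/maps`. The sketch below is an assumption about how this could be done, not a description of Qtap's scanner, and `libsslFromMapsLine` is a hypothetical helper. It is Linux-only and ignores paths that contain spaces.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// libsslFromMapsLine returns the mapped file path if the line describes an
// executable mapping of a file whose name starts with "libssl.so".
// A maps line has the form: address perms offset dev inode pathname
func libsslFromMapsLine(line string) (string, bool) {
	fields := strings.Fields(line)
	if len(fields) < 6 {
		return "", false // anonymous mapping, no pathname
	}
	perms, path := fields[1], fields[5]
	if !strings.Contains(perms, "x") {
		return "", false // only the executable mapping holds the code to hook
	}
	if !strings.HasPrefix(filepath.Base(path), "libssl.so") {
		return "", false
	}
	return path, true
}

func main() {
	pid := "self"
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	f, err := os.Open(filepath.Join("/proc", pid, "maps"))
	if err != nil {
		fmt.Println("cannot read maps:", err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if path, ok := libsslFromMapsLine(sc.Text()); ok {
			fmt.Println("OpenSSL mapped from", path)
			return
		}
	}
	fmt.Println("no libssl mapping found")
}
```

The path it reports is the file you would then search for the SSL_read and SSL_write symbols before attaching uprobes.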
This is the eBPF side: https://github.com/qpoint-io/qtap/blob/main/bpf/tap/openssl....
The Go side which indicates what we are scanning for is here: https://github.com/qpoint-io/qtap/blob/main/pkg/ebpf/tls/ope...
For more docs on the topic:
- https://docs.ebpf.io/ is a must read
- https://eunomia.dev/en/tutorials/30-sslsniff/ has a tutorial on cracking OpenSSL open and getting the content as well. The tutorials they have are fantastic in general.