Comment by fwsgonzo
5 years ago
Have you considered making system calls inline to avoid the call overhead? Or have you already thought about it?
I've been banging my head against the wall trying to make a micro KVM guest, and the glibc startup code uses inline SYSCALL everywhere, I assume to make things faster. It does seem to call CPUID a lot too, but I guess that's not as expensive as having to make full-blown system calls.
Cosmopolitan defines linkable symbols for all the __NR_syscall constants, so you can absolutely use inlining if you like to live dangerously. For example: https://github.com/jart/cosmopolitan/blob/91f4167a45f811fde6... There really isn't any performance advantage to doing this, though, because the SYSCALL instruction itself costs something like 2000+ cycles, due to all the copying that needs to happen on the kernel side along with things like Spectre mitigations, which have made it slower. So the 8-cycle cost of having a normal read() function wrapper is the wrong thing to be focusing on. Cosmopolitan aims to address the performance cost of SYSCALL by making it possible to run your binary on bare metal, where your program becomes a kernel and therefore needn't pay any context switching costs at all.
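To make the comparison concrete, here is a minimal sketch of an inlined system call next to the ordinary wrapper, assuming x86-64 Linux and GCC/Clang extended asm. The names `inline_write` and `compare_paths` are made up for illustration; this is not Cosmopolitan's or glibc's actual code, and on other targets it simply falls back to the libc `syscall()` function.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h> /* SYS_write */
#include <unistd.h>      /* write(), syscall() */

/* Hypothetical helper: write(2) issued via an inline SYSCALL instruction.
 * On the asm path it returns the raw kernel result: a byte count on
 * success, or a negative errno value (errno itself is NOT set). */
static inline long inline_write(int fd, const void *buf, size_t len) {
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)                 /* rax: return value    */
                     : "0"((long)SYS_write),     /* rax: syscall number  */
                       "D"((long)fd),            /* rdi: first argument  */
                       "S"(buf),                 /* rsi: second argument */
                       "d"(len)                  /* rdx: third argument  */
                     : "rcx", "r11", "memory");  /* clobbered by SYSCALL */
    return ret;
#else
    /* Assumption: elsewhere, just go through the libc wrapper. */
    return syscall(SYS_write, fd, buf, len);
#endif
}

/* Usage sketch: the inlined path and the normal wrapper should both
 * report the same number of bytes written to stdout. */
void compare_paths(void) {
    static const char msg[] = "hello\n";
    size_t n = sizeof msg - 1;
    long a = inline_write(1, msg, n);
    ssize_t b = write(1, msg, n);
    assert(a == (long)n);
    assert(b == (ssize_t)n);
}
```

Both paths end up executing the same SYSCALL instruction, so the only thing inlining saves is the call/ret and errno bookkeeping around it, which is the small cost described above.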
And it's even more overhead when exiting a VM, then?