
Comment by johncolanduoni

10 days ago

What do the benchmarks look like? My main concern with this approach would be that the performance envelope would eliminate it for the use-cases where C/C++ are still popular. If throughput/latency/footprint are too similar to using Go or what have you, there end up being far fewer situations in which you would reach for it.

Some programs run as fast as they normally would. That's admittedly not super common, but it happens.

Some programs have a ~4x slowdown. That's also not super common, but it happens.

Most programs are somewhere in the middle.

> for the use-cases where C/C++ are still popular

This is a myth. 99% of the C/C++ code you are using right now is not perf sensitive. It's written in C or C++ because:

- That's what it was originally written in and nobody bothered to write a better version in any other language.

- The code depends on a C/C++ library and there doesn't exist a high quality binding for that library in any other language, which forces the dev to write code in C/C++.

- C/C++ provides the best level of abstraction (memory and syscalls) for the use case.

Great examples are things like shells and text editors, where the syscalls you want to use are exposed at the highest level of fidelity in libc; if you wrote your code in any other language you'd be constrained by that language's library's limited (and perpetually outdated) view of those syscalls (see the sketch below).
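
To make that concrete, here is a minimal sketch (plain POSIX C, not taken from any particular project) of the kind of libc-level work an interactive shell or editor does before reading keystrokes. Higher-level languages typically expose termios only through a wrapper that hides, or lags behind, flags like these:

    #include <stdio.h>
    #include <termios.h>
    #include <unistd.h>

    /* Put the controlling terminal into raw mode, the way a shell's line
       editor does before reading keystrokes one at a time. */
    static int enable_raw_mode(struct termios *saved) {
        struct termios raw;
        if (tcgetattr(STDIN_FILENO, saved) == -1)
            return -1;
        raw = *saved;
        raw.c_lflag &= ~(ECHO | ICANON | ISIG);  /* no echo, no line buffering, no job-control signals */
        raw.c_iflag &= ~(IXON | ICRNL);          /* no ^S/^Q flow control, no CR -> NL translation */
        raw.c_cc[VMIN] = 1;                      /* read() returns after a single byte */
        raw.c_cc[VTIME] = 0;
        return tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw);
    }

    int main(void) {
        struct termios saved;
        if (enable_raw_mode(&saved) == -1) {
            perror("raw mode");
            return 1;
        }
        char c;
        while (read(STDIN_FILENO, &c, 1) == 1 && c != 'q')
            printf("key: %d\r\n", c);
        tcsetattr(STDIN_FILENO, TCSAFLUSH, &saved);  /* restore the terminal on the way out */
        return 0;
    }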

  • While there are certainly other reasons C/C++ get used in new projects, I think 99% not being performance or footprint sensitive is way overstating it. There are tons of embedded use cases where a GC is not going to fly just from a code-size perspective, let alone latency. That's mostly where I've often seen C (not C++) for new programs. Also, if Chrome gets 2x slower I'll finally switch back to Firefox. That's tens of millions of lines of performance-sensitive C++ right there.

    That actually brings up another question: how would trying to run a JIT like V8 inside Fil-C go? I assume there would have to be some bypass/exit before jumping to generated code - would there need to be other adjustments?

    • > While there are certainly other reasons C/C++ get used in new projects, I think 99% not being performance or footprint sensitive is way overstating it.

      Here’s my source: I’m porting Linux From Scratch to Fil-C.

      There is load-bearing stuff in there that I’d never think of off the top of my head, and I can assure you it works just as well even with the Fil-C tax. Like, I can’t tell the difference and don’t care that it is technically using more CPU and memory.

      So then you’ve got to wonder, why aren’t those things written in JavaScript, or Python, or Java, or Haskell? And if you look inside you just see really complex syscall usage. Not for perf but for correctness. It’s code that would be zero fun to try to write in anything other than C or C++.

      3 replies →

    • Books like Zen of Assembly Programming exist exactly because, once upon a time, putting performance-sensitive and C or C++ in the same sentence did not make any sense.

      It is decades of back-end optimisation work, some of it exploiting UB-based optimizations, that has made that urban myth possible.

      As the .NET team discovered, and points out with each release since .NET 5 in blog posts lengthy enough to kill most browser buffers, if a team puts as much work into its JIT and AOT compilers as the Visual C++ team does, then performance looks quite different from what everyone else expects it naturally to be.

      2 replies →

    • Chrome is a bad example. It uses a tracing GC in its most performance-sensitive parts explicitly to reduce the number of memory-safety bugs (it's called Oilpan). And much of the rest is written in C++ simply because that's the language Chrome standardized on; they are comfortable relying on kernel sandboxes and IPC rather than switching to a more secure language.

      4 replies →

    • I feel like code size, toolchain availability, and the universality of the C ABI are other good reasons why code is written in C besides runtime performance. I’d be curious how much overhead Fil-C adds from a code-size perspective, though!

      1 reply →

    • Chrome is not a good counterexample a priori. It is a project with hundreds of engineers assigned to it, some of them world-class security engineers, so they can potentially accept the burden of hardening their code and handling security issues with a regular toolchain. They may have even evaluated such solutions already.

      I think an important issue is that for performance-sensitive C++ stuff and related domains, it's somewhat all or nothing with a lot of these tools. Like, a CAD program is ideally highly performant, but I also don't want it to own my machine if I load a malicious file. I think that's the hardest thing, and there isn't any easy lift-and-shift solution for it.

      I think some C++ projects probably could actually accept a 2x slowdown, honestly. Like I'm not sure if LibrePCB taking 2x as long in cycles would really matter. Maybe it would.

    • Most C/C++ code, for old or new programs, runs on a desktop or server OS where you have lots of perf breathing room. That’s my experience. And that’s frankly your experience too, if you use Linux, Windows, or Apple’s OSes.

      > how would trying to run a JIT like V8 inside Fil-C go?

      You’d get a Fil-C panic. Fil-C wouldn’t allow you to PROT_EXEC lol
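
      For concreteness, here is a minimal sketch (plain POSIX C, hypothetical, not taken from V8) of the executable mapping a JIT needs; per the above, Fil-C would panic rather than hand back writable+executable memory:

          #include <stdio.h>
          #include <string.h>
          #include <sys/mman.h>

          int main(void) {
              /* A JIT asks the kernel for memory it can both write generated code
                 into and execute; per the comment above, Fil-C panics here rather
                 than allowing PROT_EXEC. */
              void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
              if (buf == MAP_FAILED) {
                  perror("mmap");
                  return 1;
              }
              memcpy(buf, "\xc3", 1);     /* x86-64 `ret`; a real JIT emits whole functions */
              ((void (*)(void))buf)();    /* jump into the generated code */
              return 0;
          }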

      11 replies →

  • > Some programs have a ~4x slowdown

    How does it compare to something like RLBox?

    > This is a myth. 99% of the C/C++ code you are using right now is not perf sensitive.

    I don't think that's true, or at best it's a very contorted definition of "perf sensitive". Most code is performance sensitive in my opinion - even shitty code written in Python or Ruby. I would like it to be faster. Take Asciidoctor for example. Is that "performance sensitive"? Hell yes!

    • > How does it compare to something like RLBox?

      I don’t know and it doesn’t matter because RLBox doesn’t make your C code memory safe. It only containerizes it.

      Like, if you put a database in RLBox then a hacker could still use a memory safety bug to corrupt or exfiltrate sensitive data.

      If you put a browser engine in RLBox then a hacker could still pwn your whole machine:

      - If your engine has no sandbox other than RLBox then they’d probably own your kernel by corrupting a buffer in memory that is being passed to a GPU driver (or something along those lines). RLBox will allow that corruption because the buffer is indeed in the program’s memory.

      - If the engine has some other sandbox on top of RLBox then the bad guys will corrupt a buffer used for sending messages to brokers, so they can then pop those brokers. Just as they would without RLBox.

      Fil-C prevents all of that because it uses a pointer capability model and enforces it rigorously (see the sketch below).

      So, RLBox could be infinitely faster than Fil-C and I still wouldn’t care.
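
      To illustrate the difference, here is a minimal hypothetical sketch (not Fil-C's actual diagnostics): the heap overflow below is a silent corruption in ordinary C, and RLBox permits it because the bytes land inside the sandboxed program's own memory, whereas a pointer capability model traps the out-of-bounds store itself:

          #include <stdlib.h>
          #include <string.h>

          int main(void) {
              char *buf = malloc(16);
              if (!buf)
                  return 1;
              /* 64 bytes into a 16-byte allocation: plain C scribbles over whatever
                 is adjacent, and an in-process sandbox allows it because the target
                 is still the program's own memory; a capability-carrying pointer
                 traps this store at the allocation's bound. */
              memset(buf, 'A', 64);
              free(buf);
              return 0;
          }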

      5 replies →

  • All code is perf-sensitive.

    Also, literally every language claims "only a x2 slowdown compared to C".

    We still end up coding in C++, because see the first point.

    • I’m not claiming only 2x slowdown. It’s 4x for some of the programs I’ve measured. 4x > 2x. I’m not here to exaggerate the perf of Fil-C. I actually think that figuring out the true perf cost is super interesting!

      > All code is perf-sensitive.

      That can’t possibly be true. Meta runs on PHP/Hack, which are ridiculously slow. Code running in your browser is JS, which is like 40x slower than Yolo-C++, and yet it’s fine. So many other examples of folks running code that is just hella slow, way slower than “4x slower than C”.

      6 replies →

    • > All code is perf-sensitive.

      I'm doing some for loops in bash right now that could use 1000x more CPU cycles without me noticing.

      Many programs use negligible cycles over their entire runtime. And even for programs that spend a lot of CPU and need tons of optimizations in certain spots, most of their code barely ever runs.

      > Also, literally every language claims "only a x2 slowdown compared to C".

      I've never seen anyone claim that a language like python (using the normal implementation) is generally within the same speed band as C.

      The benchmark game is an extremely rough measure but you can pretty much split languages into two groups: 1x-5x slowdown versus C, and 50x-200x slowdown versus C. Plenty of popular languages are in each group.

      2 replies →

  • There are loads of good shells and text editors written in other languages.

    I’m the author of a shell written in Go and it’s more capable than Zsh.

  • Very good answer, and I agree. There seems to be a psychological component wrapped up in the mythology versus the reality of what's necessary.

  • > Great examples are things like shells and text editors

    I regularly get annoyed that my shell and text editor are slow :-)

    I do agree on principle, though.

  • Super cool. What is your goal wrt performance? Is low 1.x-ish on average attainable, in your opinion?

    • I think that worst case 2x, average case 1.5x is attainable.

      - Code that uses SIMD or that mostly deals with primitive data in large arrays will get close to 1x

      - Code that walks trees and graphs, like interpreters and compilers do, might end up north of 2x unless I am very successful at implementing all of the optimizations I am envisioning.

      - Code that is IO bound or interactive is already close to 1x