Notes by djb on using Fil-C

3 months ago (cr.yp.to)

262 comments

transpute

To summarize, he's sufficiently impressed with it that he's embarking on an attempt to rebuild an entire Debian system with it, and he's written some software (a GC shim library and build scripts) that are likely to be of interest to others who are attempting the same thing.

le-mark 3 months ago

> I had originally configured the server phoenix with only 12GB swap. I then had to restart ./build_all_fast_glibc.sh a few times because the Fil-C compilation ran out of memory. Switching to 36GB swap made everything work with no restarts; monitoring showed that almost 19GB swap (plus 12GB RAM) was used at one point. A larger server, 128 cores with 512GB RAM, took 8 minutes for Fil-C plus 6 minutes for musl, with no restarts needed.

Yikes, that's a lot of memory! Fil-C is apparently doing a lot of static analysis.

mbrock 3 months ago
I think that's the build of LLVM+Clang itself.
- collinfunk 3 months ago
  
  Yes, linking LLVM takes up a lot of memory. The documented guidance is to allow one link job per 15 GB of RAM [1].
  [1] https://llvm.org/docs/CMake.html#frequently-used-llvm-relate...
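  As a rough illustration of that guideline, the sketch below derives a link-job cap from the amount of RAM. The function name `max_link_jobs` and the clamp to at least one job are assumptions made for illustration; only the 15 GB figure and the `LLVM_PARALLEL_LINK_JOBS` CMake option come from the LLVM documentation.

```python
def max_link_jobs(ram_gb: float, gb_per_job: float = 15.0) -> int:
    """Return a conservative number of parallel link jobs for a given RAM size.

    Follows the LLVM guideline of one link job per 15 GB of RAM.
    The clamp to a minimum of 1 is an assumption: a build still needs
    one link job, even if it then has to lean on swap.
    """
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return max(1, int(ram_gb // gb_per_job))


if __name__ == "__main__":
    # RAM sizes mentioned in the thread (12, 512) plus the docs' 32 GB example.
    for ram in (12, 32, 512):
        jobs = max_link_jobs(ram)
        print(f"{ram} GB RAM -> -DLLVM_PARALLEL_LINK_JOBS={jobs}")
```

  By this rule of thumb a 12 GB machine gets a single link job. Whether the Fil-C build scripts set this option is not stated in the thread.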
  

1vuio0pswjnm7 3 months ago

For those who might miss it, the notes cite a new 64-bit version of cdb that supports exabyte databases

https://cdb.cr.yp.to

Also maybe of interest is that the new cdb subdomain is using pqconnect instead of dnscurve

Panino 3 months ago
> Also maybe of interest is that the new cdb subdomain is using pqconnect instead of dnscurve
This is not correct. There isn't a cdb subdomain because cdb.cr.yp.to doesn't have NS records, which is where DNSCurve fits in. If you have a DNSCurve resolver, then your queries for cdb.cr.yp.to will use DNSCurve and will be sent to the yp.to nameservers.
From there, if you have pqconnect, your http(s) connection to cdb.cr.yp.to will happen over pqconnect.
Maybe the confusion is because both DNSCurve and pqconnect encode pubkeys in DNS, but they do different things.
Here is DNSCurve:
$ dig +short ns yp.to
uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to.
Here is pqconnect:
$ dig +short cdb.cr.yp.to
pq1htvv9k4wkfcmpx6rufjlt1qrr4mnv0dzygx5mlrjdfsxczbnzun055g15fg1.yp.to.
131.193.32.108
Like CurveCP, pqconnect puts the pubkey into a CNAME.
1vuio0pswjnm7 3 months ago
RFC 1034 Domain Concepts and Facilities November 1987 [Page 8]
"A domain is identified by a domain name, and consists of that part of the domain name space that is at or below the domain name which specifies the domain. A domain is a subdomain of another domain if it is contained within that domain. This relationship can be tested by seeing if the subdomain's name ends with the containing domain's name. For example, A.B.C.D is a subdomain of B.C.D, C.D, D, and " "."
1 cdb.cr.yp.to - regular DNS:
124 bytes, 1+2+0+0 records, response, noerror
query: 1 cdb.cr.yp.to
answer: cdb.cr.yp.to 30 CNAME pq1jbw2qzb2201xj6pyx177b8frqltf7t4wdpp32fhk0w3h70uytq5020w020l0.yp.to
answer: pq1jbw2qzb2201xj6pyx177b8frqltf7t4wdpp32fhk0w3h70uytq5020w020l0.yp.to 30 A 131.193.32.109
In the terminology of RFC1034, cdb.cr.yp.to, a CNAME, can be described as a subdomain of cr.yp.to and yp.to
(NB. The pq1 portion is not a public key, it is a hash of a server's long-term public key)
- 1vuio0pswjnm7 3 months ago
  
  Correction: s/a CNAME/an alias/
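The suffix test quoted from RFC 1034 above can be sketched in a few lines. This is a minimal illustration, not a full DNS name parser: the function name `is_subdomain` is hypothetical, and comparing whole labels case-insensitively (rather than raw string suffixes) is a choice made here so that "notyp.to" is not mistaken for a name under "yp.to".

```python
def is_subdomain(name: str, domain: str) -> bool:
    """RFC 1034 test: does `name` lie at or below `domain` in the name space?

    Comparison is by whole labels and case-insensitive. Trailing dots are
    ignored, and the empty string stands for the root domain. A name counts
    as being within itself, since the test is "ends with".
    """
    name_labels = [label for label in name.lower().rstrip(".").split(".") if label]
    domain_labels = [label for label in domain.lower().rstrip(".").split(".") if label]
    if not domain_labels:
        return True  # every name is under the root
    if len(domain_labels) > len(name_labels):
        return False
    return name_labels[-len(domain_labels):] == domain_labels


if __name__ == "__main__":
    for parent in ("cr.yp.to", "yp.to", "to", ""):
        print(repr(parent), is_subdomain("cdb.cr.yp.to", parent))
```

Under this test cdb.cr.yp.to is within cr.yp.to, yp.to, to, and the root, regardless of which record types exist at the name.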

1vuio0pswjnm7 3 months ago

The PQConnect documentation, specifically the document "INSTALL.md", describes the pq1 portion of the CNAME as a subdomain.

   Please update your DNS A/AAAA records for all domains on this server as follows:

   Existing record:
   Type    Name        Value
   A/AAAA  SUBDOMAIN   IP Address

   New Records:
   Type    Name        Value
   CNAME   SUBDOMAIN   pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD
   A/AAAA  pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  IP Address
   TXT    pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD    p=42424
   TXT    ks.pq1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.DOMAIN.TLD    ip=IP ADDRESS;p=42425"
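Given that record layout, a client-side check for whether a CNAME target carries a pq1 label might look like the sketch below. The helper name `pq1_label` is hypothetical, and it checks only the "pq1" prefix; the length and encoding of the rest of the label are deliberately not validated, since the excerpt above does not specify them.

```python
from typing import Optional


def pq1_label(cname_target: str) -> Optional[str]:
    """Return the leading "pq1..." label of a CNAME target, or None.

    Only the "pq1" prefix is checked. Nothing about the remaining
    characters is verified, so this is a heuristic, not a validator.
    """
    first = cname_target.rstrip(".").split(".", 1)[0].lower()
    if first.startswith("pq1") and len(first) > 3:
        return first
    return None


if __name__ == "__main__":
    # Names taken from the dig output quoted earlier in the thread.
    print(pq1_label("pq1htvv9k4wkfcmpx6rufjlt1qrr4mnv0dzygx5mlrjdfsxczbnzun055g15fg1.yp.to."))
    print(pq1_label("uz5jmyqz3gz2bhnuzg0rr0cml9u8pntyhn2jhtqn04yt3sm5h235c1.yp.to."))
```

The second name is the DNSCurve nameserver label, which starts with "uz5" and so is not reported as a pq1 label.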

1vuio0pswjnm7 3 months ago
Use of pqconnect at yp.to is probably old news but the cdb.cr.yp.to CNAME does appear to be new as of around 21 Oct
The notes on using Fil-C were submitted three days ago
https://news.ycombinator.com/item?id=45765718
- 1vuio0pswjnm7 3 months ago
  
  s/CNAME/alias/
loeg 3 months ago

https://news.ycombinator.com/item?id=45663435 (discussed 11d ago)

lquidfire 3 months ago

Cool project! I take it the goal is that, overhead being acceptable, most C / C++ programmes don't actually "have to be" rewritten in something like Rust?

I wonder how / where Epic Games comes in?

pornel 3 months ago
Note that Fil-C is a garbage-collected language that is significantly slower than C.
It's not a target for writing new code (you'd be better off with C# or golang), but something like sandboxing with WASM, except that Fil-C crashes more precisely.
- thesz 3 months ago
  
  From the topic starter: "I've posted a graph showing nearly 9000 microbenchmarks of Fil-C vs. clang on cryptographic software (each run pinned to 1 core on the same Zen 4). Typically code compiled with Fil-C takes between 1x and 4x as many cycles as the same code compiled with clang"
  Thus, Fil-C compiled code is 1 to 4 times as slow as plain C. This is not in the "significantly slower" ballpark, like where most interpreters are. The ROOT C/C++ interpreter is 20+ times slower than binary code, for example.
  
- quotemstr 3 months ago
  
  WASM is a sandbox. It doesn't obviate memory safety measures elsewhere. A program with a buffer overflow running in WASM can still be exploited to do anything that program can do within the WASM sandbox, e.g. disclose information it shouldn't. WASM ensures such a program can't escape its container, but memory safety bugs within a container can still be plenty harmful.
  
- galangalalgol 3 months ago
  
  What languages do people who consider C an option for a new project also consider? Rust is the obvious one we aren't going to discuss because then we won't be able to talk about anything else. Zig is probably almost as well loved and defended, but it isn't actually memory safe; it just makes memory safety much easier. As you say, C# and Go, and maybe also F# and OCaml; if we are just writing simple C-style stuff, none of those would look all that different. Go has some UB related to concurrency that people run into, but most of these simple utilities are either single-threaded or fine-grained parallel, which is pretty easy to get right. Julia too, maybe?
  
- vacuity 3 months ago
  
  A GC lang isn't necessarily significantly slower than C. You should qualify your statements. Moreover, this is a variant of C, which means that the programs are likely less liberal with heap allocations. It remains to be seen how much of a slowdown Fil-C imposes under normal operating conditions. Moreover, although it is indeed primarily suited for existing programs, its use in new programs isn't necessarily worse than, e.g., C# or Go. If performance is the deciding factor, probably use Rust, Zig, Nim, D, etc.
- fithisux 3 months ago
  
  Test with Fil-C, compile with gcc into production. Easy.
ibejoeb 3 months ago

Filip of Fil-C is at Epic. Epic owns the copyright.

testdelacc1 3 months ago

For those, like me, that didn’t know what Fil-C is:

> Fil-C is a fanatically compatible memory-safe implementation of C and C++. Lots of software compiles and runs with Fil-C with zero or minimal changes. All memory safety errors are caught as Fil-C panics. Fil-C achieves this using a combination of concurrent garbage collection and invisible capabilities (InvisiCaps). Every possibly-unsafe C and C++ operation is checked. Fil-C has no unsafe statement and only limited FFI to unsafe code.

https://fil-c.org/

The posted article has a detailed explanation of djb successfully compiling a bunch of C and C++ codebases.

commandersaki 3 months ago
I guess to get on board with this, it is my understanding you have to accept the premise of a Garbage Collector in the runtime?
- mbrock 3 months ago
  
  Note that it is a garbage collector designed and implemented by one of the most experienced GC experts on earth. He previously designed and implemented WebKit's state of the art concurrent GC, for example. So—yes, but don't dismiss it too quickly.
  
- thomasmg 3 months ago
  
  The author of Fil-C does have some ideas to avoid a garbage collector [1], in summary: Use-after-free at worst means you might see an object of the same size, but you can not corrupt data structures (no pointer / integer confusion). This would be more secure than standard C, but less secure than Fil-C with GC.
  [1] https://x.com/filpizlo/status/1917410045320650839
- kragen 3 months ago
  
  So far we haven't found a viable alternative; CHERI has holes in its temporal integrity guarantees.
  

HexDecOctBin 3 months ago

Can a program be written only partially in Fil-C? That is to say, can we link regular C and Fil-C object files in a single executable?

jitl 3 months ago
> There is no interoperability with Yolo-C (i.e. classic C). This is both a goal and the outcome of a non goal.
https://fil-c.org/runtime
(worth reading, i think all the stuff Fil writes is both super informative & quite entertaining.)
- HexDecOctBin 3 months ago
  
  This is disappointing. I can write the networking parts in Rust and the rest of the program in C, but apparently can't do the same with Fil-C.
  

fngjdflmdflg 3 months ago

Is there a reason that some of the linked benchmarks, if I'm reading them right, have Fil-C running faster than C?[0] I assume it's just due to micro-benchmark variability but I'm curious. Some of them seem impossibly fast compared to C, so I wonder if there are some correctness issues there.

[0] https://cr.yp.to/2025/20251028-filcc-vs-clang.html

jeffjeffbear 3 months ago
Usually garbage collection does improve a lot of benchmarks; just look at the Hans Boehm GC benchmarks.
- fragmede 3 months ago
  
  Back in the day, the cheat was to set up the GC so that the GC happened outside the timed portion of the benchmark. You know what's faster than the fastest GC? Not doing it.
  
- fngjdflmdflg 3 months ago
  
  The two extreme outliers I see are labeled "aead/clx192q/opt,-O3" and "aead/schwaemm128128v2/opt,-Os" according to clicking on the points with devtools. aead/schwaemm128128v2/opt,-Os looks like it is almost at 0x. 1x is at about y = 659 and that test is at 769 out of I guess 780 based on the graph.

dang 3 months ago

Fil-C: A memory-safe C implementation - https://news.ycombinator.com/item?id=39542944 (Feb 2024)

transpute 3 months ago
Thanks for the subthread discussion links, e.g. authors of LuaJIT and Fil-C, https://news.ycombinator.com/item?id=40556083 (June 2024)
- jitl 3 months ago
  
  Mike Pall is the author of LuaJIT.
  

sheepscreek 3 months ago

I’m glad Fil’s work is finally getting the recognition it deserves.

There may be useful takeaways here for Rust’s “unsafe” mode - particularly for applications willing to accept the extra burden of statically linking Fil-C-compiled dependencies. Best of both worlds!

gpm 3 months ago
> particularly for applications willing to accept the extra burden of statically linking Fil-C-compiled dependencies. Best of both worlds!
As near as I can tell Fil-C doesn't support this, or any other sort of FFI, at all. Nor am I sure FFI would even make sense, it seems like an approach that has to take over the entire program so that it can track pointer provenance.
- hedgehog 3 months ago
  
  For securing and maintaining a complex legacy application it seems like a reasonable approach would be to move the majority into Fil-C, then hook the bits that don't fit up via RPC. Maybe some bits get formal verification, rewritten in Rust, ported to new platform APIs, whatever, but at least you get some safety for the whole app without a rewrite.
- quotemstr 3 months ago
  
  He could add an API to mint a capability out of thin air. It could even be done out of process.
  In fact, I think Fil-C and CHERI could implement 90% the same programmer-level API!

erichocean 3 months ago

I would really like to see Omarchy go this direction. A fully memory-safe userland for Omarchy is possible with existing technology.

timeon 3 months ago
Can you elaborate why Omarchy? I'm asking, in context of recompiling with Fil-C, because that seems to be just Arch + configurations.
- erichocean 3 months ago
  
  For cultural reasons, I would like Omarchy—culturally—to adopt straightforward security as one of their goals, in addition to usability and beauty.
  It's low hanging fruit, and a great way to further differentiate their Linux distribution.

gkfasdfasdf 3 months ago

Does Fil-C catch uninitialized memory reads?

jitl 3 months ago

malloc'd memory is zeroed in fil-c:
> *zgc_alloc*
> Allocate count bytes of zero-initialized memory. May allocate slightly more than count, based on the runtime's minalign (which is currently 16).
> This is a GC allocation, so freeing it is optional. Also, if you free it and then use it, your program is guaranteed to panic.
> libc's malloc just forwards to this. There is no difference between calling malloc and zgc_alloc.
from https://fil-c.org/stdfil

Slothrop99 3 months ago

Great to see some 3letter guy into this. This might be one of those rando things which gets posted on HN (and which doesn't involve me in the slightest), but a decade later is taking over the world. Rust and Go were like that.

Previously there was that Rust in APT discussion. A lot of this middle-aged linux infrastructure stuff is considered feature-complete and "done". Not many young people are coming in, so you either attract them with "heyy rewrite in rust" or maybe the best thing is to bottle it up and run in a VM.

mesrik 3 months ago
>Great to see some 3letter guy into this
AFAIK, for many people djb has been far more than "some 3letter guy" for about thirty years, but perhaps that's just an age-related gap for those who haven't been around as long.
https://en.wikipedia.org/wiki/Daniel_J._Bernstein
- Slothrop99 3 months ago
  
  Just to be clear, I mean to venerate Bernstein for earning his 3letters, not to trivialize him.
  
- pixelpoet 3 months ago
  
  It's wild how much he looks like ryg, another 3 letter genius

quotemstr 3 months ago

I can't wait for all the delicious four-way flamewars. Choose your fighter!

1) Rewrite X in Rust

2) Recompile X using Fil-C

3) Recompile X for WASM

4) Safety is for babies

There are a lot of half baked Rust rewrites whose existence was justified on safety grounds and whose rationale is threatened now that HN has heard of Fil-C

Klonoar 3 months ago
Fil-C has come up on HN plenty of times before. If it was going to make much of a dent in the discussions, it would have by now.
- quotemstr 3 months ago
  
  It's strange how ideas seem to explode at random into the discourse despite being known for a long time. It's as if some critical mass stumbles on a thing and it becomes "the current thing" everyone talks about until the next current thing.
- jitl 3 months ago
  
  odd fallacy. things grow in popularity / awareness over time
ddalex 3 months ago

I'm in camp 2.
dev_l1x_be 3 months ago

We have a saying that jam is made of fruit that gave up the fight to become brandy.
Rebelgecko 3 months ago
Obviously someone needs to rewrite Rust in Fil-C
- pizlonator 3 months ago
  
  Yeah since Fil-C is just an LLVM transform we could make Rust memory safe with it
int_19h 3 months ago
It's not an either-or (well, except for this last item).
It seems sensible to not write new software in plain C. Rust is certainly a valid choice for a safer language, but in many cases overkill wrt how painful the rewrite is vs benefits gained from avoiding a higher-level memory-safe one like OCaml.
At the same time, "let's just rewrite everything!" is also madness. We have many battle-tested libraries written in C already. Something like Fil-C is badly needed to keep them working while improving safety.
And as for wasm, it's sort of orthogonal - whether you're writing in C or in Rust, the software may be bug-free, but sandboxing it may still be desirable e.g. as a matter of trust (or lack thereof). Also, cross-platform binaries would be nice to have in general.
- vacuity 3 months ago
  
  > the software may be bug-free, but sandboxing it may still be desirable e.g. as a matter of trust (or lack thereof)
  Wouldn't the only cause of mistrust be bugs, or am I missing something? If the program is malicious, sandboxing isn't the pertinent action.
  

fjfaase 3 months ago

I am a bit surprised that the build_all_fast_glibc.sh script requires 31Gbyte of memory to run. Can somebody explain? I would like to try out Fil-C.

ComputerGuru 3 months ago

Building and linking llvm sucks.

scandox 3 months ago

Interesting to see some bash curl being used by a renowned cryptologist...

IshKebab 3 months ago
Almost like it's actually fine.
https://medium.com/@ewindisch/curl-bash-a-victimless-crime-d...
- uecker 3 months ago
  
  It is definitely not fine. The argument seems to be that since you need to trust somebody, curl | bash is fine because you just trust whoever controls the webserver. I think this is missing the point.
  
- oguz-ismail 3 months ago
  
  [flagged]

jeffrallen 3 months ago

Wish we were talking about making Fil-C required for apt, not Rust...

phicoh 3 months ago
Those seem to be independent issues. Fil-C is about the best way to compile/run C code.
Rust would be about what language to use for new code.
Now that I have been programming in Rust for a couple of years, I don't want to go back to C (except for some hobby projects).
- thomasmg 3 months ago
  
  I agree. The main advantage of Fil-C is compatibility with C, in a secure way. The disadvantages are speed and garbage collection. (Though I have read that garbage collection might not be needed in some cases; I would be very interested in knowing more details.)
  For new code, I would not use Fil-C. For kernel and low-level tools, other languages seem better. Right now, Rust is the only popular language in this space that doesn't have these disadvantages. But in my view, Rust also has issues, especially the borrow checker and code verbosity. Maybe in the future there will be a language that resolves these issues as well (as a hobby, I'm trying to build such a language). But right now, Rust seems to be the best choice for the kernel (for code that needs to be fast and secure).
  
dontlaugh 3 months ago
Fil-C is slow.
There is no C or C++ memory safe compiler with acceptable performance for kernels, rendering, games, etc. For that you need Rust.
The future includes Fil-C for legacy code that isn’t performance sensitive and Rust for new code that is.
- drnick1 3 months ago
  
  No, Rust is awful for game development. It's not really what it was intended for. For one, all the graphics APIs are in C, so you would have to use unsafe FFI basically everywhere.
- sibellavia 3 months ago
  
  How slow? In some contexts, the trade-off might be acceptable. From what I've seen in pizlonator's tweets, in some cases the difference in speed didn't seem drastic to me.
  
- Rebelgecko 3 months ago
  
  I imagine Apt is usually IO constrained?
  
- mbrock 3 months ago
  
  What does that have to do with apt?
  
oddmiral 3 months ago
I wish we had something like Fil-C as an option for unsafe Rust.
- arthur2e5 3 months ago
  
  Fil-C works because you recompile the whole C userspace. Unsafe Rust doesn't do that... and for many practical purposes you probably want to touch the non-safe-version of the C userspace.
  Still, it's all LLVM, so perhaps unsafe Rust for Fil-space can be a thing, a useful one for catching (what would be) UBs even [Fil-C defines everything, so no UBs, but I'm assuming you want to eventually run it outside of Fil-space].
  Now I actually wonder if Fil-C has an escape hatch somewhere for syscalls that it does not understand etc. Well it doesn't do inline assembly, so I shouldn't expect much... I wonder how far one needs to extend the asm clobber syntax for it to remotely come close to working.
  
- simonask 3 months ago
  
  Unsafe Rust actually has a great runtime analyzer: Miri. It's very easy to just run `cargo +nightly miri test` in your project to get some confidence in the more questionable choices along the way.
lucyjojo 3 months ago

Doesn't it only work on x86_64?

nitinreddy88 3 months ago

Building tools is one thing; building a system like Postgres or another database is going to be another.

Has anyone actually tried building PG or MySQL, or a similarly complex system that relies heavily on I/O and multithreading?

mbrock 3 months ago
Look at how fanatic the compatibility actually is. Building Postgres or MySQL is conceivable but probably will require some changes. (SQLite compiles and runs with zero changes right now.)
- SQLite 3 months ago
  
  SQLite runs about 5 times faster compiled with GCC (13.3.0) than it does when compiled with FIL-C. And the resulting compiled binary from GCC is 13 times smaller.
  
- kragen 3 months ago
  
  Thanks for checking! I was wondering.
  

stevefan1999 3 months ago

djb uses a surprisingly small amount of RAM (12GB), considering my laptop already has 64GB, which can be expanded to 128GB in the future.

blessinghit 3 months ago

[dead]

twic 3 months ago

> Debian using Fil-C (Filian?)

DJB SMACKER CONFIRMED?!