Comment by nemo1618
8 hours ago
This strikes me as a very agent-friendly problem. Given a harness that enforces sufficiently-rigorous tests, I'm sure you could spin up an agent loop that methodically churns through these functions one by one, finishing in a few days.
hallucinations in a libc implementation would be especially bad
Have you ever used an LLM with Zig? It will generate syntactically invalid code. Zig breaks so often and LLMs have such an eternally old knowledge cutoff that they only know old ass broken versions.
The same goes for TLA+ and all the other obscure things people think would be great to use with LLMs, and they would, if there was as much training data as there was for JavaScript and Python.
i find claude does quite well with zig. this project is like > 95% claude, and it's an incredibly complicated codebase [0] (which is why i am not doing it by hand):
https://github.com/ityonemo/clr
[0] generates a dynamically loaded library which does sketchy shit to access the binary representation of datastructures in the zig compiler, and then transpiles the IR to zig code which has to be rerun to do the analysis.
To be fair, this was true of early public LLMs with rust code too. As more public zig repositories (and blogs / docs / videos) come online, they will improve. I agree it's a mess currently.
You must have not tried this with an LLM agent in the past few months.
i tested sonnet 4.5 just last week on a zig codebase and it has to be instructed the std.ArrayList syntax every time.
2 replies →