Comment by thesz

20 days ago

  > I think the missing bit here is that this only works for cases where there's a really large test set (the html spec, the linux kernel). I'm not convinced that the models would be able to maintain coherence without this, so maybe that's what we need to figure out how to build to make this actually work.

Take any language with a compiler and several thousand users and you have plenty of tests that approximate the spec from both the inside and the outside.

Here, for example, is the VHDL test suite for GHDL, an open source VHDL compiler and simulator: https://github.com/ghdl/ghdl/tree/master/testsuite

To my knowledge, the GHDL test suite is sufficient and general enough to develop a pretty capable clone. As far as I know, there is only one open source VHDL compiler, and it is written in Ada. And, again, the expertise to implement another one from scratch, so that an LLM could be trained on it, is very, very scarce - VHDL, being a highly parallel variant of Ada, is quirky as hell.

So someone can test your hypothesis on VHDL - agent-code a VHDL compiler and simulator in Rust so that it passes the GHDL test suite. Would it take two weeks and $20,000, as with C? I don't know, but I really doubt it.