Comment by tredre3

2 days ago

In the last paragraph you handwave that all the Z80 and ZX Spectrum documentation is likely already in the model anyway... Choosing not to provide the documents/websites might then require more prompting to finish the emulator, but the knowledge is there. You can't clean room with a large LLM. That's delusion!

Counterpoint: in December, a Polish MP [0] vibe-coded an interpreter [1] for a 1959 Polish programming language by feeding the model the available documentation. _That,_ at least, is unlikely to have appeared in the model’s training data.

[0]: https://en.wikipedia.org/wiki/Adrian_Zandberg
[1]: https://sako-zam41.netlify.app/

  • Not exactly a counterpoint, since nobody argued that LLMs cannot produce "original" code from specs at all, just that this particular exercise was not clean room.

    (That said, SAKO [1] is an average 1960 programming language, just with keywords in Polish, so producing an interpreter for it is almost trivial for an LLM, since construction by analogy is the bread and butter of LLMs. Also, such interpreters tend to be an order of magnitude less complex than emulators.)

    [1]: https://en.wikipedia.org/wiki/SAKO_(programming_language)