Comment by jaen

1 day ago

I mean, for an article that's titled "clean room", that would be the first thing to do, not a "maybe follow up in the future"...

(I do think the article could have stood on its own without mentioning anything about "clean room", which is a very high standard.)

As for the handwavy point about the x86 assembler: I am quite sure the LLM can recall the entirety of the x86 instruction set without any reference. The real problem is having a very well-tuned agentic loop with no context pollution to extract it (which you won't get by YOLOing Claude, because LLMs aren't yet meta-RLed enough to correct their own context/prompt-engineering problems).

Alternatively, to exploit context pollution, take half of an open-source project, let the LLM fill in the rest (try to imagine the synthetic "prompt" it was given when training on this repo), and see how far the result is from the actual version.
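
A rough sketch of how that comparison could be scored, assuming you already have the LLM's completions as strings. The helper names (`split_project`, `similarity`, `score_completions`) are made up for illustration, the alphabetical half-split is an arbitrary choice, and line-level `difflib` similarity is just one possible distance measure, not a claim about the right one:

```python
import difflib


def split_project(files: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split a {path: source} mapping into a visible half and a held-out half.

    Assumption: a simple alphabetical split; a real experiment might split
    by module or by commit history instead.
    """
    paths = sorted(files)
    mid = len(paths) // 2
    visible = {p: files[p] for p in paths[:mid]}
    held_out = {p: files[p] for p in paths[mid:]}
    return visible, held_out


def similarity(generated: str, actual: str) -> float:
    """Line-level similarity in [0, 1]; 1.0 means the files are identical."""
    matcher = difflib.SequenceMatcher(
        None, generated.splitlines(), actual.splitlines()
    )
    return matcher.ratio()


def score_completions(
    generated: dict[str, str], held_out: dict[str, str]
) -> dict[str, float]:
    """Score each held-out file against the LLM's attempt at it.

    Files the LLM did not produce at all are compared against an empty string.
    """
    return {
        path: similarity(generated.get(path, ""), source)
        for path, source in held_out.items()
    }
```

You would show the model only the `visible` half, collect whatever it writes for the held-out paths, and pass that to `score_completions`. Scores near 1.0 on files it never saw in context would point to memorization rather than independent reimplementation.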