Comment by andai
7 hours ago
I had GPT-4 design and build a GPT-4-powered Python programmer in 2023. It was capable of self-modification and built itself out after the bootstrapping phase (where I copy-pasted chunks of code based on GPT-4's instructions).
It wasn't fully autonomous (reliability was a bit low -- e.g. I had to extract the code from code fences programmatically), and it wasn't fully original (I stole most of it from Auto-GPT, except that I was operating on the AST directly due to the token limitations).
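The fence-stripping was basically a regex pass over the model's reply. A minimal sketch of the idea (the pattern and helper name here are illustrative, not the original code):

```python
import re

# Illustrative sketch: pull Python source out of markdown code fences
# in an LLM reply, falling back to the raw text if no fences are found.
FENCE = re.compile(r"```(?:python)?\s*\n(.*?)```", re.DOTALL)

def extract_code(reply: str) -> str:
    blocks = FENCE.findall(reply)
    return "\n\n".join(blocks) if blocks else reply
```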
My key insight here was that I let GPT design the APIs that it itself was going to use. This makes perfect sense to me given how LLMs work: you tell it to reach for a function that doesn't exist, and then you ask it to make that function exist based on how it reached for it. The resulting design matches its expectations perfectly.
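Concretely, the loop looked something like this sketch (ask_llm is a hypothetical placeholder for whatever chat-completion call you use, and NameError.name needs Python 3.10+):

```python
# Hypothetical sketch of the bootstrap loop, not the original code.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for an actual API call

def run_task(task: str, namespace: dict) -> None:
    code = ask_llm(f"Write Python that accomplishes: {task}")
    while True:
        try:
            exec(code, namespace)
            return
        except NameError as e:  # the model reached for a missing function
            missing = e.name  # Python 3.10+; e.g. "summarize_file"
            # Ask the model to define the function to fit its own call
            # site, so the implementation matches its expectations.
            impl = ask_llm(
                f"The code below calls `{missing}`, which is undefined. "
                f"Define `{missing}` so the code works as written:\n\n{code}"
            )
            exec(impl, namespace)
```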
GPT-4 now considers self-modifying AI code to be extremely dangerous and doesn't like talking about it. Claude's safety filters began shutting down similar conversations a few months ago, suggesting the user switch to a dumber model.
It seems the last generation or two of models passed some threshold regarding self-replication (a distinct but closely related concept), and the labs got spooked. I haven't heard anything about this in public, though.
Edit: It occurs to me now that "self modification and replication" is a much more meaningful (and measurable) benchmark for artificial life than consciousness is...
BTW, for reference, the thing that tripped Claude's safety trigger was "Did PKD know about living information systems?"