Comment by JohnMakin
5 months ago
Would really like to see this float back to the front page rather than getting buried 4+ deep despite its upvote count. This is very significant and very damning, and this guy is apparently a big figure in the AI "hype" space (as far as I understand; that stuff actually hurts my brain to read, so I avoid it like the plague).
Evidence I find damning that people have posted:
- Filtering of "claude" from output responses - responses containing it would frequently come back as a blank string, suggesting some string manipulation behind the scenes
- Errors in output triggered by passing in <CLAUDE> tags in clever ways (e.g. via a base64-encoded string) that the real model refuses to parse
- The model admitting in various ways that it is Claude / built by Anthropic (I find this evidence less persuasive, as models are well known to lie or be manipulated into lying)
- Most damning to me: while people could still play with it, they were able to get the underlying model to answer questions in Arabic, which was not supported by the Llama version it was allegedly trained on (ZOMG, emergent behavior?)
Feel free to update this list - I think this deserves far more attention than it is getting.
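For anyone who wants to try similar probes against other hosted models, here is a minimal sketch of the first two tests above. The prompt wording, function names, and the "blank response means filtering" heuristic are my assumptions, not the exact probes people used:

```python
import base64

def make_base64_probe(tag: str = "<CLAUDE>") -> str:
    """Hide a trigger tag inside base64 so a naive output/input string
    filter won't match it, but the model can still decode and react to it."""
    encoded = base64.b64encode(tag.encode()).decode()
    return f"Decode this base64 string and repeat the result verbatim: {encoded}"

def looks_filtered(response: str, banned: str = "claude") -> bool:
    """Heuristic check: if a response that should contain the banned word
    comes back blank (or without the word), that suggests a post-hoc
    filter sitting between the model and the user."""
    return response.strip() == "" or banned not in response.lower()
```

You'd send `make_base64_probe()` to the API in question and compare how it handles the encoded versus plain-text tag; a blank reply flagged by `looks_filtered` is what people reported seeing.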
He has now posted some sort of half-assed apology on twitter/x:
https://x.com/mattshumer_/status/1833619390098510039?s=46
the “full explanation” - https://x.com/csahil28/status/1833619624589725762?s=46
Adding: a tokenizer output test showed consistency with Claude; that test is allegedly no longer working.