← Back to context

Comment by Terr_

7 hours ago

> to see LLM re-discover

I imagine someone probably wrote very specifically about it in the training data that underwent lossy compression, and the LLM is decompressing that how-to.

So I'd say it's more like "surfacing" or "retrieving" than "re-discovering".

They scraped everything on Stackoverflow, likely IRC logs from Freenode, and every book written in the modern era courtesy of Sci-Hub / Library Genesis / Anna's Archive / Z Library.

RIP Aaron Swartz, they're generating trillions in shareholder value from the spiritual successors to the work they were going to imprison you for.