Comment by richardw

1 year ago

LLMs are a compressed, lossy form of our combined writing output, which turns out to be structured consistently enough that new combinations of text seem reasonable, even enough to display simple reasoning. I find it useful to think “what can I expect from speaking with the dataset of combined writing of people”, rather than treating a basic LLM as a mind.

That doesn’t mean we won’t end up approximating one eventually, but it’s going to take a lot of real human thinking first. For example, ChatGPT writes code to solve some questions rather than reasoning them out from text; the LLM is not doing the heavy lifting in that case.

Give it (some) 3D questions, or anything where there aren’t massive textual datasets, and you often need to break out to specialised code, as in the sketch below.
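To make that concrete, here’s a minimal sketch of the “break out to specialised code” pattern. Everything here is hypothetical (ask_llm is a stub standing in for a real model call): questions with a known numeric structure get routed to exact code, and only the fuzzy ones go to the model.

```python
import math

def distance_3d(p, q):
    # Exact 3D distance, computed by ordinary code, not by the model.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def ask_llm(question: str) -> str:
    # Hypothetical stub standing in for a real model call; in practice
    # this would hit an LLM API and return free-form text.
    return f"(model's best guess for: {question})"

def answer(question: str, points=None):
    # Route questions with known numeric structure to specialised code;
    # fall back to the model for everything else.
    if points is not None:
        return f"{distance_3d(*points):.4f}"
    return ask_llm(question)

print(answer("distance between these points", points=((0, 0, 0), (1, 2, 2))))  # 3.0000
print(answer("Who said 'be the change you wish to see'?"))
```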

Another thought I find useful: it considers its job done when it has produced enough reasonable-looking tokens, not when it has actually solved the problem. You and I would continue to ponder the edge cases; it’s just happy if there are 1000 tokens that look approximately like its dataset. Agents make that a bit smarter, but they’re still limited by the same goal: each is happy once it has produced its required token quota, missing, for example, implications that we’d see instantly. Obviously we’re smart enough to keep filling those gaps.
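As a toy caricature of that stopping criterion (model_step is a hypothetical stand-in for one decoding step of a real model): generation halts on a token budget or an end-of-sequence token, and nothing in the loop ever asks whether the problem is solved.

```python
def generate(model_step, prompt_tokens, max_tokens=1000, eos=0):
    # Toy decoding loop: the only stopping conditions are the token
    # budget and an end-of-sequence token. No step checks correctness.
    out = list(prompt_tokens)
    for _ in range(max_tokens):
        nxt = model_step(out)  # hypothetical: returns the next token id
        if nxt == eos:
            break
        out.append(nxt)
    return out  # "done" = budget exhausted or EOS hit, not problem solved
```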

"I find it useful to think “what can I expect from speaking with the dataset of combined writing of people”, rather than treating a basic LLM as a mind."

I've been doing this as well; mentally, I think of LLMs as the librarians of the internet.

  • They're bad librarians. Not bad in general; they just do a bad job of being librarians, and that's a good thing! They can't quite give you the exact quote, but they recall the gist. They're not sure it was Gandhi who said that thing, but they think it was; it might be in this post, or perhaps one of these. They'll point you to the right section of the library to find what you're after, but make sure you verify it!