Comment by thomassmith65

10 days ago

These articles (both positive and negative) are probably popular because it's impossible really to get a rich understanding of what LLMs can do.

So readers want someone to tell them some easy answer.

I have as much experience using these chatbots as anyone, and I still wouldn't claim to know what they are useless at and what they are great at.

One moment, an LLM will struggle to write a simple state machine. The next, it will write a web app that physically models a snare drum.
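For concreteness, a "simple state machine" here means something like the following minimal sketch (a hypothetical traffic-light example, not taken from the thread): a transition table plus a step function.

```python
# Minimal sketch of a "simple state machine": a traffic light
# cycling through fixed states via a transition table.
# (Hypothetical illustration; not from the thread.)

TRANSITIONS = {
    "green": "yellow",
    "yellow": "red",
    "red": "green",
}

def step(state: str) -> str:
    """Advance the machine one step along the transition table."""
    return TRANSITIONS[state]

state = "green"
for _ in range(3):
    state = step(state)
print(state)  # after three steps the light cycles back to green
```

The point of the anecdote stands either way: a task this mechanical can trip a model up on one prompt, while a far harder task succeeds on the next.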

Considering the popularity of research papers trying to suss out how these chatbots work, nobody - nobody in 2025, at least - should claim to understand them well.

> nobody - nobody in 2025, at least - should claim to understand them well

Personally, this is enough grounds for me to reject them outright

We cannot rely on tools that no one understands

I might not personally understand how a car engine works but I trust that someone in society does

LLMs are different

> nobody - nobody in 2025, at least - should claim to understand them well

I’m highly suspicious of this claim, as the models are not something we found on an alien computer. I may accept that nobody has figured out how to extract actual usable logic from the numbers soup that is the model itself, but we do know the logic of the interactions that happen.

  • That's not the point, though. Yes, we understand why ANNs work, and we - clearly - understand how to create them, even fancy ones like ChatGPT.

    What we understand poorly is what kinds of tasks they are capable of. That is too complex to reason about; we cannot deduce that from the spec or source code or training corpus. We can only study how what we have built actually seems to function.

    • As for LLMs, that’s easy; it’s in the name. They’re good at generating text. What we are trying to do is mostly get them to generate useful text (and see if we can apply the same techniques to other types of data).

      It’s kind of the same with computers: we know the general shape of what they can do and how they do it. We are mostly trying to see whether a particular problem can be solved with one, how efficiently, and to what degree.

What is your definition of "understand them well"?

  • Not 'why do they work?' but rather 'what are they able to do, and what are they not able to do?'

    To understand why they work only requires an afternoon with an AI textbook.

    What's hard is to predict the output of a machine that synthesises data from millions of books and webpages, and does so in a way alien to our own thought processes.