Comment by hansmayer

8 hours ago

Not trying to be cynical here, but I am genuinely curious: is there a reason why these LLMs don't/can't/won't apply some deterministic algorithm? I mean, counting characters and such are problems we solved ages ago.

They can. ChatGPT has been able to count characters, words, etc. flawlessly for a couple of years now if you tell it to "use your Python tool".
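For illustration, the kind of deterministic code such a tool call might run is tiny. This is only a sketch: the helper names `count_char` and `count_words` are made up here, and it is not a claim about what ChatGPT actually generates.

```python
def count_char(text: str, char: str) -> int:
    """Count case-insensitive occurrences of a single character in text."""
    return text.lower().count(char.lower())


def count_words(text: str) -> int:
    """Count whitespace-separated words in text."""
    return len(text.split())


if __name__ == "__main__":
    print(count_char("strawberry", "r"))                # 3
    print(count_words("counting characters and such"))  # 4
```

The hard part is not the counting itself but getting the model to decide to hand the question off to code like this.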

  • Fair enough. But why do I have to tell them that? Should they not be able to figure it out themselves? If I show a 5-year-old kid once how to use colour pencils, I won't have to show them each time they want to make a drawing. This is the core weakness of the LLMs - you have to micromanage them so much that it runs counter to the core promise that has been pushed for 3+ years now.

    • Specifically for simple character-level questions, if LLMs did that automatically, we would be inundated with stories about "AI model caught cheating".

      They are stuck in a place where the models are expected to do two things simultaneously. People want them to show the peak of pure AI ability while at the same time be the most useful they can be.

      Err too much on the side of automatic tool use and people will claim you're just faking it; fail to use tools sufficiently and people will claim that the AI is incapable of operations that any regular algorithm could do.

    • If you care enough about this you can stick a note in your own custom instructions about it.

      If you allow ChatGPT to use its memory feature (I deliberately turn that off) and ask those kinds of questions often enough, it might even make a note about this itself.

I think the intuition is that they don’t ‘know’ that they are bad at counting characters and such, so they answer the same way they answer most questions.

  • Well, they can be made to use custom tools for writing to files and such, so I am not sure that is the real reason. I have a feeling it is more because of trying to make this an "everything technology".