Comment by ACCount37

1 day ago

Your mistake is thinking that the user wants an algorithm that solves Wordles efficiently. Or that making and invoking a tool is always a more efficient solution.

As opposed to: the user is a 9-year-old girl, and she has this puzzle in a smartphone game, and she can't figure out the answer, and her mom is busy, so she asks the AI, because the AI is never busy.

Now, for a single vaguely Wordle-like puzzle, how many tokens would it take to write and invoke a solver, and how many to just solve it - working around the tokenizer if necessary?

If you had a batch of 9000 puzzle questions, I can easily believe that writing and running a purpose-specific solver would be more compute-efficient. But if we're dealing with 1 puzzle question, and we're already invoking an LLM to interpret the natural-language instructions for it? Nah.

> Your mistake is thinking that the user wants an algorithm that solves Wordles efficiently. Or that making and invoking a tool is always a more efficient solution.

Weird how you start by saying the user isn't worried about solving the problem efficiently, so we might as well use the LLM directly, and then go on to argue that creating a tool might not be efficient either.

And as we know, LLMs are not very good at character-level problems, but they are relatively good at writing programs, in particular for problems we already know. LLMs might be able to solve Wordles today by straight-up guessing, just adding spaces between the letters and leaning on their very wide vocabulary, but can LLMs solve e.g. word search puzzles at all?

As you say, if there are 9000 puzzle questions, then a solver is a natural choice due to compute efficiency. But it will also answer the question, and do it without errors (here I'm overstating LLMs' abilities a bit, though; this would certainly not hold for novel problems). No "Oh, what sharp eyes you have! I'll address the error immediately!" responses are to be expected from the solver, and actually unsolvable puzzles will be identified, not "lied" about. So why not use the solver even for a single instance of the problem?
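And such a one-off solver really is tiny. A rough sketch of the filtering step for a Wordle-style puzzle (the word list and the green/yellow/grey constraints below are invented for the example, not taken from any real game):

```python
# Hypothetical one-off Wordle-style filter: keep candidate words that match
# the known (green) letters, contain the required (yellow) letters, and
# avoid the excluded (grey) letters. All data here is made up for the example.
def filter_candidates(words, green, yellow, grey):
    def ok(word):
        if len(word) != len(green):
            return False
        if any(g and w != g for w, g in zip(word, green)):
            return False
        if any(y not in word for y in yellow):
            return False
        if any(x in word for x in grey):
            return False
        return True
    return [w for w in words if ok(w)]

words = ["crane", "slate", "trace", "brace", "grace"]
# green: known letter per position (None = unknown); yellow: present somewhere; grey: absent
print(filter_candidates(words, green=[None, "r", "a", None, "e"], yellow={"c"}, grey={"s", "l"}))
```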

I think the (training) effort would be much better spent on teaching LLMs when they should use an algorithm and when they should just use the model. Many use cases are much less complicated than word-puzzle solvers and even more easily solved algorithmically: sorting a list by some criterion, for example (the list may first be augmented with LLM-generated data), and for that task too I'd rather use a deterministic algorithm than one driven by neural networks and randomness; see the sketch below.
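The deterministic half of that sorting case is a one-liner; the only thing the LLM would need to supply is the augmented data (the items and the sort key here are made up for illustration):

```python
# Hypothetical: the LLM has already augmented each item with a release year;
# the actual ordering is then done by a deterministic sort, not by the model.
games = [
    {"title": "Wordle", "release_year": 2021},
    {"title": "Tetris", "release_year": 1984},
    {"title": "Myst", "release_year": 1993},
]
for game in sorted(games, key=lambda g: g["release_year"]):
    print(game["release_year"], game["title"])
```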

E.g. Gemini, Mistral, and ChatGPT can do this already in some cases: when I asked them to "Calculate sum of primes between 0 and one million.", it looks like all of them created a piece of code to calculate it. Which is exactly what they should do. (The result was correct.)
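The kind of code such a prompt produces is short; a sieve-of-Eratosthenes sketch along these lines (my own reconstruction, not the actual output of any of those models) would do it:

```python
# Sieve of Eratosthenes: sum all primes below 1,000,000.
# A reconstruction of the kind of code the models generate for this prompt,
# not the actual output of Gemini, Mistral, or ChatGPT.
def sum_primes_below(limit):
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return sum(i for i, prime in enumerate(is_prime) if prime)

print(sum_primes_below(1_000_000))
```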