Comment by avaer

12 hours ago

That was my first thought too -- instead of making it talk like a caveman you could turn off reasoning, probably with better results.

Additionally, LLMs do not actually operate in text; much of the thinking happens in a much higher dimensional space that just happens to be decoded as text.

So unless the LLM was trained otherwise, making it talk like a caveman is, more than just theoretically, turning it into a caveman.

> much of the thinking happens in a much higher dimensional space that just happens to be decoded as text.

What do you mean by that? It’s literally text prediction, isn’t it?

  • It is text prediction. But to predict text, other things have to be calculated along the way. If you can step back for just a minute, I can offer a very simple but adjacent idea that might help to intuit the complexity of "text prediction".

    I have a list of numbers, 0 to 9, and the + and = operators. I will train my model on this dataset, except the model won't get the list; it will get a bunch of addition problems. A lot of them. But not every addition problem possible inside that space will be represented, not by a long shot, and neither will every number. Still, the model will be able to solve any math problem you can form with those symbols.

    It’s just predicting symbols, but to do so it had to internalize the concepts.
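The coverage argument above can be made concrete. This is a hedged sketch (the exact operand ranges and the 1% sampling rate are invented for illustration): it builds every `a+b=c` string in a bounded space, samples a small training set, and shows how little of the space that training set covers -- so a model solving the held-out problems cannot be doing pure lookup.

```python
# Sketch of the toy space from the comment: digits plus '+' and '='.
# Ranges and the 1% training fraction are illustrative assumptions.
import random

random.seed(0)

# Every well-formed problem over the chosen operand ranges.
problems = [f"{a}+{b}={a + b}" for a in range(1_000) for b in range(100)]

# "Train" on only 1% of the space; everything else is unseen.
train = set(random.sample(problems, k=len(problems) // 100))
held_out = [p for p in problems if p not in train]

print(f"total problems: {len(problems)}")   # 100000
print(f"training set:   {len(train)}")      # 1000
print(f"held out:       {len(held_out)}")   # 99000
```

A lookup table built from the training set answers 1% of the space; a model that answers the other 99% has to have picked up something about addition itself.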

    • >internalize the concepts.

      This gives the impression that it is doing something more than pattern matching. I think this kind of communication, where some human attribute is used to name a concept in the LLM domain, is causing a lot of damage, and ends up inadvertently inflating the hype for AI marketing.

  • There was a paper recently demonstrating that you can input different human languages and the middle layers of the model end up operating on the same probabilistic vectors. It's just the encoding/decoding layers that appear to do the language management.

    So the conclusion was that these middle layers have their own language: the model converts the text into this language and then decodes it back out. It explains why the models sometimes switch to Chinese when they have a lot of Chinese-language input, etc.
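The shape of that claim can be sketched with a toy, not the paper's actual method: an invented two-language vocabulary where the "encoder" maps surface words from either language onto one shared concept, and a "decoder" maps the concept back out in whichever language is requested. Everything here is illustrative.

```python
# Toy "interlingua": middle representation is language-agnostic,
# only the decode step picks a surface language. Vocabulary invented.
ENCODER = {  # surface token -> shared concept id
    "dog": "DOG", "chien": "DOG",   # English / French
    "cat": "CAT", "chat": "CAT",
}
DECODER = {  # (concept id, language) -> surface token
    ("DOG", "en"): "dog", ("DOG", "fr"): "chien",
    ("CAT", "en"): "cat", ("CAT", "fr"): "chat",
}

def translate(token: str, target_lang: str) -> str:
    concept = ENCODER[token]                # "middle layers": no language
    return DECODER[(concept, target_lang)]  # output layer: pick a language

print(translate("chien", "en"))  # -> dog
print(translate("dog", "fr"))    # -> chien
```

The language-switching behavior falls out naturally: if the decode step's language choice is itself probabilistic and the context is full of Chinese, Chinese surface forms win.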

    • Pretty obvious when you consider that neural networks operate on numbers and very complex formulas (built by combining many simple formulas with various weights). You can map a lot of things to numbers (words, colors, music notes, …), but that does not mean the NN is going to produce useful results.


> instead of talk like a caveman you could turn off reasoning, with probably better results

This is not how the feature called "reasoning" works in current models.

"Reasoning" simply lets the model output and then consume some "thinking" tokens before generating the actual output.

All the "fluff" tokens in the output have absolutely nothing to do with "reasoning".

You obviously do not speak other languages. Other cultures have different constraints and different grammar.

For example, thinking in modern US English generates many extra thoughts just to keep the speech correct for the cultural context (there is only one correct way to say People of Color, it changes every year, and any typo makes it horribly wrong).

Some languages are far more expressive and specialized in logical conditions, conditionals, recursion and reasoning. Like how Eskimos supposedly have 100 words for snow, but for boolean algebra.

It is well proven that thinking in Chinese needs far fewer tokens!

With this caveman mod you strip out most of the cultural complexities of the anglosphere, making it easier for foreigners and far simpler to digest.

  • >Some languages are far more expressive and specialized in logical conditions, conditionals, recursion and reasoning. Like how Eskimos supposedly have 100 words for snow, but for boolean algebra.

    This is simply not true.

    • Well, just take the various English dialects you probably know; there are vast differences. Some strange languages do not even have numbers or recursion.

      It is very arrogant to assume that no other language can be more advanced than English.

    • Really? Because if one accepts that computer languages are languages, then it seems that we could identify one or two that are highly specialized in logical conditions etc. Prolog springs to mind.
