Comment by threethirtytwo

15 hours ago

> It's pretty clear you don't have a solid background in generative models, because this is fundamentally what they do

You don’t have a solid background. No one does. We fundamentally don’t understand LLMs; that is the prevailing view in both industry and academia. Sure, there are high-level perspectives and analogies we can apply to LLMs and machine learning in general, like probability distributions, curve fitting, or interpolation, but those explanations are so high-level that they could essentially be applied to humans as well. At a lower level we cannot describe what’s going on. We have no idea how to reconstruct the logic of how an LLM arrived at a specific output from a specific input.

No deterministic function or process can produce new information from old information. This limitation is fundamental to logic and math, and it constrains human output just as much.

You can combine information, you can transform information, you can lose information. But producing genuinely new information from old information through deterministic intelligence is fundamentally impossible, for LLMs and humans alike. Note the keyword: “deterministic.”
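This can be made precise with Shannon entropy: a deterministic function of a random variable can never carry more information than its input, only the same amount or less. A toy sketch (the helper names here are mine, not any standard API):

```python
import math
from collections import Counter

def entropy(dist):
    """Shannon entropy (bits) of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def push_forward(dist, f):
    """Distribution of f(X) when X ~ dist and f is deterministic."""
    out = Counter()
    for x, p in dist.items():
        out[f(x)] += p
    return dict(out)

# X uniform over 4 symbols: 2 bits of information.
X = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

# A deterministic map can only merge outcomes, never split them,
# so H(f(X)) <= H(X) for every deterministic f.
print(entropy(X))                                   # 2.0 bits
print(entropy(push_forward(X, lambda x: x % 2)))    # 1.0 bit  (information lost)
print(entropy(push_forward(X, lambda x: x + 10)))   # 2.0 bits (merely relabeled)
```

Combining, transforming, losing: all possible. Gaining entropy through a deterministic map: not possible.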

New information can only arise through stochastic processes. That’s all reality offers: determinism versus stochasticity are literally your only two options. Given a bunch of inputs, the outputs derived from them are either purely deterministic transformations, or, if you want something new, you must apply randomness. That’s it.

That’s essentially what creativity is; there is no other logical way to generate “new information.” Pure randomness is rarely useful on its own, so “useful information” arrives only after filtering: we use past information to filter the stochastic output and select something that isn’t wildly random. We also only use randomness to perturb the output a little, so it doesn’t go too far off the rails.
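That perturb-and-select loop can be sketched in a few lines. This is a toy illustration of the general idea, not anything LLM-specific; the function and parameter names are mine:

```python
import random

def creative_search(score, start, steps=2000, noise=0.1, seed=0):
    """Perturb-and-select: random noise proposes novelty; past information
    (the score function) filters it. Pure determinism would never leave
    `start`; pure randomness would never accumulate anything useful."""
    rng = random.Random(seed)
    best = start
    for _ in range(steps):
        candidate = best + rng.gauss(0, noise)  # small stochastic perturbation
        if score(candidate) > score(best):      # selection by prior knowledge
            best = candidate
    return best

# Toy objective: the search never "sees" the answer 3.7, yet the
# randomness-plus-filtering loop finds it.
target = lambda x: -(x - 3.7) ** 2
found = creative_search(target, start=0.0)
print(round(found, 2))  # converges near 3.7
```

Neither half works alone: drop the noise and you stay at the start forever; drop the selection and you get a random walk.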

In the end it’s this combination of a stochastic process and a selection process that forms creativity. We know this is, in general terms, how creativity must work, because there is literally no other way to do it.

LLMs do have stochastic aspects (sampling at nonzero temperature), so we know for a fact they are generating new things and not just replaying the past. They can fit our definition of “creative,” and we can literally watch them be creative in front of our eyes.
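For concreteness, here is roughly where the stochasticity enters: the next token is sampled from a temperature-scaled softmax over the model’s logits. A toy sketch (not any particular library’s API):

```python
import math
import random

def sample_token(logits, temperature, rng=None):
    """Sample a next-token index from raw logits.
    temperature -> 0 collapses to deterministic argmax (no new information);
    temperature > 0 injects the stochasticity the argument above relies on."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    # inverse-CDF sampling over the softmax distribution
    r, acc = rng.random(), 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(logits) - 1

logits = [2.0, 1.0, 0.5]
print(sample_token(logits, temperature=0))  # deterministic: always index 0
print({sample_token(logits, 1.0, random.Random(s)) for s in range(50)})
```

At temperature 0 the same input always yields the same output; at temperature 1 the same input yields a spread of outputs, which is exactly the stochastic ingredient described above.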

You’re ignoring what you see with your own eyes and drawing your conclusions from a model of LLMs that isn’t fully accurate. Or you’re not fully connecting the mechanics of how LLMs work with what creativity, i.e. generating new data from past data, actually is.

The fundamental limitation of LLMs is not that they can’t create new things. It’s that the context window is too small. Whatever they create is limited to the possibilities within that window, and that caps their creativity.

What you see happening with Lean may also be a context-window issue. Give an LLM a context window far bigger than anything before, pass it all the data it needs to “learn” Lean in context, and it could likely start producing new theorems without literally being retrained.

Actually, I wouldn’t call this a “fundamental” problem. More fundamental is hallucination: LLMs producing new information from past information in the WRONG way, literally making up bullshit out of thin air. It’s the opposite of the problem you’re describing. These things are too creative, making up too much stuff.

We have hints that LLMs can internally distinguish hallucination from reality, but our ability to coax them into communicating that distinction is limited.