Comment by hackinthebochs
6 days ago
Sure, that's true as well. But I don't see this as a substantive response given that the only people making unsupported claims in this thread are those trying to deflate LLM capabilities.
So, to review this thread:
You made a pretty nonsensical argument, which seems pretty much bog-standard for these arguments.
What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here. You don't know shit and just make up whatever. You can see people doing the same thing in GPT-1, 2, 3, and 4 threads, all telling us why LLMs will never be able to do the things they manage to do later.
> You don’t know shit
lol. Why so emotionally charged? Are you perhaps worried that you’ve invested too much time and effort into a technology that may not deliver what influencers have been promising for years? Like a proverbial bagholder?
> What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here.
We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.
> You can see people doing the same thing in GPT-1, 2, 3, and 4 threads, all telling us why LLMs will never be able to do the things they manage to do later.
I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).
Now, care to provide a counterargument that shows you know a little more than “shit”?
To be clear, you are confusing me with other commenters in this thread. All I want is for those who liken LLMs to stochastic parrots, or make other deflationary claims, to offer an argument that engages with the actual structure of LLMs and what we know about them. No one seems to be up to that challenge, which makes me wonder where people's confident claims come from. I'm just tired of the half-baked claims and generic handwavy allusions that do nothing but short-circuit the potential for genuine insight.