Comment by HarHarVeryFunny

2 months ago

I'm not sure what you are saying. There are LLMs as exist today, and there are any number of changes one could propose to make to them.

The less you change, the more they stay the same. If you just add "more" RLVR (perhaps for a new domain - maybe chemistry vs math or programming?), then all you will get is an LLM that is better at acing chemistry reasoning benchmarks.


ACCount37 1 month ago

I'm saying that the kind of changes you propose aren't made by anyone, and might generally not be worth making. Because "better RLVR" is an easier and better pathway to actual cross-domain performance gains.

If you could stabilize the kind of mess you want to make, you could put that effort into better RL objectives and get more return.

HarHarVeryFunny 1 month ago
The mainstream LLM crowd aren't making these sorts of major changes yet, although some like DeepMind (the OG pushers of RL for AGI!) do acknowledge that a few more "transformer level" breakthroughs are necessary to reach what THEY are calling AGI, and others like LeCun are calling for more animal-like architectures.
Anyways, regardless of who is currently trying to move beyond LLMs or not, it should be pretty obvious what the problems are with trying to apply RL more generally, and what that would result in if successful, if that were the only change you made.
LLMs still have room to get better, but they will forever be LLMs, not brains, unless someone puts in the work to make that happen.
You started this thread talking about "AGI", without defining what you meant by that, and are now instead talking about "cross-domain performance gains". This is exactly why it makes no sense to talk about AGI without defining what you mean by it, since I think we are talking about completely different things.
- ACCount37 1 month ago
  
  The claim I make is that LLMs can be AGI complete with pretty much zero architectural work. And none of the "brain-like" architectures are actually much better at "being a brain" than LLMs are - the issue isn't "the architecture is wrong", it's "we don't know how to train for this kind of thing".
  "Fundamental limitations" aren't actually fundamental. If you want more learning than what "in-context" gives you? Teach the usual "CLI agent" LLM to make its own LoRAs and there goes that. So far, this isn't a bottleneck so pressing you'd want to resolve it by force.
  LeCun is a laughing stock nowadays; he didn't get kicked out of Meta for no reason.
  