Comment by KoolKat23
7 days ago
I'd say we're not far off.
Looking at the human side, it takes a while to actually learn something. If you've recently read something, it remains in your "context window". You need to dream about it, think about it, revisit it, and repeat until you actually learn it and "update your internal model". We need a mechanism for continuous weight updating.
Goal generation is pretty much covered by your body constantly drip-feeding your brain various hormones, acting as "ongoing input prompts".
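To make the "dream about it, revisit and repeat" idea concrete, here is a toy PyTorch sketch of a replay-and-consolidate loop; TinyLM, observe and consolidate are invented names, not any real system's API. Recent sequences sit in a buffer, and a periodic "sleep" pass runs a few gradient steps over them so they move from context into weights.

```python
# Toy sketch of replay-and-consolidate continual learning (illustrative only).
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, CTX = 256, 32

class TinyLM(nn.Module):
    """A deliberately tiny next-token model standing in for an LLM."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64)
        self.rnn = nn.GRU(64, 64, batch_first=True)
        self.out = nn.Linear(64, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = TinyLM()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
replay_buffer = []  # recent "experiences": token sequences seen in context

def observe(tokens):
    """Waking path: the sequence only lives in the context/buffer, no weight change."""
    replay_buffer.append(tokens)
    if len(replay_buffer) > 1000:
        replay_buffer.pop(0)

def consolidate(steps=10, batch_size=8):
    """The 'sleep' pass: a few gradient steps over replayed experience update the weights."""
    for _ in range(steps):
        sample = random.sample(replay_buffer, min(batch_size, len(replay_buffer)))
        batch = torch.stack(sample)
        logits = model(batch[:, :-1])
        loss = F.cross_entropy(logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

# Stream data during the "day", consolidate during the "night".
for _ in range(100):
    observe(torch.randint(0, VOCAB, (CTX,)))
consolidate()
```

The hard part at real scale is doing this continuously without catastrophic forgetting and without blowing the compute budget, which is what the pushback later in the thread is about.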
> I'd say we're not far off.
How are we not far off? How can LLMs generate goals and based on what?
You just train it on the goal. Then it has that goal.
Alternatively, you can train it on following a goal, and then you have a system where you can specify a goal.
At sufficient scale, a model will already contain goal-following algorithms, because those help predict the next token when the model is base-trained on goal-following entities, i.e. humans. Goal-driven RL then brings those algorithms to prominence.
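"Train it on following a goal, then specify a goal" is essentially goal conditioning: the goal is just another part of the input during fine-tuning, so at inference you can swap in a goal the model never saw. A minimal sketch with Hugging Face Transformers and GPT-2; the GOAL:/TEXT: format and the two examples are invented placeholders, not anyone's actual recipe.

```python
# Sketch of goal-conditioned fine-tuning: the goal is part of the input,
# so "specifying a goal" at inference time means swapping the prefix.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Each (placeholder) example pairs a goal with behaviour that satisfies it.
examples = [
    ("GOAL: summarize", "TEXT: ...long article... SUMMARY: ...short version..."),
    ("GOAL: translate to French", "TEXT: hello OUTPUT: bonjour"),
]

model.train()
for goal, behaviour in examples:
    ids = tok(goal + "\n" + behaviour, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # standard causal-LM loss over goal + behaviour
    opt.zero_grad()
    loss.backward()
    opt.step()

# Inference: same weights, new goal string, new behaviour.
model.eval()
prompt = tok("GOAL: translate to German\nTEXT: hello OUTPUT:", return_tensors="pt").input_ids
print(tok.decode(model.generate(prompt, max_new_tokens=10)[0]))
```

Goal-driven RL swaps the fixed target text for a reward on goal-consistent outputs, but the goal still enters the same way, as conditioning input.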
Random goal use is proving to be more important than training. That said, last year someone trained on the fly during the competition, which is pretty awesome when you think about it.
How do you figure goal generation and supervised goal training are interchangeable?
Minimize prediction errors.
But are we close to doing that in real-time on any reasonably large model? I don’t think so.
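Some rough numbers behind that doubt: with the usual mixed-precision Adam setup (roughly 16 bytes of weight, gradient, and optimizer state per parameter, versus about 2 bytes per parameter just to serve the model in fp16), the state you need to hold to update a model dwarfs what you need to run it.

```python
# Back-of-envelope memory cost of updating vs serving a model.
# Assumes fp16 weights (2 B) + fp16 grads (2 B) + fp32 master weights (4 B)
# + two fp32 Adam moments (8 B) = ~16 B per parameter to train,
# versus ~2 B per parameter to serve in fp16. Activations come on top of this.
GiB = 1024 ** 3

for params in (7e9, 70e9):
    serve_mem = params * 2 / GiB
    train_mem = params * 16 / GiB
    print(f"{params / 1e9:.0f}B params: ~{serve_mem:.0f} GiB to serve, "
          f"~{train_mem:.0f} GiB of weight/optimizer state to train")

# 7B params: ~13 GiB to serve, ~104 GiB of weight/optimizer state to train
# 70B params: ~130 GiB to serve, ~1043 GiB of weight/optimizer state to train
```

And that is before counting activations for each update step, or the engineering needed to run those updates while also serving traffic.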
Yes, you're right, that's what we're doing.
https://github.com/dmf-archive/PILF
Very interesting, thanks for the link.