Comment by prng2021
18 days ago
This was a great article. The section “Training for the next state prediction” explains a solution using subagents. If I’m understanding it correctly, we could test whether that solution is directionally correct today, right? I ask an LLM a question. It comes up with a few potential responses but first sends those to other agents in a prompt with the minimum required context. Those subagents can even do this recursively a few times. Eventually the original agent collects and analyzes the subagents’ responses and responds to me.
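The scheme described above can be sketched as a small control-flow skeleton. This is only an illustration of the fan-out/recursion structure, not anyone's actual implementation: `call_llm`, `evaluate`, and `answer` are hypothetical names, and `call_llm` is a stub standing in for a real model API call.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query a model API here.
    return f"critique of: {prompt[:40]}"

def evaluate(candidate: str, context: str, depth: int, max_depth: int = 2) -> str:
    """Send a candidate answer to a subagent with minimal context.
    Subagents may recurse a bounded number of times, as described above."""
    critique = call_llm(f"{context}\nCandidate: {candidate}")
    if depth < max_depth:
        # Recurse: the subagent spawns its own subagent on the critique.
        critique = evaluate(critique, context, depth + 1, max_depth)
    return critique

def answer(question: str, n_candidates: int = 3) -> str:
    # The original agent drafts several candidate responses.
    candidates = [call_llm(f"Draft {i}: {question}") for i in range(n_candidates)]
    # Fan each candidate out to subagents and collect their critiques.
    critiques = [evaluate(c, context="minimal context", depth=0) for c in candidates]
    # "Collect and analyze": pick the candidate whose critique is shortest,
    # standing in for a real scoring/aggregation step.
    best = min(zip(candidates, critiques), key=lambda pair: len(pair[1]))
    return best[0]

print(answer("Is this scheme directionally testable today?"))
```

With a real model behind `call_llm`, the `max_depth` bound is what keeps the recursive delegation from fanning out indefinitely.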
Any attempt at world modeling using today's LLMs needs a goal function for the LLM to optimize. The LLM needs to build, evaluate, and update its model of the world. Personally, the main obstacle I've found is in updating the model: the data can be large, and I don't think LLMs are good at finding correlations.
Isn't that just RL with extra power-intensive steps? (An entire model chugging away in the goal function)
That's correct, but if successful you'd essentially have updated the LLM's knowledge and capabilities "on the fly".
And I think you basically just described the OpenAI approach to building models and serving them.