Comment by prng2021
18 days ago
This was a great article. The section “Training for the next state prediction” explains a solution using subagents. If I’m understanding it correctly, we could test whether that solution is directionally correct today, right? I ask an LLM a question. It comes up with a few potential responses but first sends those to other agents in a prompt with the minimum required context. Those subagents can even do this recursively a few times. Eventually the original agent collects and analyzes the subagents’ responses and responds to me.
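The scheme described above can be sketched as a small control-flow skeleton. This is only an illustration of the fan-out/recursion structure, not anyone's actual implementation: `call_llm`, `evaluate`, and `answer` are hypothetical names, and `call_llm` is a stub standing in for a real model API call.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query a model API here.
    return f"critique of: {prompt[:40]}"

def evaluate(candidate: str, context: str, depth: int, max_depth: int = 2) -> str:
    """Send a candidate answer to a subagent with minimal context.
    Subagents may recurse a bounded number of times, as described above."""
    critique = call_llm(f"{context}\nCandidate: {candidate}")
    if depth < max_depth:
        # Recurse: the subagent spawns its own subagent on the critique.
        critique = evaluate(critique, context, depth + 1, max_depth)
    return critique

def answer(question: str, n_candidates: int = 3) -> str:
    # The original agent drafts several candidate responses.
    candidates = [call_llm(f"Draft {i}: {question}") for i in range(n_candidates)]
    # Fan each candidate out to subagents and collect their critiques.
    critiques = [evaluate(c, context="minimal context", depth=0) for c in candidates]
    # "Collect and analyze": pick the candidate whose critique is shortest,
    # standing in for a real scoring/aggregation step.
    best = min(zip(candidates, critiques), key=lambda pair: len(pair[1]))
    return best[0]

print(answer("Is this scheme directionally testable today?"))
```

With a real model behind `call_llm`, the `max_depth` bound is what keeps the recursive delegation from fanning out indefinitely.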
Any attempt at world modeling using today's LLMs needs a goal function for the LLM to optimize. The LLM needs to build, evaluate, and update its model of the world. Personally, the main obstacle I've found is in updating the model: the data can be large, and I don't think LLMs are good at finding correlations.
Isn't that just RL with extra power-intensive steps? (An entire model chugging away in the goal function)
That's correct, but if successful you'd essentially have updated the LLM's knowledge and capabilities "on the fly".
And I think you basically just described the OpenAI approach to building models and serving them.