Comment by red75prime
1 month ago
> through business impacts which so far are very far outside of the scope of what an AI-assisted coding tool can comprehend.
That is, the problems are a) how to generate a training signal without formally verifiable results, b) hierarchical planning, c) credit assignment in a hierarchical planning system. Those problems are being worked on.
There are some preliminary research results that suggest that RL induces hierarchical reasoning in LLMs.
No comments yet
Contribute on Hacker News ↗