Comment by red75prime

1 month ago

> through business impacts which so far are very far outside of the scope of what an AI-assisted coding tool can comprehend.

That is, the problems are (a) generating a training signal without formally verifiable results, (b) hierarchical planning, and (c) credit assignment in a hierarchical planning system. Those problems are being worked on.

Some preliminary research results suggest that RL induces hierarchical reasoning in LLMs.