Comment by ipunchghosts

8 days ago

Yes. Happy to chat if you message me. Using RL coupled with NNs to integrate search directly into inference, instead of as an afterthought like chain of thought and test-time training.