Predicting When RL Training Breaks Chain-of-Thought Monitorability

5 hours ago (lesswrong.com)

0 comments

gmays
