Reinforcement Learning from Human Feedback (rlhfbook.com)
https://arxiv.org/abs/2504.12501
onurkanbkrc | 12 hours ago | 6 comments

dang 7 hours ago
Related. Others?
RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)

verdverm 10 hours ago
Last time I saw Nathan say something about the book, he's actively working on the next version and looking for feedback, check his socials

    leggerss 8 hours ago
    You could say he's also learning from human feedback

klelatti 11 hours ago
Web version with links, etc: https://rlhfbook.com/

    dang 7 hours ago
    Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.

iisweetheartii 11 hours ago
[dead]