Comment by irickt

4 months ago

HN as a huge RLHF data source for our behavior refinement. Yum!

(Reinforcement learning from human feedback)
