Comment by nullc

2 days ago

Publishing RL/SFT/self-distillation harnesses would be very impactful even without the data.

Tool use in particular can be taught with self-distillation without any data. Have a tool the model doesn't know? A teacher model RTFMs, reads the source code, and helps the student learn to get it right.
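To make that concrete, here's a rough sketch of what such a loop could look like. This is just one reading of the idea, not anyone's actual harness. The teacher sees the tool's docs in its context, its proposed calls are checked by actually running the tool, and only the calls that work become training pairs whose prompts omit the docs. Everything here is hypothetical: the `word_count` tool, the `stub_teacher` function (a stand-in for a real LLM call), and the JSON call format.

```python
import json
from typing import Callable, Dict, List


# Hypothetical tool that the student model has never seen.
def word_count(text: str, min_len: int = 1) -> int:
    """Count words in `text` that have at least `min_len` characters."""
    return sum(1 for w in text.split() if len(w) >= min_len)


TOOL_DOCS = word_count.__doc__
TOOLS: Dict[str, Callable] = {"word_count": word_count}


def stub_teacher(task: str, docs: str) -> str:
    """Stand-in for a teacher LLM that is prompted with `docs` (assumption).

    Returns a JSON tool call. One branch is deliberately wrong so the
    validation step below has something to reject.
    """
    if "broken" in task:
        # Wrong argument name: the tool will raise TypeError on this call.
        return json.dumps({"tool": "word_count", "args": {"txt": task}})
    return json.dumps({"tool": "word_count", "args": {"text": task, "min_len": 4}})


def call_is_valid(raw: str, tools: Dict[str, Callable]) -> bool:
    """Parse the proposed call and execute it; any failure means reject."""
    try:
        call = json.loads(raw)
        tools[call["tool"]](**call["args"])
        return True
    except (KeyError, TypeError, ValueError):
        return False


def build_sft_examples(
    tasks: List[str],
    teacher: Callable[[str, str], str],
    docs: str,
    tools: Dict[str, Callable],
) -> List[Dict[str, str]]:
    """Keep only teacher calls that run, paired with a docs-free prompt."""
    examples = []
    for task in tasks:
        raw = teacher(task, docs)  # teacher gets the docs in context
        if call_is_valid(raw, tools):
            # The student prompt omits the docs, so fine-tuning on these
            # pairs pushes the tool knowledge into the weights.
            examples.append({"prompt": task, "completion": raw})
    return examples


examples = build_sft_examples(
    ["count the words here", "broken example task"],
    stub_teacher,
    TOOL_DOCS,
    TOOLS,
)
```

The resulting `examples` list would then go to whatever SFT or RL trainer you use. The point is that no human-labeled data appears anywhere: the tool itself is the verifier.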
