Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library
← Back to context

Comment by squiggleblaz

9 months ago

Reinforcement learning, maximise rewards? They work because rabbits like carrots. What does an LLM want? Haven't we already committed the fundamental error when we're saying we're using reinforcement learning and they want rewards?

0 comments

squiggleblaz

Reply

No comments yet

Contribute on Hacker News ↗

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities