Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library
← Back to context

Comment by MattRix

7 hours ago

I think this is because when you shrink it down, the model ends up space constrained and each “neuron” ends up having to do multiple duties. It can stil be tuned to perform well at specific tasks, but no longer generalizes as well. It’s somewhat unintuitive but models that are larger are often simpler than smaller ones for this same reason.

0 comments

MattRix

Reply

No comments yet

Contribute on Hacker News ↗

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities