Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library
← Back to context

Comment by throwup238

5 months ago

That’s kind of what (R/C)NNs did before the Attention is all you need paper introduced the attention mechanism. One of the breakthroughs that enabled GPT is giving each token equal “weight” through cross attention instead of letting them get attenuated in some sort of summarization mechanism.

0 comments

throwup238

Reply

No comments yet

Contribute on Hacker News ↗

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities