
Comment by jkhdigital

6 years ago

Yes, this is why I said "basically". The fact that GPT-2 tokens are not necessarily prefix-free can be a problem for arithmetic coding, but I've found that "greedy" parsing almost never fails in practice.

So yes, there are ways to work around this, but the lack of a prefix-free vocabulary seems like the simplest explanation for why unusual words break the encoder.
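
A minimal sketch of the prefix-free issue, under loud assumptions: `VOCAB` is a made-up toy vocabulary, not the real GPT-2 one, and `greedy_parse` is a plain longest-match tokenizer standing in for "greedy" parsing (GPT-2 itself tokenizes with BPE merge rules, which this does not reproduce). The helper names `is_prefix_free` and `round_trips` are hypothetical. The point is only that when one token is a prefix of another, a token sequence chosen by the coder can re-parse to a different sequence on the decoding side.

```python
# Toy vocabulary (hypothetical, NOT the real GPT-2 vocabulary).
# "a" is a prefix of "ab" and "b" is a prefix of "bc", so it is not prefix-free.
VOCAB = ["a", "b", "ab", "c", "bc"]


def is_prefix_free(vocab):
    """Return True if no token is a proper prefix of another token."""
    return not any(s != t and t.startswith(s) for s in vocab for t in vocab)


def greedy_parse(text, vocab):
    """Tokenize by always taking the longest token that matches at the cursor."""
    tokens = []
    i = 0
    while i < len(text):
        candidates = [t for t in vocab if text.startswith(t, i)]
        if not candidates:
            raise ValueError(f"no token matches at position {i}")
        longest = max(candidates, key=len)
        tokens.append(longest)
        i += len(longest)
    return tokens


def round_trips(tokens, vocab):
    """Return True if joining the tokens and re-parsing recovers the same tokens."""
    return greedy_parse("".join(tokens), vocab) == list(tokens)


if __name__ == "__main__":
    print(is_prefix_free(VOCAB))            # the toy vocabulary is not prefix-free
    print(round_trips(["ab", "c"], VOCAB))  # the greedy choice survives re-parsing
    print(round_trips(["a", "bc"], VOCAB))  # a non-greedy choice does not
```

In this toy setup, a sequence such as `["a", "bc"]` joins to the same text as `["ab", "c"]`, so a decoder that re-parses greedily would recover the wrong tokens. If the encoder only ever emits sequences that greedy parsing would reproduce, the mismatch does not arise, which is consistent with greedy parsing rarely failing in practice.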
