Comment by jncfhnb
2 months ago
I don’t find the explanation about tokenization to be very compelling.
I don’t see any particular reason the LLM shouldn’t be able to extract the implications about spelling just because its tokens are “straw” and “berry”.
Frankly, I think that’s probably misleading. Ultimately the problem is that the LLM doesn’t do meta-analysis of the text itself. That problem would probably still exist in various forms even with character-level tokenization. Best case, it manages to go down a reasoning chain of explicit string analysis.
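(To make the tokenization point concrete, here is a minimal sketch of inspecting a subword split. It assumes the tiktoken library and its cl100k_base encoding, which are my own choices for illustration; the exact pieces depend on the model’s tokenizer, but the point is that the word comes through as subword chunks rather than individual letters.)

```python
# Minimal sketch: see how a subword tokenizer splits "strawberry".
# Assumes the tiktoken library and the cl100k_base encoding; the exact
# pieces vary by tokenizer, but they are subword chunks, not characters.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode("strawberry")
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)  # subword pieces, e.g. something like ['str', 'aw', 'berry']
```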