Comment by cold_harbor
4 days ago
LoRA won't fix the tokenization problem. Norwegian on a typical English-heavy BPE vocab uses 1.5-2x more tokens per word — that compounds into real inference cost, not just quality
4 days ago
LoRA won't fix the tokenization problem. Norwegian on a typical English-heavy BPE vocab uses 1.5-2x more tokens per word — that compounds into real inference cost, not just quality
No comments yet
Contribute on Hacker News ↗