Comment by naasking
3 months ago
Using pixels is still tokenizing. What's needed is something more like the "Byte Latent Transformer", which has dynamically sized patches based on information content rather than tokens.
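To make "dynamically sized patches based on information content" concrete, here is a rough sketch of entropy-based patching: start a new patch wherever the next byte is hard to predict. As I understand it, the Byte Latent Transformer gets these entropies from a small learned byte-level language model; the bigram counts below are only a toy stand-in for that, and the names `next_byte_entropies`, `entropy_patches`, and the `threshold` value are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def next_byte_entropies(data: bytes) -> list[float]:
    """Entropy in bits of the next-byte distribution at each position.

    Assumption: a bigram model counted from `data` itself stands in for
    the small byte-level language model a real system would train.
    """
    follow = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        follow[prev][cur] += 1

    def entropy(counter: Counter) -> float:
        total = sum(counter.values())
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    # Position 0 has no context, so treat it as maximally uncertain (8 bits).
    ents = [8.0]
    for prev in data[:-1]:
        ents.append(entropy(follow[prev]))
    return ents


def entropy_patches(data: bytes, threshold: float = 1.0) -> list[bytes]:
    """Split `data` into patches, cutting before any byte whose predicted
    entropy exceeds `threshold` (a hypothetical, hand-picked value)."""
    if not data:
        return []
    ents = next_byte_entropies(data)
    patches = []
    start = 0
    for i in range(1, len(data)):
        if ents[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches


if __name__ == "__main__":
    text = b"the cat sat on the mat. the cat sat on the hat."
    for patch in entropy_patches(text):
        print(patch)
```

Predictable stretches end up in long patches and surprising bytes open new ones, so the patch length follows the data instead of a fixed vocabulary or a fixed pixel grid.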