Comment by lukev
1 day ago
Think about it this way: they are encoding whole "thoughts" or "ideas" as single tokens.
It's effectively a multimodal model, which handles "concept" tokens alongside "language" tokens and "image" tokens.
A really big conceptual step, actually, IMO.