Comment by hippo22
1 day ago
Couldn’t this be solved by replacing the tokenized input with a model that outputs the tokenization and then training the entire thing as one larger model? The goal would be to make tokenization a function of the model input.
No comments yet
Contribute on Hacker News ↗