Comment by jrahmy

20 hours ago

A tokenizer is a deterministic string-matching program; it's not made out of weights in the same sense as the neural network itself.

But it could be. It's just less efficient.

  • I don't see how. You could ask a neural network to do the tokenization, I suppose, but to feed it the prompt you'd first have to convert that prompt into tokens via the same deterministic process the network was trained on, which just moves the exact same process up one layer.
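
For anyone following along, here is a rough sketch of what "deterministic string-matching program" means in practice. The vocabulary and the `tokenize` function are made up for illustration, and greedy longest-match is a simplification: production tokenizers such as BPE apply learned merge rules instead, but they are just as deterministic, with no weights involved.

```python
# Toy vocabulary, hypothetical and tiny; real vocabularies have tens of
# thousands of entries learned from a corpus.
VOCAB = {"t": 0, "o": 1, "k": 2, "e": 3, "n": 4, "s": 8,
         "to": 5, "ken": 6, "token": 7}

MAX_PIECE_LEN = max(len(piece) for piece in VOCAB)


def tokenize(text: str) -> list[int]:
    """Greedy longest-match: at each position take the longest vocab entry that fits."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until something matches.
        for length in range(min(MAX_PIECE_LEN, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i += length
                break
        else:
            raise ValueError(f"no vocabulary entry matches {text[i]!r} at position {i}")
    return ids


print(tokenize("tokens"))  # same input always yields the same ids
```

The same string always maps to the same ids because the whole thing is a table lookup plus a fixed matching rule. A network could be trained to imitate that mapping, but its own input would still have to be encoded by some fixed rule like this one first.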