Comment by IanCal

6 months ago

Ish - it always depends how deep in the weeds you need to get. Tokenisation impacts performance, both speed and results, so details can be important.

5 comments


refulgentis 6 months ago

I maintain a llama.cpp wrapper that runs on everything from web to Android, and I can't quite wrap my mind around what extra info you'd get from individual token IDs from the API, beyond what you'd already get from wall-clock time and checking their vocab.

lqstuart 6 months ago
I don’t really see a need for token IDs alone, but you absolutely need per-token logprob vectors if you’re trying to do constrained decoding.
- refulgentis 6 months ago
  
  Interesting point. My first reaction was "why do you need logprobs? We use constrained decoding for tool calls and don't need them"... which is actually false! We do need them: we throw out the log probs of tokens that violate the constraints, then pick the token with the highest log prob among those that meet them.
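To make that concrete, here is a rough sketch of a single constrained greedy step in plain Python. It is not how llama.cpp or any particular API implements it; the function name `constrained_greedy_step`, the toy vocab, and the log prob values are all made up for illustration. It assumes you have the full per-token logprob vector and a set of token IDs the constraint allows at this step.

```python
def constrained_greedy_step(logprobs, allowed_ids):
    """Return the ID of the highest-logprob token that the constraint allows.

    logprobs: list of floats, one entry per vocab token (hypothetical input).
    allowed_ids: iterable of token IDs permitted at this step (hypothetical input).
    """
    # Keep only allowed IDs that actually exist in the vocab. Dropping the rest
    # is equivalent to masking their log probs to -inf.
    candidates = sorted(i for i in set(allowed_ids) if 0 <= i < len(logprobs))
    if not candidates:
        raise ValueError("constraint allows no tokens at this step")
    # Greedy choice among the survivors.
    return max(candidates, key=lambda i: logprobs[i])


# Toy example: the model prefers "hello", but the constraint only allows
# an opening brace or a quote at this position.
vocab = ["{", "}", "hello", '"', "42"]
logprobs = [-2.3, -4.0, -0.2, -1.9, -3.1]

best = constrained_greedy_step(logprobs, allowed_ids=[0, 3])
print(vocab[best])  # prints "
```

The point of the toy numbers: the unconstrained top token is "hello", so an API that only returns the sampled token (or its ID) tells you nothing about which of the allowed tokens the model ranks highest. You need the log probs for at least the allowed set.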
  
IanCal 6 months ago

Do we have the vocab? That's part of the point here. Does it take images? How are they tokenised?