Comment by thomascountz
12 hours ago
Yes, you can use constrained decoding, e.g. logit masking: set the logits of all invalid tokens in the vocabulary to -inf, which effectively removes them from selection. I believe llama.cpp exposes this by accepting a formal grammar (its GBNF format).
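A rough sketch of the masking step in plain Python, to make the idea concrete. The toy vocabulary, the logit values, and the helper names (`mask_logits`, `softmax`) are made up for illustration and are not llama.cpp's API; it also assumes at least one token is allowed at each step.

```python
import math

def mask_logits(logits, allowed_ids):
    """Return a copy of logits with every disallowed token set to -inf."""
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax; exp(-inf) is 0.0, so masked tokens get probability 0."""
    m = max(logits)  # finite as long as at least one token is allowed
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-token vocabulary and raw model scores.
vocab = ["yes", "no", "maybe", "banana"]
logits = [1.0, 0.5, 2.0, 3.0]

# Suppose the constraint (e.g. a grammar state) only permits "yes" or "no" here.
allowed_ids = [0, 1]

probs = softmax(mask_logits(logits, allowed_ids))
best = vocab[max(range(len(probs)), key=probs.__getitem__)]
print(best, probs)
```

Even though "banana" had the highest raw score, it can never be picked, because its probability is exactly zero after masking. A grammar-based decoder would do the same thing, recomputing the allowed set after every generated token.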