Comment by topaz0
1 year ago
Still an interesting direction of questioning. Maybe it could be rephrased as "how much work is the grammar doing?" Are the results with the grammar very different from those without? If/when a grammar is not used (as in the OpenAI case), how many illegal moves does the model try on average before finding a legal one?
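Measuring that would be pretty easy. A rough sketch with python-chess, where sample_move(board) is a hypothetical stand-in for whatever text the unconstrained model returns:

```python
# Sketch only: sample_move(board) is a stand-in for asking the unconstrained
# model for a move as plain text (e.g. UCI such as "e2e4").
import chess

def count_illegal_attempts(board: chess.Board, sample_move, max_tries: int = 50):
    """Count how many illegal or unparseable moves come back before a legal one."""
    for attempt in range(max_tries):
        text = sample_move(board).strip()
        try:
            move = chess.Move.from_uci(text)
        except ValueError:
            continue                      # not even well-formed UCI
        if move in board.legal_moves:
            return attempt                # failed tries before the first legal move
    return None                           # never produced a legal move
```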
A grammar is really just a special case of the more general issue of how to pick a single token given the probabilities that the model spits out for every possible one. In that sense, filters like temperature / top_p / top_k are already hacks that "do the work" (since always taking the most likely predicted token does not give good results in practice), and grammars are just a more complicated way to make such decisions.
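To make that concrete, here's a rough sketch (names like pick_token and grammar_allowed are mine, not from any particular library) showing that temperature, a grammar mask, and top_k are all just transformations of the same logit vector before one token is drawn:

```python
import numpy as np

def pick_token(logits: np.ndarray, grammar_allowed: set[int] | None = None,
               temperature: float = 0.8, top_k: int = 40) -> int:
    """Temperature, the grammar mask, and top_k all reshape the same
    distribution before a single token is sampled from it."""
    scores = logits / temperature                    # temperature: sharpen/flatten
    if grammar_allowed is not None:                  # grammar: hard mask on disallowed tokens
        mask = np.full_like(scores, -np.inf)
        mask[list(grammar_allowed)] = 0.0
        scores = scores + mask
    kth = np.sort(scores)[-min(top_k, len(scores))]  # top_k: keep only the k best
    scores = np.where(scores >= kth, scores, -np.inf)
    probs = np.exp(scores - scores.max())            # softmax over what survived
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```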
I'd be more interested in what the distribution of grammar-restricted predictions looks like compared to moves Stockfish says are good.
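That comparison is also easy to sketch with python-chess and a local Stockfish binary; model_move_probs(board) is a hypothetical function returning the grammar-restricted distribution over legal moves:

```python
import chess
import chess.engine

def compare_to_stockfish(board, model_move_probs, engine_path="stockfish", depth=12):
    """Print the model's moves (most probable first) next to Stockfish's ranking."""
    legal = list(board.legal_moves)
    with chess.engine.SimpleEngine.popen_uci(engine_path) as engine:
        infos = engine.analyse(board, chess.engine.Limit(depth=depth), multipv=len(legal))
    sf_rank = {info["pv"][0]: i for i, info in enumerate(infos)}   # 0 = Stockfish's best
    probs = model_move_probs(board)                                # {chess.Move: probability}
    for move, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{board.san(move):>7}  model_p={p:.3f}  stockfish_rank={sf_rank.get(move)}")
```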
An LLM would complain that its internal model does not reflect its current input/output. Since an LLM knows that people knock things off, run tests, run afoul of constraints, and make mistakes, it would raise that as a possibility and likely inquire.
This isn't prompt engineering; it's grammar-constrained decoding. The model literally cannot respond with anything but tokens that satisfy the grammar.