Comment by zyklu5
2 months ago
Their explanation for why their idea (SSD) might work - precision-exploration conflict hypothesis - is something adaptive decoding also tries to solve.
https://ai.meta.com/research/publications/adaptive-decoding-...
2 months ago
Their explanation for why their idea (SSD) might work - precision-exploration conflict hypothesis - is something adaptive decoding also tries to solve.
https://ai.meta.com/research/publications/adaptive-decoding-...
I've been wondering about adaptive decoding! It seems obvious to me that at some points during decoding (reasoning, "creative thinking") you would want a higher temperature, while at other points (emitting syntactically correct code, following a plan that was already established) you would want lower temperature.
Seems similar to entropix?
https://github.com/xjdr-alt/entropix