Comment by watwut
3 days ago
That still sounds like a dumb strategy. Or, more likely, post hoc rationalization.
If you reward me for wasting tokens and punish me for not wasting them, I will maximally waste them and won't "explore how to make them useful". The latter wastes fewer tokens, and that is punished.