Comment by joelthelion
10 hours ago
> is just the next most likely token, taking into account temperature and what not.
This doesn't mean anything. All LLM output is like that.
That said, I agree that LLMs are terrible at grading stuff, except perhaps if you give them a very detailed evaluation grid (i.e., a grading rubric).