Comment by minimaxir
14 hours ago
Which is why, in the linked post, I test models against both the "r's in strawberries" and the "b's in blueberries" prompts to check whether that is the case.
tl;dr: the first case had near-perfect accuracy, as you would expect if the LLMs were indeed trained on it. The second case did not.