Comment by vanviegen

4 months ago

> And yet... now many of them can do it.

Presumably because they trained them to death on this useless test that people somehow just wouldn't shut up about.

1 comment

vanviegen

Which is why, in the linked post, I test models on both "r's in strawberries" and "b's in blueberries", to see whether that is the case.

tl;dr: the first case had near-perfect accuracy, as you'd expect if the LLMs were indeed trained on it. The second case did not.
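
For anyone who wants to score model answers themselves, here is a minimal sketch of the ground-truth side of such a test. The function names (`count_letter`, `accuracy`) and the sample answers are made up for illustration and are not taken from the linked post.

```python
def count_letter(word: str, letter: str) -> int:
    """Case-insensitive count of how often `letter` occurs in `word`."""
    return word.lower().count(letter.lower())


def accuracy(answers: list[int], word: str, letter: str) -> float:
    """Fraction of model answers that match the true letter count."""
    truth = count_letter(word, letter)
    return sum(a == truth for a in answers) / len(answers)


# Ground truth for the two prompts discussed above.
print(count_letter("strawberries", "r"))
print(count_letter("blueberries", "b"))

# Hypothetical answers from four runs of some model (illustrative only).
print(accuracy([3, 3, 2, 3], "strawberries", "r"))
```

Comparing accuracy on the well-known prompt against a less-discussed one is what separates "trained on this exact question" from "can actually count letters".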