Comment by alew1
1 year ago
If I show you a strawberry and ask how many r’s are in the name of this fruit, you can tell me, because one of the things you know about strawberries is how to spell their name.
Very large language models also “know” how to spell the word associated with the strawberry token, which you can test by asking them to spell the word one letter at a time. If you ask the model to spell the word and count the R’s while it goes, it can do the task. So the failure to do it when asked directly (how many r’s are in strawberry) is pointing to a real weakness in reasoning, where one forward pass of the transformer is not sufficient to retrieve the spelling and also count the R’s.
Sure, that's a different issue. If you prompt in a way to invoke chain of thought (e.g. what humans would do internally before answering) all of the models I just tested got it right.
That's not always true. They often fail the spelling part too.
> If I show you a strawberry and ask how many r’s are in the name of this fruit, you can tell me, because one of the things you know about strawberries is how to spell their name.
LOL. I would fail your test, because "fraise" only has one R, and you're expecting me to reply "3".