Comment by staticman2

1 year ago

Well, splitting the word into letters doesn't help with Mistral Large 2. I tried this:

[Question: How many r's in strawberry? First split the word into letters, one letter on each line then answer the question.]

[Answer: Let's split the word "strawberry" into letters:

s t r a w b e r r y

Now, let's count the number of 'r's:

There are 2 'r's in the word "strawberry".]

[Question: Did you count the number of r's?]

[Answer: Yes, I did. Here's the count:

r r

So, there are 2 'r's in the word "strawberry".]

I suspect the issue is that these models have to be trained to accomplish specific tasks, and this is not a task it makes sense to train a model to perform. You might one day train it to handle this through function calling, for example by calling a Python library to count the r's in a string, but manually counting r's just doesn't play to an LLM's strengths.
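
For what it's worth, here is a rough sketch of what the function-calling route could look like. The tool name `count_letter`, the `TOOLS` registry, and the JSON shape of the tool call are all made up for illustration; they are not any particular vendor's API. The point is only that the counting happens in ordinary code and the model just has to emit the call.

```python
import json


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"count_letter": count_letter}


def dispatch(tool_call_json: str) -> int:
    """Run a tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])


# A tool call the model might emit for the strawberry question (assumed format).
example_call = (
    '{"name": "count_letter", '
    '"arguments": {"word": "strawberry", "letter": "r"}}'
)
result = dispatch(example_call)
print(result)  # 3
```

Here the answer comes out as 3, not the 2 the model gave above, because `str.count` works on the actual characters rather than on whatever tokens the model sees.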