Comment by AlfeG

5 months ago

Ha, recently my daughter came to me with a 3rd-grade math problem: "Without rearranging the digits 1 2 3 4 5, insert mathematical operation signs and, if necessary, parentheses between them so that the resulting expression equals 40 and 80. The key is that you can combine digits (like 12+3/45) but you cannot change their order from the original sequence 1,2,3,4,5"

Grok3, Claude, Deepseek, and Qwen all failed to solve this problem, producing some very, very wrong solutions. While Grok3 admitted that it had failed and did not provide an answer, all the other AIs gave plainly wrong answers, like `12 * 5 = 80`.

ChatGPT was able to solve for 40, but not for 80. YandexGPT solved both correctly (maybe it was trained on the same math books).

Just checked Grok3 a few more times. It was able to solve for 80 correctly.

Geez. Who teaches this 3rd-grade class, Prof. Xavier?

Interestingly, the R1 1.58-bit dynamic quant model was able to sort of solve it. I changed the problem statement a bit to request only the solution for 40 and to tell it what operations it can use, both needed to keep from blowing out the limited context available on my machine (128GB RAM + 24GB GPU).

Took almost 3 hours and it wigged out a bit at the end, rambling about Lisp in Chinese, but it got an almost-valid answer: 1 * (2 + 3) * (4 + 5) - 5 (https://pastebin.com/ggL85RWJ). That does evaluate to 40, but it uses the 5 twice. I didn't think it would get that far.

Neither Claude Sonnet 3.5 nor 3.7 could solve this correctly unless you add to the prompt “Prove it with the js analysis tool, please use an efficient combinatorial algorithm to find the solution”… and I had to correct 3.7 because it was not following the instructions as 3.5 did.
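For anyone who wants to check answers by brute force rather than by prompting, here is a minimal Python sketch of that kind of combinatorial search. It is not what any of the models ran. The function names (`splits`, `reachable`, `solve`) are made up for illustration, and it assumes only the four basic operations, digit concatenation, and parentheses are allowed (no unary minus, powers, or factorials).

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

DIGITS = "12345"

def splits(digits):
    """Yield every way to cut the digit string into consecutive numbers."""
    for gaps in product([False, True], repeat=len(digits) - 1):
        parts, current = [], digits[0]
        for ch, cut in zip(digits[1:], gaps):
            if cut:
                parts.append(current)   # close the current number
                current = ch
            else:
                current += ch           # concatenate the digit
        parts.append(current)
        yield tuple(parts)

@lru_cache(maxsize=None)
def reachable(parts):
    """Map each value reachable from `parts` (order kept) to one expression."""
    if len(parts) == 1:
        return {Fraction(int(parts[0])): parts[0]}
    found = {}
    # Pick the position of the last operation applied; this covers
    # every possible parenthesization of the sequence.
    for i in range(1, len(parts)):
        left, right = reachable(parts[:i]), reachable(parts[i:])
        for a, ea in left.items():
            for b, eb in right.items():
                found.setdefault(a + b, f"({ea} + {eb})")
                found.setdefault(a - b, f"({ea} - {eb})")
                found.setdefault(a * b, f"({ea} * {eb})")
                if b != 0:  # skip division by zero
                    found.setdefault(a / b, f"({ea} / {eb})")
    return found

def solve(target, digits=DIGITS):
    """Return one expression per digit grouping that hits the target."""
    target = Fraction(target)
    hits = []
    for parts in splits(digits):
        expr = reachable(parts).get(target)
        if expr is not None:
            hits.append(expr)
    return hits

if __name__ == "__main__":
    for t in (40, 80):
        print(t, solve(t)[:3])
```

Exact `Fraction` arithmetic avoids floating-point near-misses when an intermediate division is not a whole number. The search space for five digits is small enough that this runs almost instantly, so the memoization is a convenience rather than a necessity.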

o3-mini-high solves this correctly:

```

We can “stick‐to the order” of the digits and allow concatenation. For example, one acceptable answer is

  40:  1 – 2 × 3 + 45    because 1 – (2×3) + 45 = 1 – 6 + 45 = 40

and another is

  80:  12 ÷ 3 × 4 × 5    because 12÷3 = 4, then 4×4×5 = 16×5 = 80

In both cases the digits 1,2,3,4,5 appear in order without rearrangement.

```

However, it took 8 minutes to produce that.

Is this what they expect 3rd graders to solve in math? Seems pretty hard for that age.

  • They had lessons beforehand on order of operations and some similar problems. They were able to solve for 80, but got stuck on 40 and asked me.