Comment by sceptic123
3 months ago
People are focussing on chess, which is complicated, but LLMs fail at even simple games like tic-tac-toe, where you'd think that if they were capable of "reasoning" they would be able to understand where they went wrong. That doesn't seem to be the case.
What an LLM can do is write and execute code to generate the correct output, but isn't that cheating?
Which SOTA LLM fails at tic-tac-toe?
I don't know, but it's not a hard test: get the LLM to play a perfect game of tic-tac-toe against itself, look at the output, and see if it goes wrong.
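For anyone who wants to run that test without eyeballing the output, here is a rough sketch of a checker. It rests on a few assumptions of mine, not anything stated above: the LLM's game has already been parsed into a list of cell indices 0-8 (row by row, X moving first), and "perfect" is taken to mean that no move ever worsens the minimax outcome for the player making it. The names check_game and value are made up for illustration.

```python
from functools import lru_cache

# All eight winning lines on a board indexed 0..8, row by row (assumed layout).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    scores = [value(board[:i] + player + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == "."]
    return max(scores) if player == "X" else min(scores)


def check_game(moves):
    """Check a self-play transcript; return (ok, message)."""
    board, player = "." * 9, "X"
    for n, m in enumerate(moves, 1):
        if winner(board) or "." not in board:
            return False, f"move {n}: game was already over"
        if not 0 <= m <= 8 or board[m] != ".":
            return False, f"move {n}: illegal move {m}"
        before = value(board, player)
        board = board[:m] + player + board[m + 1:]
        player = "O" if player == "X" else "X"
        # A perfect move leaves the game-theoretic value unchanged.
        if value(board, player) != before:
            return False, f"move {n}: blunder at cell {m}"
    if winner(board) is None and "." in board:
        return False, "game unfinished"
    return True, "perfect game"
```

Usage would be something like check_game([4, 0, 8, 2, 1, 7, 3, 5, 6]), which is a drawn game with no mistakes, versus check_game([4, 1]), where O answers the centre opening with an edge and gets flagged at move 2. Turning the LLM's free-text output into that move list is left out here, and it is probably where most of the fiddly work would be.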