Comment by Terr_
5 hours ago
The point isn't that they fail at the task.
The point is that if the model were really "reasoning", it would fail differently. Instead, what happens is consistent with it BSing on a textual level.