Comment by stingraycharles

3 months ago

That’s without reasoning I presume?

3 comments

stingraycharles

4.6 Opus with extended thinking just now: "At 50 meters, just walk. By the time you start the car, back out, and park again, you'd already be there on foot. Plus you'll need to leave the car with them anyway."

gf000 3 months ago

Not the parent poster, but I did get the wrong answer even with reasoning turned on.

tezza 3 months ago

Thank you all! We needed further data points.
Comparing one-shot results is a foolish way to evaluate a statistical process like LLM answers. We need multiple samples.
For https://generative-ai.review I do at least three samples of output. This often yields very different results, even from the same query.
E.g.: https://generative-ai.review/2025/11/gpt-image-1-mini-vs-gpt...
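
A rough sketch of what "multiple samples" could look like in practice. This is not the harness used for the site above. `ask` is a hypothetical stand-in for whatever call returns one model answer, and `fake_model` is a canned stub so the example runs without any API.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def sample_answers(ask: Callable[[str], str], prompt: str, n: int = 3) -> Counter:
    """Send the same prompt n times and tally the distinct answers."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(ask(prompt) for _ in range(n))


def agreement_rate(tally: Counter) -> float:
    """Fraction of samples that gave the most common answer."""
    total = sum(tally.values())
    return tally.most_common(1)[0][1] / total


# Hypothetical stub: cycles through canned answers instead of calling a model.
_canned = cycle(["walk", "drive", "walk"])


def fake_model(prompt: str) -> str:
    return next(_canned)


# Placeholder prompt, for illustration only.
tally = sample_answers(fake_model, "example prompt", n=3)
print(tally.most_common())
print(round(agreement_rate(tally), 2))
```

With a real model you would swap `fake_model` for an actual API call and probably normalize the answers (e.g. map each free-text reply to "walk" or "drive") before tallying. Three samples is still a small number, so treat the agreement rate as a rough signal rather than a measurement.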