Comment by simianwords

2 days ago

I don't buy this, for the simple reason that benchmarks show much better performance from thinking models than from non-thinking models. Benchmarks already account for generalisation and the "unseen patterns" aspect.

What would be your argument against:

1. CoT models performing way better on benchmarks than normal models

2. People choosing to use CoT models in day-to-day life because they actually find that they give better results