Comment by danielcampos93

5 hours ago

I would love to know how the token counts increase across these models on the benchmarks. I find that the models keep getting better, but their token usage grows along with them. In other words, is the model actually doing better, or is it just reasoning for longer?
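
One way to eyeball this would be something like the sketch below, assuming each benchmark run logs per-example correctness and output token counts (the field names here are just illustrative, not from any particular harness):

```python
# Rough sketch: disentangle "better model" from "longer reasoning" by
# comparing accuracy alongside average output-token count per model.
# Assumes each record has hypothetical fields: model, correct, output_tokens.
from collections import defaultdict

results = [
    {"model": "model-a", "correct": True, "output_tokens": 850},
    {"model": "model-a", "correct": False, "output_tokens": 620},
    {"model": "model-b", "correct": True, "output_tokens": 2400},
    {"model": "model-b", "correct": True, "output_tokens": 1900},
]

per_model = defaultdict(list)
for r in results:
    per_model[r["model"]].append(r)

for model, rows in per_model.items():
    accuracy = sum(r["correct"] for r in rows) / len(rows)
    avg_tokens = sum(r["output_tokens"] for r in rows) / len(rows)
    # If accuracy rises mostly in lockstep with avg_tokens, the gain may
    # be longer reasoning rather than a genuinely stronger model.
    print(f"{model}: accuracy={accuracy:.2f}, avg_output_tokens={avg_tokens:.0f}")
```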

I think that is something that is always being worked on in parallel. The recent paradigm seems to be models learning to decide dynamically when they need to use more tokens (which seems very much in line with how computation should generally work).