Comment by tverbeure

1 month ago

I don't know, but model names such as "kimi-k2-thinking" in the test set might offer a clue.

Yes, there are some exceptions where the name clearly states that a thinking model was chosen, as with kimi, but there is no such indicator for OpenAI's GPT family or the other major models.