Comment by CjHuber
16 days ago
It depends on the API path. Chat Completions does what you describe, but isn't it legacy?
I've only used Codex with the Responses v1 API, and there it's the complete opposite: already-generated reasoning tokens even persist when you send another message (without rolling back) after cancelling a turn before it has finished its thought process.
Also, with Responses v1, xhigh mode eats through the context window several times faster than the other modes, which checks out with this.
That’s what I used to think, before chatting with the OAI team.
The docs are a bit misleading/opaque, but essentially reasoning persists across multiple sequential assistant turns and is discarded on the next user turn [0].
The diagram on that page makes it pretty clear, as does the section on caching.
[0]https://cookbook.openai.com/examples/responses_api/reasoning...
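For what it's worth, the two paths differ in endpoint and in how turns are chained. A minimal sketch of the request shapes (payloads only, no live call; `gpt-5` is just a placeholder model name, and field names follow the public OpenAI HTTP API):

```python
def chat_completions_request(messages):
    # Legacy path: POST /v1/chat/completions. The client resends the full
    # message history each turn; reasoning tokens are not carried over
    # between requests.
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "body": {"model": "gpt-5", "messages": messages},
    }

def responses_request(user_input, previous_response_id=None):
    # Responses path: POST /v1/responses. Chaining via previous_response_id
    # lets the server reuse stored state (including reasoning items) across
    # sequential assistant turns, per the cookbook page linked above.
    body = {
        "model": "gpt-5",
        "input": user_input,
        "reasoning": {"effort": "high"},
    }
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    return {"url": "https://api.openai.com/v1/responses", "body": body}
```

So "which path you are using" comes down to which endpoint (or SDK method, `chat.completions.create` vs `responses.create`) your tooling calls.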
How do you know/toggle which API path you are using?