Comment by JamesSwift
3 hours ago
a9284923-141a-434a-bfbb-52de7329861d
d48d5a68-82cd-4988-b95c-c8c034003cd0
5c236e02-16ea-42b1-b935-3a6a768e3655
22e09356-08ce-4b2c-a8fd-596d818b1e8a
4cb894f7-c3ed-4b8d-86c6-0242200ea333
Amusingly (not really), this is me trying to resume sessions to pull the feedback IDs: it was an absolute chore to get it to give me the commands to resume these conversations, and it kept messing things up: cf764035-0a1d-4c3f-811d-d70e5b1feeef
Thanks for the feedback IDs — read all 5 transcripts.
On the model behavior: your sessions were sending effort=high on every request (confirmed in telemetry), so this isn't the effort default. The data points at adaptive thinking under-allocating reasoning on certain turns: the specific turns where it fabricated details (the Stripe API version, the git SHA suffix, the apt package list) emitted zero reasoning, while the turns with deep reasoning were correct. We're investigating with the model team. Interim workaround: setting CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 forces a fixed reasoning budget instead of letting the model decide per-turn.
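For anyone else hitting this, a minimal sketch of applying the workaround in a POSIX shell (the variable name comes from the comment above; how you launch Claude Code afterward is whatever you normally use):

```shell
# Force a fixed reasoning budget instead of per-turn adaptive allocation.
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1

# Sanity-check it's set before launching Claude Code from the same shell.
echo "$CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"   # prints 1
```

Note this only affects processes started from that shell; add the export to your shell profile if you want it to persist across sessions.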