Comment by acuozzo
4 hours ago
> Did you have a runnable WolframLanguage file so it can compare results?
Yes.
> Did you give it H100 / H200 access to compile and then iterate?
Yes, via Lambda.ai. Also, FWIW, I run Claude Code with `--dangerously-skip-permissions` and Codex with the equivalent flag.
> it does amazing kernel work (Codex-5.4)
Specifically with WGMMA + TMA?
---
Once TMA gets involved, both Claude and Codex spin endlessly until they dump TMA for a slower fallback.
I've observed this with Claude Code running Opus 4.6 at reasoning levels medium, high, and max, with "adaptive thinking" both enabled and disabled, and with thinking tokens maxed out.
I've also observed it with Codex running GPT-5.4 as well as GPT-5.3-Codex, at reasoning efforts from medium to xhigh.
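For readers who haven't hit this: "TMA gets involved" means the kernel must encode a `CUtensorMap` descriptor on the host and issue bulk tensor copies tracked by an mbarrier on the device, and that transaction bookkeeping is the step agents tend to get subtly wrong before retreating to plain `cp.async`. A minimal sketch of the pattern, per the CUDA 12.x programming guide; tile sizes and helper names here are illustrative, not from this thread:

```cuda
// Hedged sketch of a Hopper TMA 2D tile load (CUDA 12.x, sm_90a).
// TILE_M/TILE_N and make_tma_desc are illustrative names.
#include <cstdint>
#include <cuda.h>        // driver API: cuTensorMapEncodeTiled
#include <cuda/barrier>  // cuda::barrier + experimental TMA helpers

constexpr uint32_t TILE_M = 64, TILE_N = 64;
namespace cde = cuda::device::experimental;

// Host side: encode a TMA descriptor for a row-major float matrix.
CUtensorMap make_tma_desc(void* gmem, uint64_t rows, uint64_t cols) {
    CUtensorMap desc;
    uint64_t dims[2]    = {cols, rows};            // innermost dim first
    uint64_t strides[1] = {cols * sizeof(float)};  // byte strides, outer dims only
    uint32_t box[2]     = {TILE_N, TILE_M};        // shared-memory tile shape
    uint32_t elem[2]    = {1, 1};
    cuTensorMapEncodeTiled(&desc, CU_TENSOR_MAP_DATA_TYPE_FLOAT32,
                           /*rank=*/2, gmem, dims, strides, box, elem,
                           CU_TENSOR_MAP_INTERLEAVE_NONE,
                           CU_TENSOR_MAP_SWIZZLE_NONE,
                           CU_TENSOR_MAP_L2_PROMOTION_NONE,
                           CU_TENSOR_MAP_FLOAT_OOB_FILL_NONE);
    return desc;
}

// Device side: one thread issues the bulk copy; the mbarrier tracks
// completion by *byte count* -- the transaction accounting that is
// easy to get subtly wrong.
__global__ void tma_load(const __grid_constant__ CUtensorMap desc) {
    __shared__ alignas(128) float tile[TILE_M][TILE_N];
    #pragma nv_diag_suppress static_var_with_dynamic_init
    __shared__ cuda::barrier<cuda::thread_scope_block> bar;

    if (threadIdx.x == 0) init(&bar, blockDim.x);
    __syncthreads();

    cuda::barrier<cuda::thread_scope_block>::arrival_token tok;
    if (threadIdx.x == 0) {
        cde::cp_async_bulk_tensor_2d_global_to_shared(
            &tile, &desc,
            blockIdx.x * TILE_N, blockIdx.y * TILE_M, bar);
        // Arrive and register the expected transaction bytes.
        tok = cuda::device::barrier_arrive_tx(bar, 1, sizeof(tile));
    } else {
        tok = bar.arrive();
    }
    bar.wait(std::move(tok));
    // ... compute on tile[][], e.g. feed WGMMA fragments ...
}
```

The "slower fallback" the models settle on is typically per-thread `cp.async` staging, which skips the descriptor and transaction-byte accounting entirely at the cost of copy throughput.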
---
I've also observed this on the web, as mentioned in my OP, with GPT-5.4pro (Extended Pro), Gemini3-DeepThink, and Opus 4.6.
---
That is informative, thanks! Yes, I observe the same thing: the model tends to give up (as you said, "dump TMA for a slower fallback") and needs active steering to get good results. Still, it gets further than a one-shot from the chat interface and knows much more about profiling and kernel coding than those do.