Comment by simonw
4 hours ago
Have you tried telling it to run a script to verify that the YAML is valid? I imagine it could do that with Python.
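A minimal sketch of the kind of check I mean, assuming PyYAML is installed (`pip install pyyaml`). The `validate_yaml` name is made up for illustration, and this only checks that the text parses, not that it matches any particular schema:

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)


def validate_yaml(text):
    """Return (True, "") if text parses as YAML, else (False, error message).

    Hypothetical helper: it checks syntax only, not the document's structure.
    """
    try:
        # safe_load_all is lazy, so iterate to force every document to parse.
        for _ in yaml.safe_load_all(text):
            pass
    except yaml.YAMLError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, message = validate_yaml("name: example\nitems: [1, 2, 3]\n")
    print(ok, message)
```

The error string from a failed parse usually includes a line and column, which is the useful part to hand back to the model.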
It gets it wrong 100% of the time. A script to validate would send it into an infinite loop of generating code and failing validation.
Are you sure about that?
I don't think I've ever seen Opus 4.5 or GPT-5.2 get stuck in a loop like that. They're both very good at spotting when something doesn't work and trying something else instead.
Might be a problem with older, weaker models I guess.
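If an endless loop is the worry, the validation step can be capped from the outside. A rough sketch, assuming you control the harness: `generate_until_valid`, `generate`, and `validate` are hypothetical names, and `generate` stands in for whatever call produces the YAML, taking the previous error message as feedback.

```python
from typing import Callable, Optional, Tuple


def generate_until_valid(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Try up to max_attempts times; return the first valid output or None.

    Hypothetical harness: generate receives the last validation error
    (None on the first attempt) so it can try something different.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = generate(feedback)
        ok, message = validate(candidate)
        if ok:
            return candidate
        feedback = message  # pass the error into the next attempt
    return None  # give up instead of looping forever
```

Returning `None` after a fixed number of attempts means a model that keeps failing validation costs a bounded amount of time rather than spinning indefinitely.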
I’m limited in the tools and models I can use due to privacy restrictions at work.