Comment by CuriouslyC

8 hours ago

You should repeat this experiment but with progressively more detail in the initial prompt. Claude's secret sauce is taking weakly specified prompts and making passable things from them, but as the degrees of freedom in the prompt go down, Claude starts to disobey while other models close in on the intent.

That is a great suggestion that I am definitely going to look into, thanks!

  • Nice comparison, but perhaps a more informative one would be to keep the harness the same and use Claude Code for both models. In your comparison, the differences could be due to many harness design decisions.