Comment by datpuz

10 months ago

I think your prompt is bad. It's still impressive that Claude 3.7 handled it, but Qwen3 had no problem with this prompt:

Create a Python decorator that registers functions as handlers for MQTT topic patterns (including + and # wildcards). Internally, use a trie to store the topic patterns and match incoming topic strings to the correct handlers. Provide an example showing how to register multiple handlers and dispatch a message to the correct one based on an incoming topic.
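For reference, here is a minimal sketch of the kind of solution that prompt describes. It is not output from any of the models mentioned. The names `TopicRouter`, `on`, `match`, and `dispatch` are hypothetical, handlers are assumed to take `(topic, payload)`, and the sketch ignores the MQTT rule that wildcards do not match topics starting with `$`.

```python
from typing import Callable, Dict, List

Handler = Callable[[str, bytes], None]


class _Node:
    """One topic level in the trie."""

    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        self.handlers: List[Handler] = []


class TopicRouter:
    """Hypothetical router: stores MQTT topic patterns in a trie."""

    def __init__(self) -> None:
        self._root = _Node()

    def on(self, pattern: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler for a topic pattern."""
        levels = pattern.split("/")
        if "#" in levels[:-1]:
            raise ValueError("'#' is only valid as the last level")

        def decorator(func: Handler) -> Handler:
            node = self._root
            for level in levels:
                node = node.children.setdefault(level, _Node())
            node.handlers.append(func)
            return func

        return decorator

    def match(self, topic: str) -> List[Handler]:
        """Return every handler whose pattern matches a concrete topic."""
        levels = topic.split("/")
        if "+" in levels or "#" in levels:
            raise ValueError("published topics cannot contain wildcards")
        found: List[Handler] = []
        self._walk(self._root, levels, 0, found)
        return found

    def _walk(self, node: _Node, levels: List[str], i: int,
              found: List[Handler]) -> None:
        # '#' matches all remaining levels, including none at all.
        multi = node.children.get("#")
        if multi is not None:
            found.extend(multi.handlers)
        if i == len(levels):
            found.extend(node.handlers)
            return
        # Follow the exact level and the single-level wildcard '+'.
        for key in (levels[i], "+"):
            child = node.children.get(key)
            if child is not None:
                self._walk(child, levels, i + 1, found)

    def dispatch(self, topic: str, payload: bytes) -> int:
        """Call all matching handlers; return how many were called."""
        handlers = self.match(topic)
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


router = TopicRouter()
calls = []


@router.on("sensors/+/temperature")
def on_temperature(topic: str, payload: bytes) -> None:
    calls.append(("temperature", topic, payload))


@router.on("sensors/#")
def on_any_sensor(topic: str, payload: bytes) -> None:
    calls.append(("any_sensor", topic, payload))


@router.on("alerts/fire")
def on_fire(topic: str, payload: bytes) -> None:
    calls.append(("fire", topic, payload))


# Both the '+' pattern and the '#' pattern match this topic.
router.dispatch("sensors/kitchen/temperature", b"21.5")
```

Walking the trie level by level means a lookup only visits branches that can still match, rather than testing the topic against every registered pattern.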

2 comments

rcarmo 10 months ago

I purposefully used exactly the same prompt I gave Claude and Gemini, to see how the models dealt with ambiguity. Ambiguity shouldn't have degraded the chain of thought to the point where it starts looping.

101011 9 months ago

The trick shouldn't be to try to come up with a litmus test for agentic development. It's to change your workflow: game-plan solutions and decompose problems (like you would break a Jira epic into stories), and THEN have it build something for you.