Comment by pugworthy
2 days ago
To me, this doesn't show the weakness of current models, it shows the variability of prompts and the influence on responses. Because without the prompt it's hard to tell what influenced the outcome.
I had this long discussion today with a co-worker about the merits of detailed queries with lots of guidance .md documents, vs just asking fairly open ended questions. Spelling out in great detail what you want, vs just generally describing what you want the outcomes to be in general then working from there.
His approach was to write a lot of agent files spelling out all kinds of things like code formatting style, well defined personas, etc. And here's me asking vague questions like, "I'm thinking of splitting off parts of this code base into a separate service, what do you think in general? Are there parts that might benefit from this?"
It is definitely a weakness of current models. The fact that people find ways around those weaknesses does not mean the weaknesses do not exist.
Your approach is also very similar to spec driven development. Your spec is just a conversation instead of a planning document. Both approaches get ideas from your brain into the context window.
So which approach worked better?
Challenging to answer, because we're at different levels of programming. I'm Senior / Architect type with many years of experience programming, and he's an ME using code to help him with data processing and analysis.
I have a hunch if you asked which approach we took based on background, you'd think I was the one using the detailed prompt approach and him the vague.