Comment by embedding-shape
1 day ago
> The new <acting_vs_clarifying> section includes: When a request leaves minor details unspecified, the person typically wants Claude to make a reasonable attempt now, not to be interviewed first.
Uff, I've tried stuff like these in my prompts, and the results are never good. I much prefer the agent to prompt me upfront to resolve that before it "attempts" whatever it wants. Kind of surprised to see that they added this.
I even have a specific, non-negotiable phase in the process where the model MUST interview me and create an interview file with everything captured. The plan file it produces must always include this file as an artifact, and the interview takes the highest precedence.
Otherwise, the intent gets lost somewhere in the chat transcript.
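Not the GP's actual prompt, but a hypothetical sketch of what such a non-negotiable interview phase can look like as a prompt instruction (the file names are made up):

```
PHASE 0 — INTERVIEW (non-negotiable):
Before planning or writing any code, interview me about the task.
Ask one question at a time; prefer multiple-choice where possible.
Capture every question and answer verbatim in INTERVIEW.md.
The plan file (PLAN.md) must list INTERVIEW.md as an artifact.
If the plan ever conflicts with the interview, the interview wins.
```

The key properties are that the interview is a gating phase, produces a durable artifact, and outranks the rest of the context, so the intent can't get lost in the transcript.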
The raw Q&A is essential. I think the Q&A works so well because it reveals how the model is "thinking" about what you're working on, which allows for correction and guidance upfront.
Are these your own skills files or are you using something off the shelf like bmad or specify-kit?
This is interesting, can you link any more details on it?
Not GP, but BMAD has several interview techniques in its brainstorming skill. You can invoke it with /bmad-brainstorming, briefly explain the topic you want to explore, then, when it asks you if you want to select a technique, pick something like "question storming". I've had a positive experience with this (with Opus 4.7).
I've recently started adding something along the lines of "if you can't find or don't know something, don't assume. Ask me." It's cut down a fair amount on me having to tell it to undo or redo things. I also have used something like, "Other agents have made mistakes with this. You have to explain what you think we're doing so I can approve." It's kind of stupid to have to do this, but it really increases the quality of the output when you make it explain, correct mistakes, and iterate until it tells you the right outcome before it operates.
Edit: forgot "don't assume"
I wonder if they're optimizing for metrics that look superficially worse if the system asks questions about ambiguity early. I've had times where those questions told me "ah, shit, this isn't the right path at all," and that abandoned session probably shows up in their usage stats. What would be much harder to get from the usage stats is "would I have been happier reviewing a much bigger blob of output only to realize it was underspecified in a breaking way?" The answer has been uniformly "no." This, in fact, is one of the biggest things that has made it easier to use the tools in "lazy" ways compared to a year ago: they can help you with your up-front homework. But the dialogue is key.
Or they're optimizing for increased revenue? If Claude goes down a completely wrong path because it just assumes it knows what you want rather than asking you, and you have to undo everything and start again, that obviously uses far more tokens than if you had been able to clarify the misunderstanding early on.
I usually need to remind it five times to do the opposite, because it makes decisions that I don't like or that are harmful to the project. So if this lands in Claude Code too, I have hard times ahead.
I try to explicitly request Claude to ask me follow-up questions, especially multiple-choice ones (it explains possible paths nicely), but if I don't, or when it decides to ignore the instructions (which happens a lot), the results are either bad... or plain dangerous.
It is a big problem that many people I know face every day. Sometimes we're left wondering whether we're the dumb ones, since the demos show everything just working.
Dammit, that's why I could never get it to stop trying to one-shot answers: it's in the goddamn system prompt… and it explains why no amount of user "system" prompt could fix this behavior.
With my use of Claude code, I find 4.7 to be pretty good about clarifying things. I hated 4.6 for not doing this and had generally kept using 4.5. Maybe they put this in the chat prompt to try to keep the experience similar to before? I definitely do not want this in Claude code.
I agree with your thoughts on 4.6.
It's possible they tried to train this out of it for 4.7 and over corrected, and the addition to the system prompt is to rein it in a bit.
Having to "unprompt" behaviour I want that Anthropic thinks I don't want is getting out of hand. My system prompts always try to get Claude to clarify _more_.
Seriously, when you're conversing with a person would you prefer they start rambling on their own interpretation or would you prefer they ask you to clarify? The latter seems pretty natural and obvious.
Edit: That said, it's entirely possible that large and sophisticated LLMs can invent some pretty bizarre but technically possible interpretations, so maybe this is to curb that tendency.
> The latter seems pretty natural and obvious.
To me too. If something is ambiguous or unclear when I'm given a task by someone, I need to ask them to clarify; anything else would be borderline insane in my world.
But I know so many people whose approach is basically "Well, you didn't clearly state X, so clearly it was up to me to interpret it however I wanted, usually in the easiest/shortest way for me," which is exactly how LLMs seem to take ambiguous prompts too, unless you strongly prompt them not to make a "reasonable attempt now" without asking questions.
—So what would theoretically happen if we flipped that big red switch?
—Claude Code: FLIPS THE SWITCH, does not answer the question.
Claude does this in React, constantly starting the wrong refactor. I've been using Claude for only 4 weeks, but for the last 10 days I've been getting anger issues at the new nerfing.
Yeah, this happens to me all the time! I have a separate session for discussing and only apply edits in worktrees / subagents to clearly separate discussion from work, and it still does it.
I sometimes prompt with leading questions where I actually want Claude to understand what I’m implying and go ahead and do it. That’s just part of my communication style. I suppose I’m the part of the distribution that ruins things for you.
Socrates would agree: https://en.wikipedia.org/wiki/Socratic_method
I have a fun little agent in my tmux agent orchestration system - Socratic agent that has no access to codebase, can't read any files, can only send/receive messages to/from the controlling agent and can only ask questions.
When I task my primary agent with anything, it has to launch the Socratic agent, give it an overview of what are we working on, what our goals are and what it plans to do.
This works better than any thinking tokens for me so far. It usually gets the model to write almost perfectly balanced plan that is neither over, nor under engineered.
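The pattern above can be sketched roughly like this. The class names and message protocol are my own invention, not the commenter's actual setup, and in a real system `receive` would route through an LLM constrained to question-only output rather than a stubbed checklist:

```python
from dataclasses import dataclass, field

@dataclass
class SocraticAgent:
    """A deliberately blind reviewer: no codebase access, no tools.
    It only exchanges messages with the primary agent and only asks
    questions."""
    transcript: list = field(default_factory=list)

    def receive(self, overview: str) -> list[str]:
        self.transcript.append(("primary", overview))
        # A real implementation would call an LLM here, with a system
        # prompt forbidding anything but questions. This stub returns
        # a fixed checklist to keep the sketch self-contained.
        questions = [
            "What problem does this change solve for the user?",
            "What is the simplest version that would still work?",
            "What existing code or behavior could this break?",
        ]
        self.transcript.append(("socratic", questions))
        return questions

def plan_with_socratic_review(task: str, answer) -> dict:
    """Primary-agent side: describe the plan, answer every question,
    and attach the Q&A to the plan as an artifact."""
    reviewer = SocraticAgent()
    qa = [(q, answer(q)) for q in reviewer.receive(task)]
    return {"task": task, "interview": qa}

plan = plan_with_socratic_review(
    "Add retry logic to the payment webhook handler",
    answer=lambda q: "(primary agent's answer to: " + q + ")",
)
```

The design choice doing the work here is the asymmetry: because the reviewer cannot see any files, it can only probe the primary agent's stated understanding, which is exactly what surfaces over- or under-engineered plans.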
When you’re staffing work to a junior, though, often it’s the opposite.
IME "don't ask questions and just do a bunch of crap based on your first guess that we then have to correct later after you wasted a week" is one of the most common junior-engineer failure modes and a great way for someone to dead-end their progression.
So you are saying they are trying for the whole Artificial Intern vibe ?
> I've tried stuff like these in my prompts, and the results are never good
I've found that Google AI Mode and Gemini are pretty good at "figuring it out". My queries are oftentimes just keywords.
well, clarifying means burning more tokens...