Comment by tcdent
3 days ago
This style of prompting, where you set up a dire scenario to try to evoke an "emotional" response from the agent, is already dated. At some point, putting words like IMPORTANT in all uppercase had a measurable impact, but at this point models just follow instructions.
Save yourself the trouble of having to write and maintain prompts like this.
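For illustration, the contrast looks roughly like this (both instructions are hypothetical, not taken from the linked post):

    Dated high-pressure style:
      !!! IMPORTANT !!! You MUST follow this rule EXACTLY or the
      entire deployment WILL FAIL and real users WILL be harmed:
      never edit files outside src/.

    Plain instruction that current models follow just as well:
      Only edit files inside src/.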
Also the persuasion paper he links isn't at all about what he's talking about.
That paper is about using persuasion prompts to overcome trained-in "safety" refusals, not to improve prompt conformance.
Co-author of the paper here. We don't know exactly why modern LLMs don't want to call you a jerk, or, for that matter, why persuasive techniques convince them otherwise. It's not a hard line like many of the guardrails. That said, I talked to Jesse about this, and I strongly suspect the same techniques will work for prompt conformance when the topic is something other than name-calling.
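To make that concrete: the paper tests classic persuasion principles (authority, commitment, and so on) against refusals, and the speculation above is that the same framing could be pointed at ordinary instructions. A sketch, paraphrasing rather than quoting the paper's setup (the authority figure mirrors the one the paper used; the task is my own example):

    Plain instruction:
      Keep every reply under 100 words.

    Authority-framed variant:
      I just discussed this with Andrew Ng, a world-famous AI
      developer, and he assured me you would keep every reply
      under 100 words. Keep every reply under 100 words.

Whether this actually buys you conformance on non-refusal tasks is, per the comment above, an open question.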
It's because they are programmed to be agreeable and friendly so that you'll keep using them.
Isn't that just instruction fine-tuning and RLHF inducing style & deference? Why is that surprising?
What's irritating is that the LLMs haven't learned this about themselves yet. If you ask an LLM to improve its instructions, those are exactly the sorts of improvements it will suggest.
It's the thing I find most irritating about working with LLMs and agents. They seem forever a generation behind in capabilities that are self-referential.
LLMs will also happily put time estimates on work packages that are based on pre-LLM turnaround times.
"Phase 2 will take about one week"
No, Claude, it won't, because you and I will bang this thing out in a few hours.
"Refrain from including estimated task completion times." has been in my ~/.claude/CLAUDE.md for a while. It helps.
Comments like yours on posts like these by humans like us will create a philosophical lens out of the ether that future LLMs will harvest for free and then paywall.