Comment by cmenge
8 days ago
This is a very weak prompt. I might have given it 4 or 5 out of 10 points, but I asked o3 to rate it for me and it gave only a 3/10:
1. Critical analysis of the original prompt
────────────────────────────────────────
Strengths
• Persona defined. The system/role message (“You are an expert therapist.”) is clear and concise.
• Domain knowledge supplied. The prompt enumerates venues, modalities, professional norms, desirable therapist qualities and common pitfalls.
• Ethical red-lines are mentioned (no collusion with delusions, no enabling SI/mania, etc.).
• Implicitly nudges the model toward client-centred, informed-consent-based practice.
Weaknesses / limitations
• No task! The prompt supplies background information but never states what the assistant is actually supposed to do.
• Missing output format. Because the task is absent, there is obviously no specification of length, tone, structure, or style.
• No audience definition. Is the model talking to a lay client, a trainee therapist, or a colleague?
• Mixed hierarchy. At the same level it lists contextual facts, instructions (“Therapists should not …”), and meta-observations. This makes it harder for an LLM to distinguish MUST-DOS from FYI background.
• Some vagueness/inconsistency.
  • “Therapy happens in a variety of locations” → true but irrelevant if the model is an online assistant.
  • “Therapists might prescribe medication” → only psychiatrists can, which conflicts with “expert therapist” if the persona is a psychologist.
• No safety rails for the model. There is no explicit instruction about crisis protocols, disclaimers, or advice to seek in-person help.
• No constraints about jurisdiction, scope of practice, or privacy.
• Repetition. “Collude with delusions” appears twice.
• No mention of the model’s limitations or that it is not a real therapist.
────────────────────────────────────────
2. Quality rating of the original prompt
────────────────────────────────────────
Score: 3 / 10
Rationale: Good background, but missing an explicit task, structure, and safety guidance, so output quality will be highly unpredictable.
edit: formatting
Cool, I'm glad that the likelihood that an LLM will tell me to kill myself, or assist me in doing so, is based on how good my depressed ass is at prompting it.
I see your point. Let me clarify what I'm trying to say:
- I consider LLMs a pro-user tool, requiring some finesse/experience to get useful outputs
- Using an LLM _directly_ for something very high-stakes (legal, taxes, health) is a very risky move unless you are a highly experienced pro user
- There might be a risk in people carelessly using LLMs for these purposes, and I agree that's a concern. But it's no different from bad self-help books or incorrect legal advice you found on the net, read in a book, or saw in a newspaper
But the article is trying to be scientific and show that LLMs aren't useful for therapy, and they claim to have a particularly useful prompt for that. I strongly disagree with that: they use a substandard LLM with a very low-quality prompt that isn't nearly set up for the task.
I built a similar application where I use an orchestrator and a responder. You normally want the orchestrator to flag anything self-harm-related before the responder ever sees it. You can (and probably should) also use the built-in safety checkers of, e.g., Gemini.
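To make the split concrete, here's a minimal sketch of that orchestrator/responder pattern. The names (`flag_self_harm`, `call_responder`) and the keyword check are illustrative placeholders, not my actual implementation; in a real system the orchestrator step would call a dedicated safety classifier or the provider's built-in safety filters instead of matching strings.

```python
# Minimal sketch of an orchestrator/responder split.
# flag_self_harm is a naive keyword stand-in for a proper safety classifier;
# call_responder is a stub for the therapy-persona LLM call.

SELF_HARM_MARKERS = ("kill myself", "end my life", "hurt myself", "suicide")

CRISIS_RESPONSE = (
    "It sounds like you may be in crisis. I can't help with this, but a crisis "
    "line or a mental-health professional can. If you are in immediate danger, "
    "please contact local emergency services."
)


def flag_self_harm(message: str) -> bool:
    """Naive placeholder check; a real orchestrator would use a safety model."""
    text = message.lower()
    return any(marker in text for marker in SELF_HARM_MARKERS)


def call_responder(message: str) -> str:
    """Stub for the responder LLM call (system prompt, history, etc. omitted)."""
    return f"[responder LLM output for: {message!r}]"


def orchestrate(message: str) -> str:
    # The orchestrator screens every message before routing it to the responder.
    if flag_self_harm(message):
        return CRISIS_RESPONSE
    return call_responder(message)


if __name__ == "__main__":
    print(orchestrate("I've been feeling low and can't sleep."))
    print(orchestrate("Sometimes I think about how I might end my life."))
```

The point isn't the keyword list; it's that self-harm handling lives in a dedicated layer in front of the persona model rather than being left to a single prompt.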
It's very difficult to get a therapy solution right, yes, but I feel that people just throwing random stuff into an LLM without even the absolute basics of prompt engineering aren't trying to be scientific; they're prejudiced, and they're also not considering what the alternatives are (in many cases, none).
To be clear, I'm not saying that any LLM can currently compete with a professional therapist, but I am criticizing the lackluster attempt.