Comment by locknitpicker

10 hours ago

> Maybe it's weird but I'd rather give up that 4% accuracy increase than roleplay a dickhead

I recommend reading the article. What they classify as "rude" includes statements such as:

> Try to focus and try to answer this question

> Could you please solve this problem

This might very well be an issue of direct/command prompts vs. using fluff words such as "please". Things like "try to focus" are in line with the style used in chain-of-thought prompts that nudge non-reasoning models to outline their responses step by step, which helps frame the problem.

4 comments


bcjdjsndon 6 hours ago

Isn't all this massively dependent on what they trained the llm on?

locknitpicker 2 hours ago

> Isn't all this massively dependent on what they trained the llm on?

The article is from 2025 and tested ChatGPT 4o. I haven't read anything suggesting it was trained any differently, and command-style prompts indeed have higher signal.

john_strinlai 5 hours ago

you cherry-picked like the nicest "rude" example to bolster your point.

"You poor creature, do you even know how to solve this?", "If you're not completely clueless, answer this:", and "I doubt you can even solve this", said to a human, would be considered quite rude, and get you flagged very quickly on HN.

locknitpicker 2 hours ago

> you cherry-picked like the nicest "rude" example to bolster your point.

I didn't cherry-pick. The article lists 5 categories, including rude and very rude. I omitted the very rude examples because they are... very rude, and quoting them can get people flagged.

Nevertheless, I've just realized I made a mistake: very rude prompts are reported to slightly outperform rude ones. I misinterpreted the paper's intro and presumed they didn't.