Comment by pants2
3 months ago
FWIW I didn't like the Robot / Efficient mode because it would give very short answers without much explanation or background. "Nerdy" seems to be the best, except with GPT-5 instant, where it's extremely cringy, like "I'm putting my nerd hat on - since you're a software engineer I'll make sure to give you the geeky details about making rice."
"Low" thinking is typically the sweet spot for me - way smarter than instant with barely a delay.
I hate its acknowledgement of its personality prompt. Try a series of back-and-forth exchanges and each response is like “got it, keeping it short and professional. Yes, there are only seven deadly sins.” You get more prompt performance than answer.
I like the term prompt performance; I am definitely going to use it:
> prompt performance (n.)
> the behaviour of a language model in which it conspicuously showcases or exaggerates how well it is following a given instruction or persona, drawing attention to its own effort rather than simply producing the requested output.
:)
Might be a result of using LLMs to evaluate the output of other LLMs.
LLMs probably get higher scores if they explicitly state that they are following instructions...
That's the equivalent of a performative male, so better call it performative model behaviour.
Pay people $1 an hour and ask them to choose A or B, whichever is shorter and more professional:
A) Keeping it short and professional. Yes, there are only seven deadly sins
B) Yes, there are only seven deadly sins
Also let all the workers know they are being evaluated against each other, and that if they diverge from the majority choice their reliability score may go down and they may get fired. You end up with some evaluations answered as a Keynesian beauty contest / Family Feud "survey says" style guess instead of their true evaluation.
I can’t tell if you’re being satirical or not…
This is even worse on voice mode. It's unusable for me now.