Comment by vidarh
6 days ago
That sounds at least partly like a prompting issue to me. I have no problem getting scathing critiques out of either, by defining the role I want them to take clearly.
Here's part of an initial criticism Claude made of your comment (it also said nice things):
"However, the prose suffers from structural inconsistencies. The opening sentence contains an awkward parenthetical insertion that disrupts flow, and the second sentence uses unclear pronoun reference with "This" and "they." The rhythm varies unpredictably between crisp, direct statements and meandering explanations.
"The vocabulary choices are generally precise—"sycophants" is particularly apt and memorable—though some phrases like "get the concept down; don't think too hard" feel slightly clunky in their construction."
This was the prompt I used:
"Imagine you're a literary critic. Critique the following comment based on use of language and effectiveness of communication only. Don't critique the argument itself:" followed by your comment.
"Imagine you're a ..." or "Act as a ..." tends to make a huge difference in the kind of output you get. If you put it in the role of a critic that people expect to be tough, you're less likely to get sycophantic responses, at least in my experience.
(If you want to see it get brutal, follow up the first response with a "be harsher" - it got unpleasantly savage)
Those critiques are nonsensical; the referents of "this" and "they" are completely obvious in the comment and the "clunky" construction is completely fine. There's also no "this" in the second sentence?
Exactly... What Claude output is what the prompt asked for: critique. Even if there was nothing wrong at all Claude would still find something to critique, even if it has to hallucinate something wrong. That's its job: to generate critique.
It's not really doing any sort of "deep analysis"; it's just reading the text and comparing it to what it knows about similar texts from its model. It'll then predict/generate the next word based on prior critiques of similar writings. It doesn't really have an understanding of the text at all.
Most human criticism is not doing any deep analysis either, but just reading the text and comparing it to what we know about similar texts.
> Even if there was nothing wrong at all Claude would still find something to critique
And the entire point was that you claimed 'Both ChatGPT and Claud always say something like, "a few grammar corrections are needed but this is excellent!"'.
Which clearly is not the case, as demonstrated. What you get out will depend on how much effort you're willing to put into prompting to specify the type of response you want, because they certainly lack "personality" and will try to please you. But that includes trying to please you when your prompt specifies how you want them to treat the input.
> It doesn't really have an understanding of the text at all.
This is a take that might have made sense a few years ago. It does not make sense with current models at all, and to me it is a take that typically suggests a lack of experience with the models. In my experience, current models can often spot reasoning errors in text provided to them that the human writer of said text refuses to acknowledge are there.
I suggest you try to paste some bits of text into any of the major models and ask them to explain what they think the author might have meant. They don't get it perfectly right all of the time, but they can go quite in-depth and provide analysis that well exceeds what a lot of people would manage.
They could be better, but I don't agree they're nonsensical. The comment is fine as a comment, but when asked to give a literary critique, pointing out that the language could be clearer is entirely reasonable.
But this is also beside the point, which is that it is trivial to get these models to argue, rightly or wrongly, that there are significant issues with your writing rather than claim everything is excellent.
Whether you can get critiques you agree are reasonable is another matter.
Okay but if it's just saying bullshit, it's useless as a critic.