
Comment by Tepix

1 month ago

Well, case in point:

If you ask an AI to grade an essay, it will give the highest grade to the essay it wrote itself.

Is this true, though? I haven't done the experiment, but I can envision the LLM critiquing its own output (if it was created in a different session), iteratively correcting it, and finding flaws every single time. Are LLMs even primed to say "this is perfect and needs no further improvement"?

What I have seen is ChatGPT and Claude battling it out, always correcting and finding fault with each other's output (trying to solve the same problem). It's hilarious.
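For what it's worth, here's roughly how I'd set up that cross-grading experiment if I ever got around to it. Just a sketch, assuming the OpenAI and Anthropic Python SDKs with API keys in the environment; the model names are placeholders, not a claim about which models were involved.

```python
# Minimal sketch of the cross-grading experiment: each model writes an essay
# in a fresh session, then each model grades both essays blind.
from openai import OpenAI
import anthropic

openai_client = OpenAI()                # reads OPENAI_API_KEY from the environment
claude_client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY

PROMPT = "Write a 300-word essay on why cities should ban cars."

def gpt(messages):
    resp = openai_client.chat.completions.create(
        model="gpt-4o",                 # placeholder model name
        messages=messages,
    )
    return resp.choices[0].message.content

def claude(messages):
    resp = claude_client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        messages=messages,
    )
    return resp.content[0].text

# Each model writes its essay in a separate, fresh conversation.
essay_gpt = gpt([{"role": "user", "content": PROMPT}])
essay_claude = claude([{"role": "user", "content": PROMPT}])

def grade(grader, essay_a, essay_b):
    # The grader is not told which model wrote which essay.
    rubric = (
        "Grade each essay from 1 to 10 and say which one is better.\n\n"
        f"Essay A:\n{essay_a}\n\nEssay B:\n{essay_b}"
    )
    return grader([{"role": "user", "content": rubric}])

print("GPT grading:", grade(gpt, essay_gpt, essay_claude))
print("Claude grading:", grade(claude, essay_gpt, essay_claude))
```

You'd also want to swap the A/B order between runs to rule out position bias before reading anything into which essay each grader prefers.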