← Back to context

Comment by thesz

19 days ago

As far as I remember, article stated that he found same problematic behavior for many prompts, issued by him and his colleagues. The "stupid prompt" in article is for demonstration purposes.

But that’s not an argument, that’s just assertion, and it’s directly contradicted by all the more rigorous attempts to do the same thing through benchmarks (public and private).