Comment by simonw
1 day ago
308 posts on AI ethics: https://simonwillison.net/tags/ai-ethics/
52 on AI misuse: https://simonwillison.net/tags/ai-misuse/
149 on the unsolved challenge of prompt injection: https://simonwillison.net/tags/prompt-injection/
40 on slop: https://simonwillison.net/tags/slop/
If you want an "LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry" there are plenty out there. I'm not one of them.
People are confusing "excitement" with "evangelism". Your blog is definitely on the pro-AI side of things, but as you say, it's not one-sided or uncritical.
No offense, I’ve read nearly all of your posts and the criticisms of LLMs are still pretty heavily steeped in enthusiasm. And a lot of your short “criticism” pieces are nothingburgers other than detailing some incident (which I would categorize as blogspam).
I think you should highlight your exemplary pre-AI writing too.
All of these are about AI misuse, not skepticism of AI. By skepticism I mean doubting whether AI actually delivers on its promises which, based on this last post, sounds like something you think we're already past.
Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products.
Love when people say "its promises". What specifically are you disappointed with? Simon's posts are high quality and evidence driven. AI has already delivered an incredible amount. Read Epoch for industry trends and analyses, and METR too; everything points to a pretty consistent picture.
"Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products."
Oh yes, tons and tons, especially on HN. But the plural of anecdote is not data. Enterprise spend speaks for itself. You are using AI-coded functional products all the time. Do you want like a diff history for the Google codebase or something?
Tbf the OP's blog and comments (including their sibling to your comment) are also heavily anecdotal.
> I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done.
Claiming a grand inflection point based on your own personal usage is very anecdotal.
It's hard for me to write about skepticism that coding agents deliver on their promises when I've been using them daily and know, for an absolute fact, that they boost my own productivity.
(And that's after taking into account the METR paper that says engineers over-estimate their productivity with these tools.)
I have plenty of doubts about AI delivering on its promises outside of coding. I don't write about AGI because I think it's science-fiction hysteria. I write about slop precisely because it represents a misuse of AI that demonstrates people completely misunderstanding what it's useful for.