Comment by COAGULOPATH
1 year ago
Something I'm increasingly noticing about LLM-generated content is that...nobody wants it.
(I mean "nobody" in the sense of "nobody likes Nickelback". ie, not literally nobody.)
If I want to talk to an AI, I can talk to an AI. If I'm reading a blog or a discussion forum, it's because I want to see writing by humans. I don't want to read a wall of copy+pasted LLM slop posted under a human's name.
I now spend dismaying amounts of time and energy avoiding LLM content on the web. When I read an article, I study the writing style, and if I detect ChatGPTese ("As we dive into the ever-evolving realm of...") I hit the back button. When I search for images, I use a wall of negative filters (-AI, -Midjourney, -StableDiffusion etc) to remove slop (which would otherwise be >50% of my results for some searches). Sometimes I filter searches to before 2022.
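Concretely, the queries end up looking something like this, using Google's standard "-" exclusion and before: date operators (the subject is just an example):

    sailing ship painting -AI -Midjourney -StableDiffusion -DALL-E
    sailing ship painting before:2022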
If Google added a global "remove generative content" filter that worked, I would click it and then never unclick it.
I don't think I'm alone. There has been research suggesting that users immediately dislike content they perceive as AI-created, regardless of its quality. This creates an incentive for publishers to "humanwash" AI-written content—to construct a fiction where a human is writing the LLM slop you're reading.
Falsifying timestamps and hijacking old accounts to do this is definitely something I haven't seen before.
100%.
So far (thankfully) I've noticed this stuff get voted down on social media, but it blows my mind that people think pasting in a ChatGPT response is productive.
I've seen people on reddit say stuff like "I don't know, but here's what ChatGPT said." Or worse, presenting ChatGPT copy-paste as their own. It's funny because you can tell: the text reads like an HR person wrote it.
I've noticed the opposite actually, clearly ChatGPT written posts on Reddit that get a ton of upvotes. I'm especially noticing it on niche subreddits.
The ones that make me furious are on some of the mental health subreddits. People are asking for genuine support from other people, but are getting AI slop instead. If someone needs support from an AI (which I've found can actually help), they can go use it themselves.
> clearly ChatGPT written posts on Reddit
You should have a lot less confidence in your ability to discern what's AI-generated content, honestly. Especially in contexts like these, where the humans will likely be writing in a deliberately inoffensive way so as not to trigger the OP.
I think some of that is the gamification of social media. "I have 1200 posts and you only have 500" kind of stuff. It's much easier to win the volume game when you aren't actually writing them. This is just a more advanced version of people who just post "I agree" or "I don't know anything about this, but...[post something just to post something]".
It's particularly funny/annoying when they're convinced that the fact they got it from the "AI" makes it more likely to be correct than other commenters who actually know what the heck they're talking about.
It makes me wonder how shallow a person's knowledge of every area must be for them to use an LLM for more than a little while without encountering something where it is flagrantly wrong yet carries on in the same tone of absolute confidence and authority. But mostly it's just a particularly aggressive form of Gell-Mann amnesia.
The problem with "provide LLM output as a service," which is more or less the best case scenario for the ChatGPT listicles that clutter my feed, is that if I wanted an LLM result...I could have just asked the LLM. There's maybe a tiny proposition if I didn't have access to a good model, but a static page that takes ten paragraphs to badly answer one question isn't really the form factor anyone prefers; the actual chatbot interface can present the information in the way that works best for me, versus the least common denominator listicle slop that tries to appeal to the widest possible audience.
The other half of the problem is that rephrasing information doesn't actually introduce new information. If I'm looking for the kind of oil to use in my car or the recipe for blueberry muffins, I'm looking for something backed by actual data, to verify that the manufacturer said to use a particular grade of oil or for a recipe that someone has actually baked to verify that the results are as promised. I'm looking for more information than I can get from just reading the sources myself.
Regurgitating text from other data sources mostly doesn't add anything to my life.
Rephrasing can be beneficial. It can make things clearer to understand and learn from. In math, for example, Khan Academy or the 3Blue1Brown YouTube channel isn't presenting anything new, just rephrasing math in a different way that makes it easier for some to understand.
If LLMs could take the giant, overwhelming manual in my car and pull out the answer to what oil to use, that would be useful and not new information.
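(For what it's worth, a minimal sketch of that idea already roughly works: paste the relevant manual section into a prompt and ask. The model name and file path below are made-up placeholders, not a recommendation.)

    # Minimal sketch: stuff the relevant manual section into the prompt
    # and ask a question about it. Model name and file path are
    # illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    manual = open("owners_manual_fluids_section.txt").read()

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer only from the provided manual text; quote the relevant line."},
            {"role": "user",
             "content": manual + "\n\nWhat engine oil grade does this car take?"},
        ],
    )
    print(resp.choices[0].message.content)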
I have to protest. A lot of 3b1b is new. Not the math itself, but the animated graphical presentation is. That's where the value from his channel comes in. He provides a lot of tools to visualize problems in ways that haven't been done before.
>If LLMs could take the giant, overwhelming manual in my car and pull out the answer to what oil to use, that would be useful and not new information.
You can literally just google that or use the appendix that's probably at the back of the manual. It's also probably stamped on the engine oil cap. It also probably doesn't matter and you can just use 10w40.
> If I'm reading a blog or a discussion forum, it's because I want to see writing by humans. I don't want to read a wall of copy+pasted LLM slop posted under a human's name.
This reminds me of the time around ChatGPT's release, when Hacker News comment threads were filled with users saying "Here's what ChatGPT has to say about this"
Pepperidge Farm remembers a time when GPT-2 made no claims about being a useful information-lookup tool, but was a toy used to write sonnets, poems, and speeches "in the style of X"...
Yup, I'm the same, and I love my LLMs. They're fun and interesting to talk to and use, but it's obvious to everyone that they're not very reliable. If I think an article is LLM-generated, then the signal I'm getting is that the author is just as clueless as I am, and there's no way I can trust that any of the information is correct.
> but it's obvious to everyone that they're not very reliable.
Hopefully to everyone on HN, but definitely not to everyone on the greater Internet. There are plenty of horror stories of people who apparently 100% blindly trust whatever ChatGPT says.
I was especially horrified/amused when students started turning in generated answers and essays, and /r/teaching learned that you could "ask chatgpt if it wrote the essay and it will tell you."
It makes perfect intuitive sense if you don't know how the things actually work.
Yeah that's fair, I suppose I see that sort of thing on reddit fairly regularly, especially in the "here's a story about my messed-up life" types of subreddits.
I think a good comparison is when you go to a store and there are salesmen there. Nobody wants to talk to a salesman. They can almost never help a customer with any issue, since even an ignorant customer usually knows more about the products in the store than the salesmen do. Most customers hate salesmen, and a substantial portion of customers choose to leave the store, or not enter at all, because of the salesmen, meaning the store loses income. Yet this has been going on forever. So just prepare for the worst when it comes to AI, because that's what you are going to get, and neither ethical sense, business sense, nor any rationality is going to stop companies from shoving it down your throat. They don't give a damn if they lose income or even bankrupt their companies, because annoying the customer is more important.
This has been a constant back and forth for me. My personal project https://golfcourse.wiki was built on the idea that I wanted to make a wiki for golf nerds because nobody pays attention to 95% of fun golf courses because those courses don't have a marketing department in touch with social media.
I basically decided that using AI content would waste everyone's time. However, it's a real chicken-or-egg problem in content creation. Faking it to the point of project viability has been a real issue in the past (I remember the reddit founders talking about posting fake comments and posts from fake users to make it look like more people were using the product). AI is very tempting for something like this, especially when a lot of people just don't care.
So far I've stuck to my guns, and I think the key to a course wiki is absolutely having locals' insight into these courses, because the nuance is massive. At the same time, I'm trying to find ways to reduce the friction for contributions, and AI may end up being one way to do that.
This is a really interesting conundrum. And I'm a golfer, so...
Off the top of my head, I wonder if there's a way to have AI generate a summary from existing (online) information about a course, with a very explicit "this is what AI says about this course" or some similar disclosure, until you get 'real' local insight. No one could then say "it's just AI slop", but you're still providing value, as there's something about each course. As much as I personally have reservations about AI, I (personally, YMMV) am much more forgiving if you are explicit about what's AI and what's not and aren't trying to BS me.
This is a good suggestion, and I'll think long and hard about it. My biggest concern is that the type of people who would contribute to such a public project are the type of folks who would be offended at the use of AI in general. That concern, again, leads me back to the conundrum of what to do.
I've always insisted that if it is financially feasible, I'd want the app to become a 501(c)(3) or at least a B Corp, maybe even sold to Wikimedia. Still, the ratio of people who contribute to the site vs. the number who visit is somewhere in the range of 1:10,000 (if that) right now, so concern about offending contributors is non-trivial.
As it stands, I've generally gone to the courses' sites and just quoted what they have to say about their own course, but that really isn't what I want to do, even if it is generally informative. Unfortunately, there is rarely hole-by-hole information, which is the level of granularity I'm going for.
But AI will just summarize other humans' work here. It has no understanding of golf…
I do wonder how much of the push for LLM-integrated everything has taken this into account.
The general trend of viewing LLM features as forced on users against their will, and the now-widespread use of "slop" as a derogatory description, seem to indicate the general public is less enthusiastic about these consumer advances than, say, programmers on HN.
I use LLMs for programming (and a few other general Q&A things before a search engine/Wikipedia visit) but want them absolutely nowhere else (except Copilot et al. in certain editors).
Another trick I use is to scroll to the end and see if the last paragraph is written as a neat conclusion with a hedge (e.g. "In short...", "Ultimately..."). I imagine it's a convention to push LLMs to terminate text generation, but boy is it information-free.
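(That check is mechanical enough to automate. A rough sketch; the phrase list is nothing more than my own guess at common wrap-ups:)

    # Rough heuristic: flag text whose final paragraph opens with a stock
    # LLM wrap-up phrase. The phrase list is an illustrative guess.
    WRAP_UPS = ("in short", "ultimately", "in conclusion", "overall", "in summary")

    def has_llm_style_conclusion(text: str) -> bool:
        # Split on blank lines into paragraphs, ignoring empty ones.
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        return bool(paragraphs) and paragraphs[-1].lower().startswith(WRAP_UPS)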
I can understand it for AI-generated text, but I think there are a lot of people who like AI-generated images. Civitai gets a lot of engagement for AI-generated images, and so do many other image sites.
People who submit blog posts here sure do love opening their blogs with AI image slop. I have taken to assuming that the text is also AI slop, closing the tab, and leaving a comment saying so.
Sometimes this comment gets a ton of upvotes. Sometimes it gets indignant replies insisting it's real writing. I need to come up with a good standard response to the latter.
> People who submit blog posts here sure do love opening their blogs with AI image slop.
It sucks, but it doesn't suck any more than what was done in the past: littering the article with stock photos.
Either have a relevant photo (and no, a post about cooking showing an image of a random kitchen, set of dishes, or prepared food does not count), or don't have any.
The only reason blog posts/articles had barely relevant stock images was to get people's attention. Is it any worse now that they're using AI generated images?
> I need to come up with a good standard response to the latter.
How about, "I'm sorry, but if you're willing to use AI image slop, how should I know you wouldn't also use AI text slop? AI text content isn't reliable, and I don't have time to personally vet every assertion."
I don’t understand the problem with AI generated images.
(I very much would like any AI generated text to be marked as such, so I can set my trust accordingly)
> I don’t understand the problem with AI generated images.
Depends on what they are used for and what they are purporting to represent.
For example, I really hate AI images being put into kids' books, especially when they are trying to be pseudo-educational. A big problem with those images is that from one prompt to the next it's basically impossible to get consistent designs, which means any sort of narrative story will end up with pages of characters that don't look the same.
Then there's the problem that some people are trying to sell and pump this shit like crazy into amazon. Which creates a lot of trash books that squeeze out legitimate lesser known authors and illustrators in favor of this pure garbage.
Quite similar to how you can't really buy general products from Amazon, because drop shipping has flooded the market with 10 billion items under different brands that are ultimately the same Wish garbage.
The images can look interesting sometimes, but often on second glance there's just something "off" about the image. Fingers are currently the best sign that things have gone off the rails.
Despite what people think, there is a sort of art to getting interesting images out of an AI model.
That's not the issue though: it should be marked as such, or live in a section that people looking for it can easily find, instead of being shoved everywhere. To me, placing that generated content in human spaces is a strong signal of low effort. On the other hand, generated content can be extremely interesting and useful, and indeed there's an art to it.
> (I mean "nobody" in the sense of "nobody likes Nickelback". ie, not literally nobody.)
Reminds me of the old Yogi Berra quote: "Nobody goes there anymore, it's too crowded."
Exactly. Why in the hell would I want someone to use ChatGPT for me? If I wanted that, I could go use that instead.
I believe most times such responses are made on the assumption that people are just lazy, like when we used to provide links to https://letmegooglethat.com/.
> If Google added a global "remove generative content" filter that worked, I would click it and then never unclick it.
It's not just generated content. This problem has been around for years: for example, google a recipe. I don't think the incentives are there yet. At least not until Google search is so unusable that no one is buying their ads anymore. I suspect any business model rooted in advertising is doomed to the eventual enshittification of the product.
nobody wants to see others' AI-generated images, but most people around me are drooling over generating stuff
wait for the proof-of-humanity decade where you're paid to be here and slow and flawed
Most AI-generated images are like most dreams: meaningful to you, but not something other people have much interest in.
Once you have people sorting through them, editing them, and so on, the curation adds enough additional interest... and for many people, what they get out of looking at a gallery of AI images is ideas for what prompts they want to try.
Most AI-generated visuals come in a myriad of styles, but you can mostly tell it's something not seen before, and that's what people may be drooling over. The same drooling happened for things that finally found their utility after a long time and that we're now used to. For example, 20 years ago Photoshop filters were all the rage and you'd see them everywhere back then. I think this AI-gen phase will lose interest/enthusiasm over time but will enter and stay in the toolbox for the right things, whatever people decide those to be.
Re: proof-of-humanity... I'm looking forward to a Gattaca-like drop-of-blood port on the side of your computer, where you prick yourself every time you want a single "certified human response" endorsement for an online comment.
Much of the benefit I get from LLMs is, ironically, avoiding LLM output from the results of web searches.
How do you do that, if you don't mind sharing?
It might not be what you're hoping for, but it's just asking a question instead of doing a web search. It's often more useful to get even a hallucinatory answer than affiliate-marketing listicles that are all coming from smaller models anyway.
I was googling a question about OpenGraph last week. So many useless AI drivel results now.