PhysicsForums and the Dead Internet Theory

1 year ago (hallofdreams.org)

Something I'm increasingly noticing about LLM-generated content is that...nobody wants it.

(I mean "nobody" in the sense of "nobody likes Nickelback". ie, not literally nobody.)

If I want to talk to an AI, I can talk to an AI. If I'm reading a blog or a discussion forum, it's because I want to see writing by humans. I don't want to read a wall of copy+pasted LLM slop posted under a human's name.

I now spend dismaying amounts of time and energy avoiding LLM content on the web. When I read an article, I study the writing style, and if I detect ChatGPTese ("As we dive into the ever-evolving realm of...") I hit the back button. When I search for images, I use a wall of negative filters (-AI, -Midjourney, -StableDiffusion, etc.) to remove slop (which would otherwise be >50% of my results for some searches). Sometimes I filter searches to before 2022.
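For example, a query like this (assuming Google's standard "-" exclusion and "before:" date operators; the search terms and the exact list worth excluding are just an illustration):

```
watercolor landscape -AI -Midjourney -StableDiffusion before:2022-01-01
```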

If Google added a global "remove generative content" filter that worked, I would click it and then never unclick it.

I don't think I'm alone. There has been research suggesting that users immediately dislike content they perceive as AI-created, regardless of its quality. This creates an incentive for publishers to "humanwash" AI-written content—to construct a fiction where a human is writing the LLM slop you're reading.

But falsifying timestamps and hijacking old accounts to do it is definitely something I haven't seen before.

  • 100%.

    So far (thankfully) I've noticed this stuff get voted down on social media, but it blows my mind that people think pasting in a ChatGPT response is productive.

    I've seen people on reddit say stuff like "I don't know but here's what ChatGPT said." Or worse, presenting ChatGPT copy-paste as their own. It's funny because you can tell: the text reads like an HR person wrote it.

    • I've noticed the opposite, actually: clearly ChatGPT-written posts on Reddit that get a ton of upvotes. I'm especially noticing it on niche subreddits.

      The ones that make me furious are on some of the mental health subreddits. People are asking for genuine support from other people, but are getting AI slop instead. If someone needs support from an AI (which I've found can actually help), they can go use it themselves.

      1 reply →

    • I think some of that is the gamification of social media. "I have 1200 posts and you only have 500" kind of stuff. It's much easier to win the volume game when you aren't actually writing the posts. This is just a more advanced version of people who post "I agree" or "I don't know anything about this, but...[post something just to post something]".

    • It's particularly funny/annoying when they're convinced that the fact they got it from the "AI" makes it more likely to be correct than other commenters who actually know what the heck they're talking about.

      It makes me wonder how shallow a person's knowledge of every area must be that they could use an LLM for more than a little while without encountering something where it was flagrantly wrong yet carried on in the same tone of absolute confidence and authority... but it's mostly just a particularly aggressive form of Gell-Mann amnesia.

  • The problem with "provide LLM output as a service," which is more or less the best case scenario for the ChatGPT listicles that clutter my feed, is that if I wanted an LLM result...I could have just asked the LLM. There's maybe a tiny proposition if I didn't have access to a good model, but a static page that takes ten paragraphs to badly answer one question isn't really the form factor anyone prefers; the actual chatbot interface can present the information in the way that works best for me, versus the least common denominator listicle slop that tries to appeal to the widest possible audience.

    The other half of the problem is that rephrasing information doesn't actually introduce new information. If I'm looking for the kind of oil to use in my car or a recipe for blueberry muffins, I'm looking for something backed by actual data: confirmation that the manufacturer said to use a particular grade of oil, or a recipe that someone has actually baked to verify that the results are as promised. I'm looking for more than a rephrasing of sources I could just read myself.

    Regurgitating text from other data sources mostly doesn't add anything to my life.

    • Rephrasing can be beneficial. It can make things clearer to understand and learn from. In math, for example, something like Khan Academy or the 3Blue1Brown YouTube channel isn't presenting anything new, just rephrasing math in a different way that makes it easier for some people to understand.

      If LLMs could take the giant, overwhelming manual in my car and pull out the answer to which oil to use, that would be useful without being new information (see the sketch below).

      5 replies →
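      A hedged sketch of that manual-lookup idea, using the openai Python package (v1+); the model name, file path, and prompt are illustrative placeholders, not anything from the thread:

      ```python
      # Sketch: stuff the owner's manual into the context and ask one question.
      # Assumes `pip install openai` and an OPENAI_API_KEY in the environment.
      from openai import OpenAI

      client = OpenAI()

      with open("owners_manual.txt") as f:  # hypothetical path to the manual's text
          manual = f.read()

      response = client.chat.completions.create(
          model="gpt-4o-mini",  # placeholder model name
          messages=[
              {"role": "system", "content": "Answer strictly from the provided manual."},
              {"role": "user", "content": f"{manual}\n\nWhat engine oil grade does this car take?"},
          ],
      )
      print(response.choices[0].message.content)
      ```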

  • > If I'm reading a blog or a discussion forum, it's because I want to see writing by humans. I don't want to read a wall of copy+pasted LLM slop posted under a human's name.

    This reminds me of the period around ChatGPT's release, when Hacker News comment sections were filled with users saying "Here's what ChatGPT has to say about this"

    • Pepperidge Farm remembers a time when GPT-2 made no claims about being a useful information-lookup tool, but was a toy used to write sonnets, poems, and speeches "in the style of X"...

  • Yup, I'm the same, and I love my LLMs. They're fun and interesting to talk to and use, but it's obvious to everyone that they're not very reliable. If I think an article is LLM-generated, then the signal I'm getting is that the author is just as clueless as I am, and there's no way I can trust that any of the information is correct.

    • > but it's obvious to everyone that they're not very reliable.

      Hopefully to everyone on HN, but definitely not to everyone on the greater Internet. There are plenty of horror stories of people who apparently 100% blindly trust whatever ChatGPT says.

      3 replies →

  • I think a good comparison is when you go to a store and there are salesmen there. Nobody wants to talk to a salesman. They can almost never help a customer with any issue, since even an ignorant customer usually knows more about the products in the store than the salesmen do. Most customers hate salesmen, and a substantial portion of customers choose to leave the store, or not enter at all, because of the salesmen, meaning the store loses income. Yet this has been going on forever. So just prepare for the worst when it comes to AI, because that's what you are going to get, and neither ethical sense, business sense, nor any rationality is going to stop companies from shoving it down your throat. They don't give a damn if they lose income or even bankrupt their companies, because annoying the customer is more important.

  • This has been a constant back and forth for me. My personal project https://golfcourse.wiki was built on the idea that I wanted to make a wiki for golf nerds, since nobody pays attention to 95% of fun golf courses because those courses don't have a marketing department in touch with social media.

    I basically decided that using AI content would waste everyone's time. However, it's a real chicken-or-egg problem in content creation. Faking it to the point of project viability has been a real issue in the past (I remember the Reddit founders talking about posting fake comments and posts from fake users to make it look like more people were using the product). AI is very tempting for something like this, especially when a lot of people just don't care.

    So far I've stuck to my guns, and I think the key to a course wiki is absolutely having locals' insight into these courses, because the nuance is massive. At the same time, I'm trying to find ways to reduce the friction for contributions, and AI may end up being one way to do that.

    • This is a really interesting conundrum. And I'm a golfer, so...

      Off the top of my head, I wonder if there's a way to have AI generate a summary from existing (online) information about a course, with a very explicit "this is what AI says about this course" or some similar disclosure, until you get 'real' local insight. No one could then say 'it's just AI slop', but you're still providing value, as there's something about each course. As much as I personally have reservations about AI, I (personally, YMMV) am much more forgiving if you are explicit about what's AI and what's not and aren't trying to BS me.

      3 replies →

  • I do wonder how much of the push for LLM-integrated everything has taken this into account.

    The general trend of viewing LLM features as forced on users against their will, and the now-widespread use of "slop" as a derogatory description, seem to indicate the general public is less enthusiastic about these consumer advances than, say, programmers on HN.

    I use LLMs for programming (and a few other general Q&A things before a search engine/Wikipedia visit) but want them absolutely nowhere else (except Copilot et al. in certain editors).

  • Another trick I use is to scroll to the end and see if the last paragraph is written as a neat conclusion with a hedge (e.g., "In short...", "Ultimately..."). I imagine it's a convention to push LLMs to terminate text generation, but boy is it information-free.
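    A minimal sketch of that last-paragraph heuristic; the phrase list is a made-up starting point, not anything definitive:

    ```python
    # Flag articles whose final paragraph opens with a stock LLM closer.
    STOCK_CLOSERS = ("in short", "in summary", "in conclusion", "ultimately", "overall")

    def ends_with_stock_closer(article: str) -> bool:
        """True if the last non-empty paragraph starts with one of the closers."""
        paragraphs = [p.strip() for p in article.split("\n\n") if p.strip()]
        return bool(paragraphs) and paragraphs[-1].lower().startswith(STOCK_CLOSERS)
    ```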

  • I can understand it for AI-generated text, but I think there are a lot of people who like AI-generated images. Civitai gets a ton of engagement for AI-generated images, and so do many other image sites.

    • People who submit blog posts here sure do love opening their blogs with AI image slop. I have taken to assuming that the text is also AI slop, and closing the tab and leaving a comment saying such.

      Sometimes this comment gets a ton of upvotes. Sometimes it gets indignant replies insisting it's real writing. I need to come up with a good standard response to the latter.

      7 replies →

    • I don’t understand the problem with AI generated images.

      (I very much would like any AI generated text to be marked as such, so I can set my trust accordingly)

      1 reply →

  • > (I mean "nobody" in the sense of "nobody likes Nickelback". ie, not literally nobody.)

    Reminds me of the old Yogi Berra quote: "Nobody goes there anymore, it's too crowded."

  • > If Google added a global "remove generative content" filter that worked, I would click it and then never unclick it.

    It's not just generated content. This problem has been around for years; for example, google a recipe. I don't think the incentives are there yet. At least not until Google search is so unusable that no one is buying their ads anymore. I suspect any business model rooted in advertising is doomed to the eventual enshittification of the product.

  • nobody wants to see others' AI-generated images, but most people around me are drooling over generating stuff

    wait for the proof-of-humanity decade where you're paid to be here and slow and flawed

    • Most AI generated images are like most dreams: meaningful to you, but not something other people have much interest in.

      Once you have people sorting through them, editing them, and so on, the curation adds enough additional interest... and for many people, what they get out of looking at a gallery of AI images is ideas for what prompts they want to try.

    • Most AI-generated visuals come in a myriad of styles, but you can mostly tell it's something not seen before, and that's what people may be drooling over. The same drooling happened for things that eventually found their utility and that we're now used to. For example, 20 years ago Photoshop filters were all the rage and you'd see them everywhere. I think this AI-gen phase will lose enthusiasm over time but will enter and stay in the toolbox for the right things, whatever people decide those to be.

    • Re: proof-of-humanity... I'm looking forward to a Gattaca-like drop-of-blood port on the side of your computer, where you prick yourself every time you want a single "certified human response" endorsement for an online comment.

  • I was googling a question about Open Graph last week. So many useless AI-drivel results now.

> It had fairly steady growth until 2012, before petering out throughout the 2010s and 2020s in lieu of more centralized sites like StackExchange, and by 2025, only a small community was left

This timeline tracks with my own blogging. Google slowly stopped ranking traditional forum posts and blogs around that time as well, regardless of quality, unless the site was a "major".

> But, unlike so many other fora from back in the early days, it went from 2003 to 2025 without ever changing its URLs, erasing its old posts, or going down altogether.

I can also confirm if you have a bookmark to my blog from 2008, that link will still work!

The CMS is no more; it's all static now... which too few orgs take the short amount of time to bother with when "refreshing" their web presence :(

It is sad that this is happening to PhysicsForums. It was one of the first websites I used frequently 15 years ago, when I started my physics passion (later career). I was an active reader and contributed on a few occasions, and I still remember some members I looked up to, thinking that one day I would be as smart and knowledgeable as them. Over the years, with the move to social media following the Arab Spring, things started to change (as part of the overall transition away from forums as the dominant place for discussion). I stopped visiting around 2018 unless I arrived through a Google search (later Kagi). I still find the archive useful for answering some questions, and I'd disagree with the article's author that nobody sharing links on Twitter means nobody cares.

Talk about burying the lede! Near the bottom of the story, the site owner confirms that he was the one who added the backdated AI comments (perhaps it should have been obvious...).

  • The investigation of the events on this particular website is just a tool to illustrate a much broader point about internet content, identity, and LLMs.

  • I couldn't find it. He was trying to seed the site?

    • This is what he replied when asked about the backdated comments:

      > The backdated answers were an internal test. We conceived of a bot that would provide a quality answer to a thread without a reply after 1+ years. That too also failed. Instead, I’m considering pruning all threads without a reply as they clutter up the forums.

      2 replies →

    • Experimenting with using AI bots to respond to questions that had been open for a long time with no response.

Money quote:

> There’s also a social contract: when we create an account in an online community, we do it with the expectation that people we are going to interact with are primarily people. Oh, there will be shills, and bots, and advertisers, but the agreement between the users and the community provider is that they are going to try to defend us from that, and that in exchange we will provide our engagement and content. This is why the recent experiments from Meta with AI generated users are both ridiculous and sickening. When you might be interacting with something masquerading as a human, providing at best, tepid garbage, the value of human interaction via the internet is lost.

It is a disaster. I have no idea how to solve this issue, I can't see a future where artificially generated slop doesn't eventually overwhelm every part of the internet and make it unusable. The UGC era of the internet is probably over.

  • Oh, there are solutions. One is a kind of socialized trust system. I know that the Lyn Alden I follow on Nostr is actually her not only because she says so, but also because a bunch of other people follow her too. There are bot accounts that impersonate her, but it's easy to block those, as it's pretty obvious from the follower count. And once you know the public key that Lyn posts under, you can be sure it's her. (A toy sketch of this follower-vouching idea follows at the end of this comment.)

    She could start posting LLM nonsense, but people would be quick to point it out and start unfollowing. An important part is that there's no algorithm deciding what I see in my feed (unless I choose so), so random LLM stuff can't really get into my feed unless I choose to let it.

    Another option is zero-knowledge identity proofs, which can be used to attest that you're a human without exposing PII or relying on some centralized server being up to "sign you in on your behalf":

    https://zksync.mirror.xyz/kWRhD81C7il4YWGrkDplfhIZcmViisRe3l...
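    A toy sketch of the follower-vouching idea from the first paragraph; all the names, keys, and numbers here are made up for illustration:

    ```python
    # Score how strongly the accounts I already trust vouch for a given key.
    follows: dict[str, set[str]] = {
        "me": {"alice", "bob", "carol"},
        "alice": {"lyn_pubkey"},
        "bob": {"lyn_pubkey"},
        "carol": {"imposter_pubkey"},
    }

    def trust_score(viewer: str, target: str) -> float:
        """Fraction of the viewer's follows that also follow the target key."""
        my_follows = follows.get(viewer, set())
        if not my_follows:
            return 0.0
        vouching = sum(1 for f in my_follows if target in follows.get(f, set()))
        return vouching / len(my_follows)

    print(trust_score("me", "lyn_pubkey"))       # ~0.67: widely vouched for
    print(trust_score("me", "imposter_pubkey"))  # ~0.33: weakly vouched for
    ```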

  • Well, the end of open, public UGC content anyway.

    I have heard of Discord servers where admins won't assign you roles giving you access to all channels unless you've personally met them, someone in the group can vouch for you, or you have a video chat with them and "verify."

    This is the future. We need something like Discord that also has a webpage-like mechanism built into it (a space for a whole collection of documents, not just posts) and is accessible via a browser.

    Of course, depending on discovery mechanisms, this means this new "Internet" is no longer an easy escape from a given reality or place, and that was a major driver of its use in the '90s and '00s: curious people wanting to explore new things not available in their local communities. To be honest, the old, reliable Google was probably the major driver of that.

    And it sucks for truly anti-social people who simply don't want to deal with other people for anything, but maybe those types will flourish with AI everywhere.

    If the gated hubs of a possible new group-x-group human Internet maintain open lobbies, maybe the best of both worlds can be had.

    • This strange reliance on Discord as some sort of "escape from Web 3.0" is silly to anyone who knows what Discord is (modern AOL) and how centralized it is. It's just the same corporate walled garden with more echo-chambery isolation.

      1 reply →

  • Invite-only forums, or forums with actual identity checking of some sort. Google and Facebook are in a prime position to provide real online identity services to other websites, which makes Facebook itself developing bots even funnier. Maybe we'll eventually get bank- or government-issued online identity verification.

    • Online identity verification is the obvious solution; the only problem is that we would lose the last bits of privacy we have on the internet. I guess if everyone were forced to post under their real name and identity, we might treat each other with better etiquette, but...

      8 replies →

    • > with actual identity checking of some sort

      I am hoping OpenID4VCI [0] will fill this role. It looks to be flexible enough to preserve public privacy on forums while still verifying you are the holder of a credential issued to a person. The credential could be issued by an issuer that can verify you are an adult (a bank, for example). Then a site or forum works with a verifier that can check whatever combination of data from one or more presented credentials. I haven't dug into the full details of the implementation and am skimming over a lot, but that appears to be the gist of it.

      [0] https://openid.net/specs/openid-4-verifiable-credential-issu...

  • Ironically, on Facebook itself I am only friends with people I actually know in real life. So, most of the stuff I see in my feed is from them.

    • I'm only friends with people I know on Facebook, so I mostly see ads on that site. There's a feed to just see stuff your friends post, but for some reason the site defaults to this awful garbage ad-spam feed (no surprise, really).

      2 replies →

  • I suspect that the honest outcome will be that platforms where AI content is allowed/encouraged will begin to feel like a video game. If everyone in school is AI-media famous, then no one is. There is most assuredly a market for a game where you are immediately influencer-famous, but it's certainly much smaller than the market for social media.

  • For the tech discussions I'm interested in, burning CPU/GPU cycles on proof of work is a good way to make replies expensive enough that only people who care will post them (see the sketch at the end of this comment).

    Another option is a web of trust.

    It's finally the year of gpg!
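    A minimal hashcash-style sketch of that proof-of-work idea; the difficulty value and stamp format are arbitrary illustrative choices:

    ```python
    import hashlib
    from itertools import count

    DIFFICULTY = 5  # leading hex zeros required; tune so minting costs seconds of CPU

    def mint_stamp(post: str) -> int:
        """Burn CPU until sha256(post:nonce) meets the difficulty target."""
        for nonce in count():
            digest = hashlib.sha256(f"{post}:{nonce}".encode()).hexdigest()
            if digest.startswith("0" * DIFFICULTY):
                return nonce  # expensive to find...

    def verify_stamp(post: str, nonce: int) -> bool:
        """...but nearly free for the forum to check."""
        digest = hashlib.sha256(f"{post}:{nonce}".encode()).hexdigest()
        return digest.startswith("0" * DIFFICULTY)
    ```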

  • If you think about historical parallels like advertising and the industrialisation of entertainment, where the communication is sufficiently human-like to function but deeply insincere and manipulative, I think you'll find that you absolutely can see such a future and how it might turn out.

    A lot of people, maybe most, will adapt and accept these conditions, because compared to the constant threat of misery and the precarity of work, or whatever other path to sustenance and housing, it will be very tolerable. Similar to how so-called talk shows flourished, where fake personas pretend to get to know other fake personas they are already very well acquainted with, and so on, while selling slop, anxieties, or something. Like Oprah, the billionaire.

I don't quite understand the issue of "back-dating" or hijacking accounts. How is this being done exactly? I came away from this article wondering if I was missing something.

  • The last section mentions that the PhysicsForums admins are experimenting with LLM-generated responses, so I think the site owners are responsible.

    > We reached out to Greg Bernhardt asking for comment on LLM usage in PhysicsForums, and he replied:

    > "We have many AI tests in the works to add value to the community. I sent out a 2024 feedback form to members a few weeks ago and many members don’t want AI features. We’ll either work with members to dramatically improve them or end up removing them. We experimented with AI answers from test accounts last year but they were not meeting quality standards. If you find any test accounts we missed, please let me know. My documentation was not the best."

    Why they would recycle old human accounts as AI "test accounts", I have no idea.

  • > How is this being done exactly?

    Presumably it's being done by the site owner, whether that means new management or the original management getting desperate/greedy.

    • Oh that's so disappointing to hear about PhysicsForums. Thanks for the answer to you, and the others who replied.

  • Wondering the same. I couldn't make it through the article. Fascinating discovery, but it's poorly written, and the author's train of thought is difficult to follow. The interstitial quotes were particularly disorienting.

I like the assumption that it was a real account originally.

It all seems so unthinkable, but when running a forum or a blog with an active comment section... what would you do or think if your users showed up, browsed around, and didn't say anything for a week? You start out by making topics under your own name and writing helpful replies... until you look like an idiot talking to yourself.

Forums with good traffic and lots of spammy advertising no doubt consider it when visitors leave because nothing new has happened.

Once upon a time, on a rather stale forum, I created two similarly named accounts from the same IP and argued with myself. At first I thought the owner or one of the other users would notice, but I quickly learned that no behaviour is weird enough to ever raise suspicion.

  • I think I would rather post alone than have my current experience (on two forums already) of the other posters being overwhelmingly spam bots.

    • A blog is more suitable for that. You could mirror the posts on a forum and look sophisticated.

Ooof. The idea--or reality--that humans' accounts would be hijacked by site owners to produce impersonating slop (presumably to bring in ad revenue) is somehow both infuriating and energy-sappingly depressing.

Issues of trust and attribution have always existed for the web, but for many reasons it feels so much worse now--how bad must it get before some kind of sea change can occur?

I'm not sure what the solution would be here.

* Does one need to establish a friggin' trademark for their own name/handle [0], just so they can threaten to sue using money they probably don't have?

* Is it finally time for PKI, where everybody signs their posts with private keys and wastes CPU cycles verifying signatures to check for impersonation? (A sketch of the signing half follows this list.)

* Is there some set of implied collective expectations which need to be captured and formalized into the global patchworks of law?

[0] Ex: By establishing a small but plausible "business" selling advice and opinions under that name, and going after the impersonator for harming that brand.
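A minimal sketch of the sign-your-posts option, assuming the third-party cryptography package and Ed25519 keys; key distribution and revocation are the hard parts this glosses over:

```python
# Each author signs posts with a private key; readers verify against the
# author's published public key. Requires `pip install cryptography`.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()  # generated once, kept secret
public_key = private_key.public_key()       # published alongside the handle

post = b"Here is my actual opinion, not an LLM's."
signature = private_key.sign(post)

# Verification raises cryptography.exceptions.InvalidSignature if the post
# was forged or altered; no exception means the signature checks out.
public_key.verify(signature, post)
```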

  • Impersonating somebody to make it look like they said something they didn't should really be considered defamation or something.

    Also there’s something really uncomfortable about the phrasing of a lot of those answers. I mean, even as somebody with an engineering degree, I try not to ever answer a question “as a <field> engineer” because when screwing around online I haven’t done the correct amount of analysis to provide answers “as an engineer” ethically (acknowledging the irony of using the phrase here, but, clearly this is not a technical statement so I think it is fine). The bot doesn’t seem to have this compunction.

    This ravenprp guy was an engineering student a couple of years ago. I guess it's less of a thing because he wasn't commenting under his real name. But it seems like this site, given the type of content it hosts, could easily end up impersonating somebody "as an engineer" in the field they work in and have a professional reputation in. And the site even has a historical record of them asking and answering questions throughout their education, so it does a really good job of misleading people into thinking an engineer is answering their questions.

    I know the idea of an individual professional reputation has taken a beating in the modern hyper-corporate world. But the more I think of it, the more I think… this seems incredibly shitty and actually borderline dangerous, right?

  • It is sad. I have been putting a copyright notice at the bottom of my resume to prevent some nonsense.

    I have always wondered if people could attach some sort of cryptographic marker to their posts that could link to an archive somewhere. Mostly I was thinking of backups of Yelp posts that couldn't be taken down, but I wonder if it would also work to disown posts someone never made.

      > I have been putting a copyright notice at the bottom of my resume to prevent some nonsense.

      I expect the bad actors will feed it into an LLM and say "Rephrase this slightly", and they will get away with it because the big-money hucksters will have already convinced courts to declare it transformative or fair use.

  • Shouldn't we invent a protocol that keeps the content you produce under your control, so that places like forums or Facebook are only discovery devices and interaction facilitators, not custodians of all communication? Being able to independently reach the source of a piece of information is increasingly important.

  • I exchange public keys with close friends in person. A large-scale solution would be very Orwellian: you would need a national ID in the form of a smart card to connect to an ISP, and possibly biometric verification.

    • We already have e-passports and zero knowledge proofs to show you have one without revealing who you are.

      If all else fails, there is always the web of trust. (I think the web of trust has a lot of issues, but establishing that someone is human seems like a much lower bar than establishing identity.)

      3 replies →

    • Could I buy a physical device like an RSA SecurID from my bank branch or post office and log into a closed VPN-like network where all the servers are run by verified users? I know there are problems with that idea.

  • Don't sign your posts!

    • Are you saying nothing should be key-signed because you want some kind of deniability later?

      Or do you mean people should avoid using a pseudonym in favor of posts that are anonymous, so that there's never any created identity to exploit or defend?

      1 reply →

The Shacknews forum (https://www.shacknews.com/chatty) was similar: go back in time on it and you can find posts about 9/11 unfolding.

  • Ars Technica started with community forums plus this new idea of reporting tech news. The forums are still there, but there's not nearly the camaraderie of the early days.

      > The forums are still there, but there's not nearly the camaraderie of the early days.

      I remember visiting those forums when I was young and feeling like part of a big group of friendly people hanging out online together.

      I tried creating a new account recently and it had a very different vibe. Felt like the old guard had been established and the forums I looked at were dominated by a couple of posters who just wanted to talk, but not discuss anything.

      Some of the post counts of those people were eye-watering.

      2 replies →

    • Same with siliconinvestor.com

      It was an early stock discussion forum. It grew rapidly when search engines started indexing everything and this forum had a URL for each message that was easily indexable.

      It's still around, but nothing like the old days.

      2 replies →

    • It's been a long time since I visited the Ars forums, but the news-article commenters today are absolutely deranged. It makes me not want to engage with the forums again.

      2 replies →

A trajectory observation: people from the US and maybe Europe seem to love having all kinds of niche forums, like PhysicsForums. By contrast, Chinese users seem to love having a centralized place. Case in point: Zhihu (zhihu.com) started as a clone of Quora, yet now it is the largest site for finding deep discussion of practically everything: maths, machine learning, history, physics, engineering, general science, to name a few. Tons of researchers, experts, and hobbyists are there sharing their insights. Meanwhile, the quality of Quora seems to have been deteriorating over the years, and most of the experts are in all kinds of niche forums anyway.

I wonder why there is such a difference.

I'm also noticing there's a dead-economy theory, or at least a dead-job-market theory: AI sending resumes, AI reading and rejecting resumes. Humans want to talk to real humans, but all we get is slop. If we ever develop AGI, I guess we'll have to go back to being completely offline just to talk to real humans.

It is interesting that this forum might be "AI-poisoned" for other AI bots, because training AI on AI-generated content yields garbage.

Dead internet theory is one of those ideas that keeps resurfacing or being revived by articles like this, even though the evidence is limited to confirmation bias. It ignores that there are huge parts of the internet that are not dead. I think it's more that the quality of discourse has fallen, for reasons that are not clear.

  • The article looked at PhysicsForums and found that 92% of the text is AI- or machine-generated...

    • The internet is way bigger than PhysicsForums. That was my point, but your response seems to confirm what I said about discourse declining though.

I suspect a manager I work with has started using LLMs. I sat next to him for long enough before he went into management to know he's incompetent, and now in chat he suddenly spouts plausible but out-of-character (for him) explanations. I work in a company where English is not most people's first language, so I'm not sure if anyone else has picked up on it.