Search tool that only returns content created before ChatGPT's public release

2 months ago (tegabrain.com)

> This is a search tool that will only return content created before ChatGPT's first public release on November 30, 2022.

The problem is that Google's search engine - and, oddly enough, ALL search engines - had already gotten worse before that. I noticed search engines declining several years before 2022. So AI further decreased the quality, but the quality was already trending downwards. There are some attempts to analyse this on youtube (also owned by Google - Google ruins our digital world); some explanations made sense to me, but even then I am not 100% certain why Google decided to ruin google search.

One key observation I made was that the youtube search UI was copied onto Google's regular search, which makes no sense for google search. If I casually search for a video on youtube, I may be semi-interested in unrelated videos. But if I search on Google for specific terms, I am not interested in crap such as "others also searched for xyz" - that just ruins the UI with irrelevant information. This is not the only example: Google made the search results worse here and tries to confuse the user into clicking on things. Plus the placement of ads. The quality really worsened.

  • Are you aware of Kagi (kagi.com)?

    With them, at least the AI stuff can be turned off.

    Membership is presently about 61k, and seems to be growing about 2k per month: https://kagi.com/stats

  • There is also the fact that automatically generated content predates ChatGPT by a lot. By around 2020 most Google searches already returned lots of SEO-optimized pages made from scraped content, or keyword soups made by rudimentary language models or markov chains.

    • Well, there's also the fact that the GPT-3 API was released in June 2020, and its writing capabilities were essentially on par with ChatGPT's initial release. It was just a bit harder to use because it wasn't yet trained to follow instructions - it only worked as a very good "autocomplete" model, so prompting was a bit "different" and you couldn't do stuff like "rewrite this existing article in your own words" at all. But if you just wanted to write some bullshit SEO spam from scratch, it was already as good as ChatGPT would be 2 years later.

      2 replies →

    • It was popular way before 2020, but Google managed to keep up with SEO tricks for a good decade+ before that. Guess it got to a breaking point.

  • > if I search on Google search for specific terms, I am not interested in crap such as "others also searched for xyz" - that is just ruining the UI with irrelevant information

    You assume the aim here is for you to find relevant information, not increase user retention time. (I just love the corporate speak for making people's lives worse in various ways.)

    • You finding relevant information used to be the aim. Enshittification started when they let go of that aim.

  • Honestly the biggest failing is just that SEO spam sites got too good at defeating the algorithm. The amount of bloody listicles, quora nonsense, or backlink-farming websites that come up in search is crazy.

    • For most commercial-related terms, I suspect if you got rid of all “spammy” results you would be left with almost nothing. No independent blogger is gonna write about the best credit card with travel points.

      5 replies →

    • This is bullshit the search engines want you to believe. It's trivial to detect sites that "defeat" the algorithm; you simply detect their incentives (ads/affiliate links) instead.

      Problem is that no mainstream search engine will do it because they happen to also be in the ad business and wouldn't want to reduce their own revenue stream.

  • I've been using DuckDuckGo for the last... decade or so. And it still seems to return fairly relevant documentation towards the top.

    To be fair, most of what I use search for these days is "<<Programming Language | Tool | Library | or whatever>> <<keyword | function | package>>", then navigate to the documentation, double-check the versions align with what I'm writing software in, read... move on.

    Sometimes I also search for "movie showtimes nyc" or for a specific venue or something.

    So maybe my use cases are too specific to screw up, who knows. If not, maybe DDG is worth a try.

  • Significant changes were made to Google and YouTube in 2016 and 2017 in response to the US election. The changes favored editorial and reputation-based filtering over best content matching.

  • > The problem

    That's a separate problem. The search algorithm applied on top of the underlying content is a separate problem from the quality or origin of the underlying content, in aggregate.

  • Counterpoint: The experience of quickly finding succinct accurate responses to queries has never been better.

    Years ago, I would consider a search "failed" if the page with related information wasn't somewhere in the top 10. Now a search is "failed" if the AI answer doesn't give me exactly what I'm looking for directly.

  • > I am not 100% certain why Google decided to ruin google search.

    Ask Prabhakar Raghavan. Bet he knows.

  • The problem is that before Nov 30, 2022 we also had plenty of human-generated slop bearing down on the web. SEO content specifically.

  • the main theory is that with bad results you have to search more and get more engaged with ads, so more revenue for google. It's enshittification

somebody once said we are mining "low-background tokens" the way we mined low-background (radiation) steel post-WW2, and i couldn't shake the concept out of my head

(wrote this up in https://www.latent.space/i/139368545/the-concept-of-low-back... - but ironically, repeating something somebody else said online is kinda what i'm willingly participating in, and it's unclear why human-origin tokens should be that much higher signal than ai-origin ones)

  • Low background steel is no longer necessary.

    "...began to fall in 1963, when the Partial Nuclear Test Ban Treaty was enacted, and by 2008 it had decreased to only 0.005 mSv/yr above natural levels. This has made special low-background steel no longer necessary for most radiation-sensitive uses, as new steel now has a low enough radioactive signature."

    https://en.wikipedia.org/wiki/Low-background_steel

    • Interesting. I guess that analogously, we might find that X years after some future AI content production ban, we could similarly start ignoring the low background token issue?

      9 replies →

  • every human generation built upon the slop of the previous one

    but we appreciated that, we called it "standing on the shoulders of giants"

    • > we called it "standing on the shoulders of giants"

      We do not see nearly so far though.

      Because these days we are standing on the shoulders of giants that have been put into a blender and ground down into a slippery pink paste and levelled out to a statistically typical 7.3mm high layer of goo.

      3 replies →

    • This sounds like an Alan Kay quote. He meant it in regard to useful inventions. AI-generated spam just decreases the quality. We'd need a real alternative to this garbage from Google, but all the other search engines are also bad. And their UI is also horrible - not as bad as Google's, but still bad. Qwant just tries to copy/paste Google, for instance (though interestingly enough, it sometimes has better results than Google - but also fewer in general, even ignoring false-positive results).

      1 reply →

    • We have two optimization mechanisms though which reduce noise with respect to their optimization functions: evolution and science. They are implicitly part of "standing on the shoulders of giants", you pick the giant to stand on (or it is picked for you).

      Whether or not the optimization functions align with human survival, and thus our whole existence is not a slop, we're about to find out.

    • There's a reason this is comedy:

        Listen, lad. I built this kingdom up from nothing. When I started here, all there was was swamp. Other kings said I was daft to build a castle on a swamp, but I built it all the same, just to show 'em. It sank into the swamp. So, I built a second one. That sank into the swamp. So, I built a third one. That burned down, fell over, then sank into the swamp, but the fourth one... stayed up! And that's what you're gonna get, lad: the strongest castle in these islands.
      

      While this is religious:

        [24] “Everyone then who hears these words of mine and does them will be like a wise man who built his house on the rock. [25] And the rain fell, and the floods came, and the winds blew and beat on that house, but it did not fall, because it had been founded on the rock. [26] And everyone who hears these words of mine and does not do them will be like a foolish man who built his house on the sand. [27] And the rain fell, and the floods came, and the winds blew and beat against that house, and it fell, and great was the fall of it.”
      

      Humans build not on each other's slop, but on each other's success.

      Capitalism, freedom of expression, the marketplace of ideas, democracy: at their best these things are ways to bend the wisdom of the crowds (such as it is) to the benefit of all; and their failures are when crowds are not wise.

      The "slop" of capitalism is polluted skies, soil and water, are wage slaves and fast fashion that barely lasts one use, and are the reason why workplace health and safety rules are written in blood. The "slop" of freedom of expression includes dishonest marketing, libel, slander, and propaganda. The "slop" of democracy is populists promising everything to everyone with no way to deliver it all. The "slop" of the marketplace of ideas is every idiot demanding their own un-informed rambling be given the same weight as the considered opinions of experts.

      None of these things contributed to our social, technological, or economic advancement; they are simply things which happened at the same time.

      AI has stuff to contribute, but using it to make an endless feed of mediocrity is not it. As for the flood of low-effort GenAI stuff filling feeds and drowning signal with noise - as others have said: just give us your prompt.

    • Because the pyramids, the theory of general relativity and the Linux kernel are all totally comparable to ChatGPT output. /s

      Why is anybody still surprised that the AI bubble made it that big?

      10 replies →

Somewhat related, the leaderboard of em-dash users on HN before ChatGPT:

https://www.gally.net/miscellaneous/hn-em-dash-user-leaderbo...

  • They should include users who used a double hyphen, too -- not everyone has easy access to em dashes.

  • I have used a dash - like that - for almost 20 years: 100% of the time where I ought to use a semicolon, and about half of the time for commas. It lets me just keep talking about things; the comma is a harder pause. I've recently started seriously writing at a literary level, and I have fallen in love with the em dash - it has a fantastic function within established professional writing, where it is used often - it's why the AI uses it so much.

  • Apparently, it's not only the em-dash that's distinctive. I went through the comments of the leader and spotted that he also uses the typographic apostrophe "’" instead of the straight apostrophe.

Projects like this remind me of a plot point in the Cyberpunk 2077 game universe. The "first internet" got too infected with dangerous AIs, so much so that a massive firewall needed to be built, and a "new" internet was built that specifically kept out the harmful AIs.

(Or something like that: it's been a while since I played the game, and I don't remember the specific details of the story.)

It makes me wonder if a new human-only internet will need to be made at some point. It's mostly sci-fi speculation at this point, and you'd really need to hash out the details, but I am thinking of something like a meatspace-first network that continually verifies your humanity in order for you to retain access. That doesn't solve the copy-paste problem, or a thousand other ones, but I'm just thinking out loud here.

  • The problem really is that it is impossible to verify that the content someone uploads came from their mind and not a computer program. And at some point, probably all content will be at least influenced by AI. The real issue is also not that I used chatgpt to look up a synonym or asked a question before writing an article; the problem is when I copy-paste the content and claim I wrote it.

    • The solution is to not be able to upload content at all. Extremely dumb services, basic trusted information sharing. Just like a newspaper.

    • > The problem really is that it is impossible to verify that the content someone uploads came from their mind and not a computer program.

      Er...digital id.

      1 reply →

    • There doesn't need to be any difference in treatment between AI slop and human slop. The point isn't to keep AI out - it's to keep spam and slop out. It doesn't matter whether it's produced by a being made of carbon or silicon.

      If someone can consistently produce high-quality content with AI assistance, so be it. Let them. Most don't, though.

      2 replies →

    • > the problem is when I copy paste the content and claim I wrote it

      Why is this the problem and not the reverse - using AI without adding anything original into the soup? I could paraphrase an AI response in my own words and it would be no better. But even if I used AI, if it writes my ideas, then it is not AI slop.

    • > And at some point probably all content is at least influenced by AI.

      [citation needed]

      (I see absolutely no reason why that should be the case)

      2 replies →

  • I share an opinion with Nick Bostrom: once a civilization-disrupting idea (like LLMs) is pulled out of the bag, there is no putting it back. People in isolation will recreate it simply because it's now possible. All we can do is adapt.

    That being said, the idea of a new, freer internet is already a reality... Mastodon is a great example. I think private havens like discord/matrix/telegram are an important step on the way.

  • Arguably this is already happening, with much human-to-human interaction moving to private groups on Signal, WhatsApp, Telegram, etc.

  • There were also similar plot points mentioned in Peter Watts' Starfish trilogy, and Neal Stephenson's Anathem.

  • > a new human-only internet

    Only if those humans don't take their leads from AI. If they read AI and write, not much benefit.

besides training future models, is this really such a big deal? most of the AI-gened text content is just replacing content-farm SEO spam anyway. the same stuff that any half-aware person wouldn't have read in the past is now slightly better written, using more em dashes and instances of the word "delve". if you're consistently being caught out by this stuff then you likely need to improve your search hygiene; nothing so drastic as this

the only place I've ever had any issue with AI content is r/chess, where people love to ask ChatGPT a question and then post the answer as if they wrote it, half the time seemingly innocently, which, call me racist, but I suspect is mostly due to the influence of the large and young Indian contingent. otherwise I really don't understand where the issue lies. follow the exact same rules you do for avoiding SEO spam and you will be fine

  • In the past, I'd find one wrong answer and I could easily spot the copies. Now there's a dozen different sites with the same wrong answer, just with better formatting and nicer text.

    • The trick is to only search for topics where there are no answers, or only one answer leading to that blog post you wrote 10 years ago and forgot about.

  • A colleague sent me a confident ChatGPT formatted bug report.

    It misidentified what the actual bug was.

    But the tone was so confident, and he replied to my later messages using ChatGPT itself, which insisted I was wrong.

    I don't like this future.

    • I have dozens of these over the years - many of the people responsible have "Head of ..." or "Chief ..." job titles now.

    • It's not the future. Tell him not to do that. If it happens again, bring it to the attention of his manager. Because that's not what he's being paid for. If he continues to do it, that's grounds for firing.

      What you're describing is not the future. It's a fireable offense.

  • > the only place I've ever had any issue with AI content is r/chess, where people love to ask ChatGPT a question and then post the answer as if they wrote it, half the time seemingly innocently

    Some of the science, energy, and technology subreddits receive a lot of ChatGPT repost comments. There are a lot of people who think they’ve made a scientific or philosophical breakthrough with ChatGPT and need to share it with the world.

    Even the /r/localllama subreddit gets constant AI spam from people who think they’ve vibecoded some new AI breakthrough. There have been some recent incidents where someone posted something convincing and then others wasted a lot of time until realizing the code didn’t accomplish what the post claimed it did.

    Even on HN some of the “Show HN” posts are AI garbage from people trying to build portfolios. I wasted too much time trying to understand one of them until I realized they had (unknowingly?) duplicated some commits from the upstream project and then let the LLM vibe code a README that sounded like an amazing breakthrough. It was actually good work, but it wasn’t theirs. It was just some vibecoding tool eventually arriving at the same code as upstream and then putting the classic LLM-written, emoji-filled bullet points in the README

  • Yes, it is a big deal. I can't find new artists without fearing their art is AI-generated; same for books and music. I also can't post my stuff to the internet anymore because I know it's going to be fed into LLM training data. The internet is mostly dead to me, and thankfully I've lost almost all interest in being on my computer as much as I used to be.

  • > besides for training future models, is this really such a big deal? most of the AI-gened text content is just replacing content-farm SEO-spam anyway.

    Yes, it is, because of the other side of the coin. If you were writing human-generated, curated content, previously you would just do it in your small patch of the Internet, and search engines (Google...) would probably pick it up anyway because it was good-quality content. You just didn't care about SEO-driven shit. Now your nicely hand-written content is going to be fed into LLM training, and it's going to be used - whether you want it or not - in the next generation of AI slop content.

    • It's not slop if it is inspired from good content. Basically you need to add your original spices into the soup to make it not slop, or have the LLM do deep research kind of work to contrast among hundreds of sources.

      Slop did not originate from AI itself, but from feed-ranking algorithms, which set the criteria for visibility. They "prompt" humans to write slop.

      AI slop is just an extension of this process, and it started long before LLMs. Platforms optimizing for their own interest at the expense of both users and creators is the source of slop.

    • this is basically the equivalent of saying that content-farm writers might read your content and bastardise it into seo slop. okay, sure, it's true, but it was always true and AI doesn't change it significantly

  • SEO-spam was often at least somewhat factual and not complete generated garbage. Recipe sites, for example, usually have a button that lets you skip the SEO stuff and get to the actual recipe.

    Also, the AI slop is covering almost every sentence or phrase you can think of to search. Before, if I used more niche search phrases and exact searches, I was pretty much guaranteed to get specific results. Now, I have to wade through pages and pages of nonsense.

  • Yes indeed, it is a problem. Now the good old sites have turned into AI-slop sites because they can't fight the spammers while writing slowly with humans.

    • if a potential defense is to simply join the spammers, then the site was previously just as likely to start hiring content-farm human slop writers as it is now likely to use AI - i.e. the site probably wasn't that great in the first place and had equal potential to deteriorate, AI or no

The other day I was researching with ChatGPT.

* ChatGPT hallucinated an answer

* ChatGPT put it in my memory, so it persisted between conversations

* When asked for a citation, ChatGPT found 2 AI created articles to back itself up

It took a while, but I eventually found human written documentation from the organization that created the technical thingy I was investigating.

This happens A LOT for topics at the edge of what's easily found on the Web - where you have to do true research, evaluate sources, and make good decisions about what you trust.

  • AI reminds me of combing through stackoverflow answers. The first one might work... or it might not. Try again, find a different SO question and answer. Maybe the third time's the charm...

    Except it's all via the chat bot, and it isn't as easy to get it to move off of a broken solution.

For images, https://same.energy is a nice option that, being abandoned but still functioning for a few years now, seems to have naturally avoided crawling any AI images. And it's all around a great product.

Why use this when you can use the before: syntax on most search engines?

  • It doesn't actually do anything anymore in Google or Bing.

    • Searching Google for

      chatgpt

      vs

      chatgpt before:2022-01-01

      gives me quite different results. In the 2nd query, most results have a date listed next to them on the results page, and that date is always prior to 2022. So the date filtering is "working". However, most of the dates are actually Google misinterpreting some unimportant date it found on the page as the date the page was created. At least one result is a Youtube video posted before 2022 that edited its title after ChatGPT was released to say ChatGPT.

      Disclosure: I work at Google, but not on search.

    • I use it frequently to find older websites to browse. It works relatively well for most search terms. If you want something from 2005 or before, I find -inurl:https works well.
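The `before:` operator discussed above is just part of the query text, so a date-restricted search URL can be composed mechanically. A minimal sketch in Python (the helper name `dated_search_url` is illustrative; how Google actually interprets the operator, and which date signal it compares against, is not guaranteed):

```python
from urllib.parse import urlencode


def dated_search_url(query: str, before: str) -> str:
    """Build a Google search URL using the `before:YYYY-MM-DD` operator.

    Sketch only: the operator is simply appended to the query string;
    whether results truly predate the date depends on Google's own
    (often unreliable) date signals.
    """
    return "https://www.google.com/search?" + urlencode(
        {"q": f"{query} before:{before}"}
    )


# Example: dated_search_url("chatgpt", "2022-01-01")
# → "https://www.google.com/search?q=chatgpt+before%3A2022-01-01"
```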

Most college courses and school books haven't changed in decades. Some reputable colleges keep courses on Pascal and Fortran instead of Python or Java just because switching might affect their reputation for being classical or pure - or to match the style of their campus buildings.

google results were already 90% SEO crap long before ChatGPT

just use Kagi and block all SEO sites...

  • How do we (or Kagi) know which ones are "SEO sites"? Is there some filter list or other method to determine that?

    • If you took Google of 2006, and used that iteration of the pagerank algorithm, you’d probably not get most of the SEO spam that’s so prevalent in Google results today.

    • It seems like a mixture of heuristics, explicit filtering and user reports.

      https://help.kagi.com/kagi/features/slopstop.html

      That's specifically for AI generated content, but there are other indicators like how many affiliate links are on the page and how many other users have downvoted the site in their results. The other aspect is network effect, in that everyone tunes their sites to rank highly on Google. That's presumably less effective on other indices?

How about a search engine that only returns what you searched for, and not a million other unrelated things that it hopes you might like to buy?

This goes for you, too, website search.

Does this filter out traditional SEO blogfarms?

  • Yeah, might prefer AI-slop to marketing-slop.

    • They are the same. I was looking for something and tried AI. It gave me a list of stuff. When I asked for its sources, it linked me to some SEO/Amazon affiliate slop.

      All AI is doing is making it harder to know what is good information and what is slop, because it obscures the source, or people ignore the source links.

      1 reply →

I don't know how this works under the hood but it seems like no matter how it works, it could be gamed quite easily.

  • True, but there's probably many ways to do this and unless AI content starts falsifying tons of its metadata (which I'm sure would have other consequences), there's definitely a way.

    Plus, other sites that link to the content could also give away its date of creation, which is out of the control of the AI content.

    • I have heard of a forum (I believe it was Physics Forums), very popular in the older days of the internet, where some of the older posts were edited so that they were completely rewritten with new content. It felt shady and unethical. If I remember correctly, the website went under new ownership, and the new owners felt it was okay to take over the accounts of people who hadn't logged on in several years and completely rewrite the content of their posts.

      I believe I learned about it through HN, and it was this blog post: https://hallofdreams.org/posts/physicsforums/

      It kind of reminds me of why some people really covet older accounts when they are trying to do a social engineering attack.

      1 reply →

  • If it's just using Google search "before <x date>" filtering I don't think there's a way to game it... but I guess that depends on whether Google uses the date that it indexed a page versus the date that a page itself declares.

    • The date displayed in Google Search results is often the self-described date from the document itself. Take a look at this "FOIA + before Jan 1, 1990" search: https://www.google.com/search?q=foia&tbs=cdr:1,cd_max:1/1/19...

      None of these documents was actually published on the web by then, including a Watergate PDF bearing a date of Nov 21, 1974 - almost 20 years before the PDF format was released. Of course, the WWW itself only started in 1991.

      Google Search's date filter is useful for finding documents about historical topics, but unreliable for proving when information actually became publicly available online.

      3 replies →

  • "Gamed quite easily" seems like a stretch, given that the target is definitionally not moving. The search engine is fundamentally searching an immutable dataset that "just" needs to be cleaned.

    • How? Do they have an index from a previous date, with nothing new allowed in since? A whole copy of the internet? I don't think so... I'm guessing, like others, that it's based on the date the user/website/blog lists in the post. Which they can change at any time.

      1 reply →

Just the other evening, as my family argued about whether some fact was or was not fake, I detached from the conversation and began fantasizing about whether it was still possible to buy a paper encyclopedia.

I didn’t know “eccentric engineering” was even a term before reading this. It’s fascinating how much creativity went into solving problems before large models existed. There’s something refreshing about seeing humans brute force the weird edges of a system instead of outsourcing everything to an LLM.

It also makes me wonder how future kids will see this era. Maybe it will look the same way early mechanical computers look to us. A short period where people had to be unusually inquisitive just to make things work.

  • Maybe like how I view my dad and the punchcard era: cool and endearing that he went through that, but thankful that I don’t have to.

I'm grateful that I published a large body of content pre-ChatGPT so that I have proof that I'm not completely inarticulate without AI.

If I want dead information I'll go find a newspaper. This is kind of silly. Even if AI rewrites the entire internet - we aren't going to live in a time capsule.

Plus, the AI already read everything made before 2023, so what does it matter?

Creatives need to think a bit bigger with this particular issue.

ChatGPT also returns content only created before ChatGPT release, which is why I still have to google damn it!

  • Is that still the case? And even if so, how are they going to keep it like that in the future? Are they going to stop scraping new content, or are they going to filter it with a tool that recognizes their own content?

    • it's a known problem in ML. I think grok solved it partially, and chatGPT uses another model on top to search the web, as suggested below. Hence the MLOps field appeared, to solve model management.

      I find it a bit annoying to navigate between hallucinations and outdated content. Too much invalid information to filter out.

I noticed AI-generated slop taking over google search results well before ChatGPT. So I don't agree with this site's premise that "you can be sure that it was written or produced by the human hand."

Not affiliated, but I've been using kagi's date range filter to similar effect. The difference in results for car maintenance subjects is astounding (and slightly infuriating).

I hope there's an uncensored version of the Internet Archive somewhere, I wish I could look at my website ca. 2001, but I think it got removed because of some fraudulent DMCA claim somewhere in the early 2010s.

> This is a search tool that will only return content created before ChatGPT's first public release on November 30, 2022.

How does it do that? At least Google seems to take website creation date metadata at face value.

The slop is getting worse: there is so much LLM-generated shit online that new models are now getting trained on the slop. Slop training slop, and so on. We have gone full circle in just a matter of a few years.

  • I was replaying Cyberpunk 2077 and trying to think of all the ways one might have dialed the dystopia up to 11 (beyond what the game does), and pervasive AI slop was never on my radar. It kinda reminds me of the foreword in Neuromancer pointing out that the book was written before cellphones became popular. It's already fucking with my mind. I recently watched Frankenstein (2025) and was 100% sure gen AI had a role in the CGI, only to find out the director hates it so much he'd rather die than use it. I've been noticing little things in old movies and anime where I thought to myself: if I didn't know this was made before gen AI, I would have sworn this was generated. One example (https://www.youtube.com/watch?v=pGSNhVQFbOc&t=412): the cityscape background in this outro scene, with buildings built on top of buildings, gave me AI vibes (really the only thing in the whole anime), yet it came out ~1990. So I can already recognize a paranoia / bias in myself and really can't reliably tell what's real. Probably other people have this too, which is why some non-zero number of people always think every new blog post was written by gen AI.

    • I had the same experience watching a nature documentary on a streaming service recently. It was... not so good, at least at the beginning. I was wondering if it was a pilot for AI-generated content on this streaming service.

      Actually, it came out in 2015 and was just low budget.

This is an imperfect search extension.

It's a hell of a lot better than nothing, if one is using chrome or Firefox (neither of which are my primary browsers).

For a while I've been saying it's a pity we weren't regularly trusted-timestamping everything before that point as a matter of course.
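The trusted-timestamping idea boils down to binding a content digest to a point in time. A minimal sketch of the digest half (assuming Python; in a real scheme such as RFC 3161 or OpenTimestamps a third party signs or anchors the digest - a self-recorded time, as here, proves nothing on its own):

```python
import hashlib
import time


def timestamp_record(content: bytes) -> dict:
    """Digest some content and note when we saw it.

    Illustrative only: the `recorded_at` field would need to be attested
    by an external timestamping authority (or anchored in a public ledger)
    for anyone else to trust that the content existed at that time.
    """
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "recorded_at": int(time.time()),
    }
```

Had such digests been routinely anchored pre-2022, "this text predates ChatGPT" would be a verifiable claim rather than a heuristic.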

What kind of heuristics does it use to determine age? A lot of content on Google is actually backdated for some reason... presumably some sort of SEO scam?

For that reason I'm not updating my Ruby book on LeanPub. I just know that one day people are going to read it more, because human-written content will be gold.

I really thought this was going to be the Dewey Decimal system. Exclude sources from this century. It’s the only way to be sure.

Of course my first thought was: Let's use this as a tool for AI searches (when I don't need recent news).

Can't we just append "before:2021-01-01" to Google?
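Mostly, yes. A minimal sketch (Python, purely illustrative) of building such a pre-cutoff query URL with Google's `before:` operator; note that the operator filters on Google's estimated publish date, which backdated SEO pages can and do game:

```python
from urllib.parse import urlencode

def pre_ai_search_url(query: str, cutoff: str = "2022-11-30") -> str:
    """Build a Google search URL restricted to pages that Google
    dates before the cutoff, using the `before:` search operator."""
    return "https://www.google.com/search?" + urlencode(
        {"q": f"{query} before:{cutoff}"}
    )

print(pre_ai_search_url("ruby metaprogramming tutorial"))
```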

I use this to find old news articles for instance.

Something being generated by humans does not mean it is high quality.

In hindsight, that would've been a real utility use case for NFTs: decentralized cryptographic proof that some content existed in a particular form at a particular moment.
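The core guarantee doesn't actually require a blockchain: a plain content hash plus any publication channel with a verifiable date already gives proof of existence. A minimal sketch (illustrative, standard library only):

```python
import hashlib

def content_fingerprint(content: bytes) -> str:
    """SHA-256 digest of the content. Publishing just this digest
    somewhere with a verifiable date (a blockchain, a newspaper ad,
    an RFC 3161 timestamping authority) proves the content existed
    in exactly this form by that date, without revealing it."""
    return hashlib.sha256(content).hexdigest()

post = b"My hand-written blog post, November 2022."
digest = content_fingerprint(post)

# Later, anyone holding the original bytes can verify the claim:
assert content_fingerprint(post) == digest
# A single changed byte yields a completely different digest:
assert content_fingerprint(post + b"!") != digest
```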

This tool has no future. We have that in common with it, I fear.

What we really need to do is build an AI tool to filter out the AI automatically. Anybody want to help me found this company?

Interesting concept. As a side benefit this would allow you to make steady progress fighting SEO slop as well, since there can be no arms race if you are ignoring new content.

You could even add options for later cutoffs… for example, you could use today’s AIs to detect yesterday’s AI slop.

"This browser extension uses the Google search API to only return content published before Nov 30th, 2022 so you can be sure that it was written or produced by the human hand."

I mean I get it, but it seems a bit silly. What's next - an image search engine that only returns images created before photoshop?

We now need an extension to hide 3 years of the internet because it was written by robots. This timeline is undefeated.

[flagged]

  • Is it really here to stay? If the wheels fell off the investment train and ChatGPT etc. disappeared tomorrow, how many people would be running inference locally? I suspect most people either wouldn't meet the hardware requirements or would be too frustrated with the slow token generation to bother. My mom certainly wouldn't be talking to it anymore.

    Remember that a year or two ago, people were saying something similar about NFTs: that they were the future of sharing content online and we should all get used to it. Now, they might still exist, it's true, but they're much less pervasive and annoying than they once were.

    • Maybe you don't love your mom enough to do this, but if ChatGPT disappeared tomorrow and it was something she really used and loved, I wouldn't think twice before buying her a rig powerful enough to run a quantized downloadable model, though I'm not current on which model or software would be best for her purposes. I get that your relationship with your mother, or your financial situation, might be different, though.

      6 replies →

  • I don't agree it is 'almost worse' than the slop, but it sure can be annoying. On one hand it seems somewhat positive that some people have developed a more critical attitude and question things they see; on the other hand, they're not critical enough to realize their own criticism might be invalid. Plus I feel bad for all the resources (both human and machine) wasted on this: perfectly normal things get posted, and people who know nothing about the subject chime in to claim it must be AI because they see something they don't fully understand.

    • My main exposure to this was just in a couple of online social communities.

      1. AI happens.
      2. Every response (often a meme in itself) complains about the AI. Hell, some of them were clever in the way a brand-new meme template was in 2015.
      3. Memeing about the AI escalates to the point where a few borderline death threats start sneaking in.
      4. Someone posts thoughtful original content, and the whole place degrades into a "thank god it's not AI" meme.

      Or: let's fragment our already tiny community into a "NO AI SLOP" splinter group.

      I’ve seen this exact thing happen in three very niche communities.

  • "You know what's almost worse than something bad? People complaining about something bad."

    • Shrug. Sure.

      Point still stands. It’s not going anywhere. And the literal hate and pure vitriol I’ve seen towards people on social media, even when they say “oh yeah; this is AI”, is unbelievable.

      So many online groups have just become toxic shitholes because someone posts something AI-generated once or twice a week.

      11 replies →