Comment by cormorant
4 days ago
I'm fed up too. Spammy, AI-looking sites are showing up more and more. For some reason, many of them use the same WordPress theme with a light gray table of contents. They look like this: https://imgur.com/a/totally-not-ai-generated-efsumgZ
The problem seems worse on "alternative" search engines, e.g. DuckDuckGo and Kagi, which both use Bing. It's been driving me back to Google.
A blocklist seems like a losing proposition, unless, like adblock filter lists, it balloons to tens of thousands of entries and gets updated constantly.
Unfortunately, this kind of blocklist is highly subjective. This list blocks MSN.com! That's hardly what I would have chosen.
Even Google is plagued by spam. I've tried all sorts of search techniques and alternative engines, but I feel like the only solution is to do things manually. I had already started blocking things by myself, but I thought it'd be more productive to make the list public and try crowdsourcing. Even now, searching "how to partition a hard disk" often leads you to low-effort sites telling you to use their software.
> Unfortunately, this kind of blocklist is highly subjective. This list blocks MSN.com! That's hardly what I would have chosen.
It's definitely a bit opinionated, but it's open to discussion: you can create an unblock request issue (if you care enough to do so, of course!). The reason I blocked MSN is that it just re-hosts articles from other websites, so I'd rather see the original source than be steered to Microsoft's site, which is annoying in its own right; for example, it opens another article if you scroll down too fast.
I recently learned a little trick for Google: adding `-ai` at the end of the query helps. Not much, but it's something.
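If you want to automate that, here is a minimal sketch of the trick. The function name `google_url` and the `exclude` parameter are made up for illustration; the only thing taken from the comment above is appending `-ai` to the query.

```python
from urllib.parse import urlencode

def google_url(query, exclude=("ai",)):
    """Build a Google search URL with each excluded term appended as -term."""
    # "-term" is the search operator for excluding a word from results.
    minus_terms = " ".join(f"-{term}" for term in exclude)
    full_query = f"{query} {minus_terms}".strip()
    # urlencode escapes the query; spaces become "+".
    return "https://www.google.com/search?" + urlencode({"q": full_query})

print(google_url("how to partition a hard disk"))
```

You could bind something like this to a browser keyword search so the exclusion is always applied.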
AFAIK DDG is just Bing, whereas Kagi uses Google and Bing (and Yandex?), among others: https://help.kagi.com/kagi/search-details/search-sources.htm...
As a Kagi user, I actually haven’t encountered much search result spam. I’m surprised you’re seeing enough there to drive you back to Google!
> Unfortunately, this kind of blocklist is highly subjective. This list blocks MSN.com! That's hardly what I would have chosen.
I'm wondering how much the blacklist can be broken down into categories of spam. SponsorBlock for YouTube has a lot of options for the types of segments it skips, and the user can choose how each is handled (skipped automatically, prompted to skip, or simply highlighted in the scrub bar) at the category level.
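Roughly what I have in mind, as a sketch only: the category names, the domain-to-category mapping, and the `Action` values below are all hypothetical, not how this list or SponsorBlock is actually structured.

```python
from enum import Enum
from urllib.parse import urlparse

class Action(Enum):
    HIDE = "hide"            # drop the result entirely
    HIGHLIGHT = "highlight"  # keep it, but mark it visually
    SHOW = "show"            # leave it alone

# Hypothetical mapping from blocked domain to a spam category.
CATEGORIES = {
    "msn.com": "rehosted",
    "pinterest.com": "aggregator",
}

# Per-user choice of how each category is handled.
USER_PREFS = {
    "rehosted": Action.HIGHLIGHT,
    "aggregator": Action.HIDE,
}

def action_for(url, categories=CATEGORIES, prefs=USER_PREFS):
    """Return the action for a result URL, matching the host or any parent domain."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    for i in range(len(parts)):
        category = categories.get(".".join(parts[i:]))
        if category is not None:
            # Categories the user has not configured default to SHOW.
            return prefs.get(category, Action.SHOW)
    return Action.SHOW

print(action_for("https://www.msn.com/en-us/news/some-article"))
```

That way someone who disagrees with blocking MSN could set the whole "rehosted" category to `SHOW` instead of filing an unblock request.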
I get tons of these when looking up recipes and cooking-related information: pages that say "X can be refrigerated for up to two weeks" and then, in the next paragraph, "X is fine to refrigerate and eat for 2-3 days" or similar.
I'd block them, but there seems to be an endless supply. They're probably bulk-buying 10+ character domains made of random words, names, and phrases.
I was just thinking... Depending on the type of article, one can describe pretty well what makes a good one. Recipes should be short texts that may link to a gallery, a video, and a longer text about the dish. They should have a section called ingredients and one for preparation, and may have an author and a date. Research articles should cite their sources thoroughly.
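A very rough sketch of the recipe check, just to make the idea concrete. The `max_words` threshold and the function name are my own assumptions, and a real checker would need to parse the page structure rather than plain text.

```python
import re

def recipe_score(text, max_words=800):
    """Score plain recipe text against a few structural expectations (0 to 3)."""
    lowered = text.lower()
    checks = {
        # A line that starts with the word "ingredients".
        "has_ingredients": re.search(r"^\s*ingredients\b", lowered, re.M) is not None,
        # A line that starts with the word "preparation".
        "has_preparation": re.search(r"^\s*preparation\b", lowered, re.M) is not None,
        # "Short" is arbitrary here; max_words is an assumed cutoff.
        "is_short": len(text.split()) <= max_words,
    }
    return sum(checks.values()), checks

good = "Pancakes\nIngredients\n- flour\n- milk\nPreparation\nMix and fry.\n"
score, details = recipe_score(good)
print(score, details)
```

It wouldn't catch the contradictory-refrigeration-advice kind of nonsense, but it would at least penalize pages that bury the recipe under filler.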
You can use uBlacklist without a list and just block shit sites as you see them.
I'm loving being able to search for something without getting results from garbage sites like howtogeek, stackoverflow, MSN, Pinterest, etc.
Since when is Stack Overflow a garbage site?