Comment by simonw
7 days ago
I built you this: https://tools.simonwillison.net/hacker-news-filtered
It shows you the Hacker News page with ai and llm stories filtered out.
You can change the exclusion terms and save your changes in localStorage.
o3 knocked it out for me in a couple of minutes: https://chatgpt.com/share/68766f42-1ec8-8006-8187-406ef452e0...
Initial prompt was:
Build a web tool that displays the Hacker
News homepage (fetched from the Algolia API)
but filters out specific search terms,
default to "llm, ai" in a box at the top but
the user can change that list, it is stored
in localstorage. Don't use React.
Then four follow-ups:
Rename to "Hacker News, filtered" and add a
clear label that shows that the terms will
be excluded
Turn the username into a link to
https://news.ycombinator.com/user?id=xxx -
include the comment count, which is in the
num_comments key
The text "392 comments" should be the link,
do not have a separate thread link
Add a tooltip to "1 day ago" that shows the
full value from created_at
That exclusion filter seems to be just a very dumb substring test? Try filtering out "a" and almost everything disappears. That means filtering out "ai" filters out "I used my brain not a computer".
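To make the substring problem concrete, here is a minimal sketch in Python (the tool itself runs in the browser, so this is not its actual code). The function names `substring_excluded` and `whole_word_excluded` are hypothetical, and the whole-word variant is just one possible alternative, not something the tool is known to do:

```python
import re

def substring_excluded(title: str, terms: list[str]) -> bool:
    """Naive test: hide the title if any term appears anywhere inside it."""
    lowered = title.lower()
    return any(term.lower() in lowered for term in terms if term)

def whole_word_excluded(title: str, terms: list[str]) -> bool:
    """Stricter test: hide the title only if a term appears as a whole word."""
    return any(
        re.search(rf"\b{re.escape(term)}\b", title, flags=re.IGNORECASE)
        for term in terms
        if term
    )

title = "I used my brain not a computer"
print(substring_excluded(title, ["ai"]))   # True: "brain" contains "ai"
print(whole_word_excluded(title, ["ai"]))  # False: no standalone word "ai"
```

Whole-word matching has its own trade-offs: with the term "agent" it would no longer catch "agentic" or "Agents", but it would still catch "user-agent", because the hyphen counts as a word boundary.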
I updated it to fetch 200 stories instead of 30, so even after filtering you still get hopefully 140+ things to read.
https://github.com/simonw/tools/commit/ccde4586a1d95ce9f5615...
I’ve built a site that does the same sort of exclusion filtering, with a lot more bells and whistles. Very much in the spirit of “what if HN stayed HN, but had actual, very useful features.”
Here’s an “anti-ai” timeline filter:
https://hcker.news/?filter=top30&exclude=llm%2C+vibe%2C+open...
I’m not using the Algolia API, I ingest the hn fire hose on my own server so the filtering is very fast.
Top story: Kiro: new agentic IDE
Just add “agent” to the search box. It’s saved in local storage.
I just added "agent" to the default exclusion list.
"Onedrive is slow on Linux but fast with a “Windows” user-agent"
"Agents raid home of fired Florida data scientist who built Covid-19 dashboard"
"Confessions of an ex-TSA agent"
"Terrible real estate agent photographs"
etc etc
Still seeing `Kiro: A new agentic IDE` BTW.
An interesting example of both LLMs' strengths and weaknesses. It is strong because you wrote a useful tool in a few minutes. It is weak because this tool is strongly coupled to the problem: filtering HN. It's an example of the more general problem of people wanting to control what they see. This has existed at least since the classic usenet "killfiles", but is an area that, I believe, has been ripe for a comprehensive local solution for some time.
OTOH, narrow solutions validate the broader solution, especially if there are a lot of them. Although in that case you invite a ton of "momentum" issues with ingrained user bases (and heated advocacy), hopelessly incompatible data models and/or UX models, and so on. It's an interesting world (in the Chinese curse sense) where such tools can be trivially created. It's not clear to me that fitness selection will work to clean up the landscape once it's made.
Not sure what a local solution would look like when what you see is on websites; maybe a browser extension? We just made a similar reskin as a website, and it works great, but it is ultimately another site you have to go to. It's another narrow solution with some variation (we use AI to do the ranking rather than keyword filtering), but I'm interested in the form factors that might give maximal control to a user.
It is strong because you believed it created something of value. Did it work? Maybe. But regardless of whether it worked, you still believed in the value, and that is the "power" of AIs right now: humans believe that they create value.
Probably would work better as a userscript, so you don't have to rely on a random personal website never going down just to use HN. I don't have a ChatGPT account, but I am curious whether it could do that automatically too.
Interesting idea, we could consider that as an alternative implementation to https://www.hackernews.coffee/. While we are planning on making it open-source, a userscript would be even more robust as a solution, although would need a personal API key to one of the services.
This is interesting, but I found it amusing that you used:
"I built..." and "o3 knocked it out in a couple minutes...", not ironically, talking about a tool to keep us from having to be inundated with AI/LLM stuff.
There's a special kind of irony to use AI to help out the people who hate AI.
It's not hypocrisy or anything negative like that, but I do find it amusing for some reason.
> to help out the people who hate AI.
Was it? I feel like it was clearly meant to be smug and inflammatory rather than useful in any meaningful way.
I was going for smug, inflammatory and useful at the same time.
There is an even more special kind of irony to see it failing as the top ranked story now is "Kiro: A new agentic IDE"
Already fixed https://github.com/simonw/tools/commit/f95b306be7b584f388256...
I mean, many people who "hate AI" don't think that LLMs are useless for everything. I'm very unconvinced by e.g. using LLMs for coding, but that they'd be good at tagging content, sentiment analysis, etc.? That's not really hard to believe.
[dead]
This is neat, but with the given filters you autoselected (just the phrases "llm" and "ai"), of the 14 stories I see when I visit the page, 4 of them (more than 25%!) are still stories about AI. (At least one of them can't be identified by this kind of filtering because it doesn't use any AI-related words in its headline, arguably maybe two.)
People have said it elsewhere, but I think you might have to fight fire with fire if you want semantic filtering.
> of the 14 stories I see when I visit the page, 4 of them (more than 25%!)
Llm maths? ;)
25% of 14 is 3.5. 4 is more than 3.5. Ask grock if you still don't get it.
I also built https://lessnews.dev (HN filtered by webdev links)
One decision I had to make was whether the site should update in real time or be curated only. Eventually, I chose the latter because my personal goal is not to read every new link, but to read a few and understand them well.
Almost certain you can use the HN Algolia to do the same thing by excluding terms
I had to switch away from Algolia - the problem is they only model "show items on the homepage" using a tag that's automatically applied to exactly 30 items, which means any filtering knocks that down to e.g. 15.
I switched to using the older firebase API which can return up to 500 item IDs from the current (paginated) homepage - then I fetch and filter details of the first 200.
https://github.com/simonw/tools/commit/ccde4586a1d95ce9f5615...
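The approach described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code from the linked commit: the function names and the injectable `fetch` parameter are hypothetical, and the substring test is assumed from the discussion earlier in the thread. The two Firebase endpoints (`topstories.json` and `item/<id>.json`) are the public Hacker News API.

```python
import json
from urllib.request import urlopen

FIREBASE = "https://hacker-news.firebaseio.com/v0"

def fetch_json(url: str):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)

def filtered_front_page(terms, limit=200, fetch=fetch_json):
    """Return the stories among the first `limit` top IDs whose titles
    contain none of `terms` (case-insensitive substring test)."""
    ids = fetch(f"{FIREBASE}/topstories.json")[:limit]
    lowered_terms = [t.lower() for t in terms if t]
    kept = []
    for item_id in ids:
        item = fetch(f"{FIREBASE}/item/{item_id}.json")
        if not item or "title" not in item:
            continue  # deleted or missing items come back as null
        title = item["title"].lower()
        if not any(term in title for term in lowered_terms):
            kept.append(item)
    return kept
```

Calling `filtered_front_page(["llm", "ai"])` would make one request for the ID list plus one per story, so this sequential version is slow; a browser tool would more plausibly issue those requests concurrently.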
I should have looked at your source, I didn't realize you were using the Algolia API
feature request for OP: sort by "LLM Agentic AI" embedding cosine distance desc
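For what that request would mean mechanically, here is a toy sketch in Python. Everything in it is an assumption for illustration: the two-dimensional vectors are made up, real embeddings would come from an embedding model (not shown), and the function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def sort_by_distance_desc(stories, query_vector):
    """Put the stories least similar to the query first."""
    return sorted(
        stories,
        key=lambda s: cosine_distance(s["embedding"], query_vector),
        reverse=True,
    )

# Made-up vectors: the first axis stands in for "AI-ness".
query = [1.0, 0.0]  # stand-in for the embedding of "LLM Agentic AI"
stories = [
    {"title": "Kiro: A new agentic IDE", "embedding": [1.0, 0.1]},
    {"title": "Confessions of an ex-TSA agent", "embedding": [0.0, 1.0]},
]
for story in sort_by_distance_desc(stories, query):
    print(story["title"])
```

Sorting descending by distance pushes the most AI-like titles to the bottom instead of hiding them, which avoids picking a hard cutoff.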
AI solving the too-much-AI complaint is heart-warming. We're at the point where we will start demanding organic and free-range software, not this sweatshop LLM one-shot vibery.
Love it. :D
It only shows 13 stories? And no pagination.
Simon, how do you get so much done? It’s incredible. Would love to see the day in the life TikTok :P
I think there is a fundamental disconnect in this response. What the user is asking for is for a procedural and cultural change. What you’ve come up with is a technical solution that kind of mimics a cultural change.
I don’t think it’s wrong, but I also don’t think we can really “AI generate” our way into better communities.
Simonw’s response is the right response. You should not bend the community to your will simply because you do not like the majority of the posts. Obviously many people do like those posts, as evidenced by them making the front page. Instead, find ways to avoid the topics you do not desire to read without forcing your will on people who are happy with the current state.
Let me stop folks early, don’t make comparisons to politics or any bullshit like that. We’re talking only about hacker news here.
You can even make it live with SSE/EventSource.
[flagged]
Please don't cross into personal attack.
In a thread devoted to filtering out AI, I gave them a way of filtering out AI.
(The fact that I wrote it using AI doesn't really matter, but I personally found it amusing so I included the prompts.)
> The fact that I wrote it using AI doesn't really matter
Given that it is a poorly implemented solution that doesn't really do what the OP asked, yes it does.
It really isn't incumbent on you to feed the trolls here.
Great example of the power of vibe coding. The first item is literally "Kiro: A new agentic IDE".
There is literally an input box to put terms you want to exclude...
The prompt asks for "filters out specific search terms", not "intelligently filter out any AI-related keywords." So yes, a good example of the power of vibe coding: the LLM built a tool according to the prompt.
> The prompt asks for "filters out specific search terms"
So if I want a front page free of LLM "agents" but also want to view stories about secret agents it will do that, right?
The prompt was to exclude llm and ai by default though
So I have to stay up to date on AI stories just to know what buzzwords I should filter so I don't see AI stories?
[flagged]
I like this because things can stay permanently filtered. Just not across devices. But that wasn't one of the original requirements.
Also a great example of how software can be perfectly to spec and also completely broken.
llm, ai, cuda, agent, gpt.
Wish it returned more unfiltered items tho.
Isn’t knocking out CUDA going to take out a significant chunk of GPGPU stuff with it? I can see wanting to avoid AI stuff, for sure, but I can’t imagine not wanting to hear anything about the high-bandwidth half of your computer…
[flagged]
Not at all. I think you misunderstood the point I was making here.
I think the idea of splitting Hacker News into AI and not AI is honestly a little absurd.
If you don't want to engage with LLM and AI content on Hacker News, don't engage with it! Scroll right on past. Don't click it.
If you're not going to do that, then since we are hackers here and we solve things by building tools, building a tool that filters out the stuff you don't want to see is trivial. So trivial I could get o3 to solve it for me in literally minutes.
(This is a pointed knock at the "AI isn't even useful" crowd, just in case any of them are still holding out.)
There's a solid meta-joke here too, which is that filtering on keywords is a bad way to solve this problem. The better way to solve this problem... is to use AI!
So yeah, I'm not embarrassed at all. I think my joke and meta joke were pretty great.
It solves the problem in a simpler and faster way than OP requested. OP does not wish to see AI content, this tool solves it. Simple.
Your statement is factually incorrect. Have you no embarrassment?
[flagged]
Please don't cross into personal attack.
[flagged]
Now that's impressive. I've worked with and managed many humans and almost never do I get what I want back in one prompt.
Even ones with detailed specs and the human agreed to them don't come back exactly as written.
tf humans do you work with?
That's at least 5 JIRA tickets.
Also a lot of cursing that I’ve been told to cut down on by HR. (/s, kinda)
I think it's a bifurcation between 0-1 prompts (self-driven) and 1,000 prompts :)
Perhaps you should add a privacy policy or just release the source rather than assume people will trust your site. Why do you do these demos if you aren't upfront about all the things the LLMs didn't do?
I released the source: https://github.com/simonw/tools/blob/main/hacker-news-filter... (Apache 2 licensed) and a commit history listing the prompts I used. https://github.com/simonw/tools/commits/main/hacker-news-fil... - also displayed on the site here: https://tools.simonwillison.net/colophon#hacker-news-filtere...
I don't think I need a privacy policy since the app is designed so that nothing gets logged anywhere - it works by hitting the Algolia API directly from your browser, but the filtering happens locally and is stored in localStorage so nobody on earth has the ability to see what you filtered.
The API it uses is https://hn.algolia.com/api/v1/search?tags=front_page - which is presumably logged somewhere (covered by Algolia's privacy policy) but doesn't serve any cookies.
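As a rough illustration of what the browser does with that endpoint, here is a Python sketch that fetches the front page and derives the pieces the follow-up prompts asked for (user link, comment-count link, `created_at` tooltip). The function names are hypothetical, and using `objectID` as the story ID in the item URL is an assumption about the response shape rather than something stated in this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ALGOLIA_FRONT_PAGE = "https://hn.algolia.com/api/v1/search?tags=front_page"
HN = "https://news.ycombinator.com"

def fetch_front_page(url: str = ALGOLIA_FRONT_PAGE) -> list:
    """Fetch the front-page hits; nothing here sends cookies or credentials."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)["hits"]

def describe(hit: dict) -> dict:
    """Build the display fields for one hit from the Algolia response."""
    return {
        "title": hit.get("title", ""),
        "user_url": f"{HN}/user?" + urlencode({"id": hit["author"]}),
        "comments_label": f"{hit.get('num_comments') or 0} comments",
        # Assumption: objectID is the Hacker News item ID.
        "comments_url": f"{HN}/item?" + urlencode({"id": hit["objectID"]}),
        "tooltip": hit.get("created_at", ""),
    }
```

All filtering would then happen on the returned list in the client, which matches the point above that the exclusion terms never leave the browser.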
> Why do you do these demos if you aren't upfront about all the things the LLMs didn't do?
What do you mean by that?
You should try to get other people to make your demos is all I'm saying. I don't know why you keep inserting yourself either. Why didn't someone else post the thing you made? Were they waiting for you to do it or do you think people aren't smart enough to do it? I'm just trying to understand why every damned LLM story has to feature you. In what ways could you avoid such a filter of your posts?
The site does not request any personal information, therefore no privacy policy is required.
It has no server side log? How do I know that if there is no policy?