Show HN: What is HN thinking? Real-time sentiment and concept analysis

5 months ago (ethos.devrupt.io)

Hi HN,

I made Ethos, an open-source tool to visualize the discourse on Hacker News. It extracts entities, tracks sentiment, and groups discussions by concept.

Check it out: https://ethos.devrupt.io

This was a "budget build" experiment. I managed to ship it for under $1 in infra costs. Originally I was using `qwen3-8b` for the LLM and `qwen3-embedding-8b` for the embedding, but I ran into some capacity issues with that model and decided to use `llama-3.1-8b-instruct` to stay within a similar budget while having higher throughput.

What LLM or embedding would you have used within the same price range? It would need to be a model that supports structured output.
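For anyone weighing models on the structured-output requirement, here is a minimal sketch of validating a model reply before storing it. The field names (`summary`, `sentiment`, `entities`) and the three sentiment labels are assumptions made for illustration, not the actual Ethos schema.

```python
import json
from dataclasses import dataclass

# Hypothetical label set; the real project may use different values.
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


@dataclass
class Analysis:
    summary: str
    sentiment: str
    entities: list[str]


def parse_analysis(raw: str) -> Analysis:
    """Parse a model reply and reject anything that breaks the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    summary = data.get("summary")
    sentiment = data.get("sentiment")
    entities = data.get("entities")

    if not isinstance(summary, str) or not summary.strip():
        raise ValueError("summary must be a non-empty string")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")
    if not isinstance(entities, list) or not all(isinstance(e, str) for e in entities):
        raise ValueError("entities must be a list of strings")

    # Lowercase and de-duplicate so "GitHub" and "github" aggregate together.
    cleaned = sorted({e.strip().lower() for e in entities if e.strip()})
    return Analysis(summary.strip(), sentiment, cleaned)
```

A reply that fails validation can be retried or dropped, which matters more with small models that occasionally drift from the requested format.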

How bad do you think it is to pair `llama-3.1` with a higher-dimension embedding from a different family? I originally wanted to keep the LLM and embedding within the same family, but I'm not sure if there is much point in that.

Repo: https://github.com/devrupt-io/ethos

I'm looking for feedback on which metrics (sentiment vs. concepts) you find most interesting! PRs welcome!

26 comments

ddtaylor

kretaceous 5 months ago

This is really cool and something I've envisioned building for a long time!

There is a bug in the entity tracking. For the entity "github", it shows a positive sentiment. HN does NOT like GitHub (for reasons good or bad). If you click on it, it shows you other, seemingly unrelated stories.

https://ethos.devrupt.io/entities/github

ddtaylor 5 months ago

Thank you. I believe this is because it's not properly aggregating the story title, content, and comment hierarchy. There are going to be cases where the LLM does a poor job of understanding the conversation, but I think right now the information isn't being sent to the prompt.
Right now it seems to be only using one level of the parent comment hierarchy.
(Source: https://github.com/devrupt-io/ethos/blob/67670eb2855b84d389d...)
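A rough sketch of what sending the full hierarchy to the prompt could look like: walk the parent links from a comment up to the story and put the chain, root first, into the context. The `Item` type, `build_context`, and the `max_depth` cap are hypothetical names for illustration, not code from the Ethos repo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a stored HN story or comment."""
    id: int
    text: str
    parent_id: Optional[int] = None
    title: Optional[str] = None  # set only on stories


def build_context(item_id: int, items: dict[int, Item], max_depth: int = 10) -> str:
    """Return the story and ancestor comments for item_id, root first."""
    chain = []
    current = items.get(item_id)
    # Follow parent links upward; max_depth keeps the nearest ancestors
    # and bounds the prompt size on very deep threads.
    while current is not None and len(chain) < max_depth:
        chain.append(current)
        if current.parent_id is None:
            break
        current = items.get(current.parent_id)
    chain.reverse()

    sections = []
    for position, node in enumerate(chain):
        if node.title:
            label = f"Story: {node.title}"
        elif position == len(chain) - 1:
            label = "Target comment"
        else:
            label = "Parent comment"
        sections.append(f"{label}\n{node.text}".strip())
    return "\n\n".join(sections)


ITEMS = {
    1: Item(id=1, title="Show HN: Example", text="Story body."),
    2: Item(id=2, parent_id=1, text="Top-level comment."),
    3: Item(id=3, parent_id=2, text="Reply to the comment."),
}
```

Calling `build_context(3, ITEMS)` yields the story title and body, then the top-level comment, then the reply, so the model sees what the reply is actually responding to.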

sdwr 5 months ago

Awesome idea! The entity tracking is very exciting, most interesting part imo

Unfortunately, I think the budget is noticeable in the sentiment analysis. The tags and entity recognition are good, but the sentiment ratings themselves seem pretty sloppy.

ddtaylor 5 months ago

I think it's mostly prompting, but I will be experimenting with this more. The prompt currently is garbage IMO

    You are an expert analyst of the Hacker News community. Analyze submissions for
    the underlying ideas, concepts, technologies, and entities being discussed.

    Write all summaries in third-person analytical prose. Do NOT start sentences
    with "The user", "The commenter", "The author", or "This post". Instead, lead
    with the substance: describe the idea, argument, or phenomenon directly.

    Good: "Decentralized identity systems could reduce reliance on corporate
    gatekeepers." Bad: "The user discusses how decentralized identity systems work."

(Source: https://github.com/devrupt-io/ethos/blob/67670eb2855b84d389d...)

atoav 5 months ago

Garbage, why? That is the insightful bit you chose to omit. How would you do it instead?

esseph 5 months ago

This is virtually identical to tools the US Department of Homeland Security uses across each social media platform and major website with comments to monitor sentiment and activities.

Congrats, I guess.

ddtaylor 5 months ago

I was also told this by someone randomly while working at a coffee shop here in DC. Something about CGA.

tangotaylor 5 months ago

The sentiment analysis is very interesting. I'm super curious what that looks like historically, going back to 2007.

ddtaylor 5 months ago

I currently have it limited to this "epoch" date while I tweak the prompts. Once I feel the prompt is done cooking, I will let it go back to 2007. But, also, gotta keep the lights on somehow ;)
Also, hello fellow taylor.

Lapsa 5 months ago

I'm thinking about constantly getting bombarded with audible microwave voice messages for the past couple of years

ddtaylor 5 months ago

Epstein was written in COBOL because of static analysis.

Lapsa 5 months ago

and how that's related?

sixtyj 5 months ago

Well done.

If I could suggest, please make green colors more distinct in sentiment split wheel, they seem to be very similar now.

vivzkestrel 5 months ago

any blog post anywhere that explains how all of this stuff works and the architecture etc?

ddtaylor 5 months ago

We wrote this https://blog.devrupt.io/posts/introducing-ethos/

claudegamedev 5 months ago

Jeffrey Epstein: 0.20% Positive! Lol.

Side note: this is cool, but the sentiment analysis could be a bit more sophisticated in v2.

CatMustard 5 months ago

I know I'm going against the HN hivemind a bit here, and I hope I don't get flamed too much for it - but I think that that Jeff Epstein fellow wasn't a very nice man.

ddtaylor 5 months ago

guidelines link

dk8996 5 months ago

Very interesting. LLMs open up space for transforming unstructured raw data into visualizations and dashboards. I made something just looking at “Who wants to be hired” posts.

https://hireindex.xyz/#stats

ddtaylor 5 months ago

Does that use the "real" LinkedIn API or something else like Playwright?
What model does it use?
What vector database is it using?