Comment by reenorap
15 hours ago
Boris, you're seeing a ton of anecdotes here and Claude has done something that has affected a bunch of their most fervent users.
Jeff Bezos famously said that if the anecdotes are contradicting the metrics, then the metrics are measuring the wrong things. I suggest you take the anecdotes here seriously and figure out where/why the metrics are wrong.
On the subject of metrics, better user-facing metrics to understand and debug usage patterns would be a great addition. I'd love an easier way to understand the average cost incurred by a specific skill, for example. (If I'm missing something obvious, let me know.)
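As a rough sketch of the kind of per-skill number meant here, assuming usage could be exported as (skill name, cost in USD) pairs. That record shape and the helper name `average_cost_by_skill` are hypothetical, not something Claude Code is known to provide:

```python
from collections import defaultdict


def average_cost_by_skill(records):
    """Return {skill: mean cost in USD} for an iterable of (skill, cost_usd) pairs.

    The (skill, cost_usd) record shape is an assumption made for illustration.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for skill, cost_usd in records:
        totals[skill] += cost_usd
        counts[skill] += 1
    return {skill: totals[skill] / counts[skill] for skill in totals}


# Hypothetical sample data: one tuple per skill invocation.
records = [("review", 0.50), ("review", 1.50), ("deploy", 0.25)]
averages = average_cost_by_skill(records)
```

A mean hides outliers, so a real report would probably also want the invocation count and the max per skill.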
Baking deeper analytics into CC would be helpful... similar to ccusage perhaps: https://github.com/ryoppippi/ccusage
This is useful if you want to keep an eye on what claude's actually doing behind the scenes: https://github.com/simple10/agents-observe
We are taking it seriously, and are continuing to investigate. We are not trusting the metrics.
The quantitative UX research team at Google was created for exactly this problem: a service which became popular before the right metrics existed, meaning metrics need to be derived first, then optimized. We would observe users (irl), read their logs, then generate experiments to improve the behavior as measured by logs, and return to see if the experiment improved irl experiences. There were not many of us and we are around :)
I worked with Boris in the past and in my experience, Boris cares deeply about the customer. I'd vouch that Boris really cares about the issue people are running into.
Google's product UX is widely acknowledged to be a steaming pile of shit though, so I am not sure you should follow their example.
Many of the metrics they use are obviously actively user hostile.
Thank you
Hopefully yourself, and not via your ai tools.
Cool, are you going to be transparent and explain the metrics and costs as a postmortem? And given the inability to actually audit what you produce, why should we trust Anthropic?
HN sometimes talks about pathological customers who will never be happy. Boris is probably the single best rep in the community, possibly ever.
The way your tone and complaints come across reminds me of this. As a paying customer ($5k spend per month in my corporate job), I’d rather Anthropic keep doing what they’re doing (innovating and shipping useful stuff at blinding speed) and not index on your feedback. I think the cost of those tradeoffs would far outweigh the consequences.
Dang man, chill.
It's incredible that Boris is here on HN being open and sharing an issue they don't fully understand yet, and offering a possible workaround. CTFO.
Thank you Boris.
Dude is on Hacker News on a Sunday. Half the GDP of the world is competing with him. What metrics would you like to see?
But the default 1M context window just rolled out a few weeks ago. If refreshing old sessions on 1M context windows is the problem, it's completely aligned with what Boris is saying.