Comment by ofjcihen
1 day ago
I’m sure the new model is a step above the old one but I can’t be the only person who’s getting tired of hearing about how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.
I would honestly go so far as to say the overhype is detrimental to actual measured adoption.
There is plenty of overhyping, no one denies that. But the antidote is not to dismiss everything. Ignore the words and look at the data.
In this case, I see a pretty strong case that this will significantly change computer security. They provide plenty of evidence that the models can create exploits autonomously, meaning that the cost of finding valuable security breaches will plummet once they're widely available.
You seem to see a "pretty strong case" from a bombastic press release.
Don't get me wrong, I do know the reality has changed. Even Greg K-H, the Linux stable maintainer, did recently note[1] that it's not funny any more:
"Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality," he said. "It was kind of funny. It didn't really worry us."
... "Something happened a month ago, and the world switched. Now we have real reports." It's not just Linux, he continued. "All open source projects have real reports that are made with AI, but they're good, and they're real." Security teams across major open source projects talk informally and frequently, he noted, and everyone is seeing the same shift. "All open source security teams are hitting this right now."
---
I agree that an antidote to the obnoxious hype is to pay attention to the actual capabilities and data. But let's not get too carried away.
[1] https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...
Hadn’t been to a KubeCon in about a year, as I’ve been tending to go to just the European ones. I definitely felt a much stronger "this is real" vibe at this event from people like Greg KH.
Is there any actual independent data though, or verification of any of these claims?
As it stands this is just a marketing programme for all involved.
FFmpeg confirmed on Twitter that they sent the patches.
What would be the product they're marketing by this campaign?
That's pretty disingenuous, bordering on ridiculous.
Do they have a record of lying to you? No.
Go read the system card. It's a lot more tame than you think; people are taking pieces out of it and hyping them. That doesn't mean it's not valid.
Which sounds like a great thing. Fewer undiscovered security vulnerabilities.
The only people panicking are probably those state level actors who were using these for their own benefit.
With the right prompting (mostly creating a narrative that justifies the subject matter as okay to perform), other models have already been doing this for me, though. That’s another confusing bit for me about how this is portrayed, and I refuse to believe I’m a revolutionary user, right?
I mean I’m sitting on $10k worth of bug payouts right now partially because that was already a thing.
> Non-experts can also leverage Mythos Preview to find and exploit sophisticated vulnerabilities. Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit. In other cases, we’ve had researchers develop scaffolds that allow Mythos Preview to turn vulnerabilities into exploits without any human intervention.
[dead]
> how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.
It's much like the dynamic between parents and a child. The child, with limited hindsight, almost zero insight and no ability to forecast, is annoyed by their parents. Nothing bad ever happens! Why won't parents stop being so worried all the time and make a fuss over nothing?
The parents, as the child somewhat starts to realize but not fully, have no clue what they are doing. There is a lot they don't know and are going to be wrong about, because it's all new to them. But what they do have is a visceral idea of how bad things could be, and that's something they have to talk to their child about too.
In the eyes of the parents, the child is some % dead all the time. Assigning the wrong % makes you look like an idiot, and so does being unable to handle any % at all. In the eyes of the child, actions leading to death are not even a concept. Hitting the right balance is probably hard, but not for the reasons the child thinks.
Disagree: we’re being told on one hand that we are 6 months away from AI writing all code, and 3 months into that the tools are unusable for complex engineering [1]. Every time I mention this I’m told “but have you tried the latest model and this particular tool” - yes I have, but if I need to be on the hottest new model for it to be functional, that means the last time you claimed it was solved, it wasn’t solved.
[1] https://news.ycombinator.com/item?id=47660925
> Every time I mention this
I feel like there’s a bunch of factors for why it will never be the same for many folks, from the models and harnesses, to the domains and existing tests/tooling.
I feel bad for the people for whom it doesn’t work, but Claude Opus has written most of my code in 2026 so far. I had to build some tools around linting entire projects, and most of my tokens are probably referencing existing stuff and parallel review iterations and tests, but it’s pretty nice, and even seeing legacy code doesn’t make me want to move to a farm and grow potatoes.
It might be counterproductive to say "Oh, just do X!" (which works for the person suggesting it), and then follow up with "But have you tried Y?" when it doesn't work for the other person, if it just keeps being a never-ending string of what works for one person not working for another.
Check out from this point onwards and the following one. There's a nice summary at the top right. Mind that Anthropic alone is already doing $30B/y annualized.
Take a snapshot and check again in a few months. It's not perfect but it's much more falsifiable than a lot of the noise.
https://ai-2027.com/#narrative-2026-04-30
> “I think… I don’t know… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs (software engineers) do end to end.”
I think it's disingenuous (as disingenuous as you're accusing these marketing teams of being) to paraphrase that as "being told on one hand that we are 6 months away from AI writing all Code". It's merely stating that it's a real possibility. (It's also disingenuous to use a post complaining about a behavioral regression bug as evidence that it's not progressing)
Dismissing it as impossible is silly, considering how close it already is to a junior dev. Keep in mind that 14 months prior to that statement was before we even had any public reasoning models. Things really are moving that fast, it's just, at the moment, unclear how fast.
That feels like a very complex way of looking at it. Another way would be to say "profit-seeking companies have an incentive to oversell products, even if those products are good".
Is Anthropic lying about model capabilities? If not, where is the overselling?
[flagged]
The parents in this case are profiteering corporations on a mission to exploit the child for everything they can get away with, almost by definition.
It's a slightly different dynamic.
I feel like you’re muddying 2 different arguments here. Or rather, 2 different positions.
You’re asserting that people who are tired of this line being wheeled out hold a position analogous to “what’s the big deal, nothing bad happens, just relax”. In reality, that’s only 1 position. The other position is “I understand fully, the consequences, but the relentless doomer language is tiring in the face of continuing-to-not-eventuate”.
What do you think of people that say that about climate change? It seems you don't understand fully. This is not the time to get tired, right before this actually starts impacting jobs and people in other ways.
It’s more like the abusive parents telling the child that they’ll sell him to the scary man at the bus stop every time they want to coerce the child into doing what they want.
Eventually the child develops disrespect for authority.
This is just a really bad analogy. It doesn't address that there are multiple sources, the incentives for telling us about it, or the spectrum between disaster-mitigation heroes and snake-oil salesmen.
Did you just compare AI companies to parents, and engineers actually delivering value to toddlers? AI companies cannot, in any capacity, be regarded as caretakers.
Sure, if the parent's stock price soared if the child dies.
Don’t take it personally but this amount of fear and paranoia about death on every corner sounds like a mental illness to me. Generalised Anxiety disorder to be precise. Maybe I am just not a parent.
In any case, there are substances and reliable methods that fix whatever paralyzing existential dread anyone struggles with daily.
Probably best to use the conventional route, but I personally use special low-THC, high-CBG weed once a week with a medical-grade vaporizer, and once a year (early autumn) a moderate dose of Golden Teacher mushrooms. Although I understand most people perhaps couldn't, due to not managing their own business but being on a strict employment contract with urine tests.
Are you suggesting these researchers somehow have wisdom and aren’t just guessing, and that everyone else are children too naive to understand the technology? It certainly sounds that way from the description you are attempting to apply.
This is two parents disagreeing on whether their child will automatically grow up to be a psychopath with one parent constantly remarking “if you teach that child how to cut bread, they will stab everyone later. If you teach that child to drive, they will run over everyone later”, not the “parents know better” situation you describe.
An analogy that’s, quite literally, an appeal to paternalism to trust the motivations and pernicious incentive structures of the big AI labs.
This is literally one of the most infantilizing and simultaneously insulting analogies I've ever come across on this site. Do you really think consumers of the latest AI tools have no ability to forecast? The parents in this analogy have every incentive to lie.
I'll have some of what you're having
[dead]
There are step changes that actually merit this, though. And a zero-day machine IS one of those. It went from a 4% zero-day success rate to 85% on Firefox.
Can you not see the significance of that?
I mean I work in this world and overhype is constant.
Additionally those numbers are somewhat meaningless without more context.
Can you explain why they are meaningless without more context?
I side with you, but on the other hand: this is how it works to get attention from those who aren't affiliated with computer science and AI.
I am totally annoyed as well and put any buzzwords in my personal bs filter. Java was revolutionary, the Apple I etc. ;)
On the other hand I see progress! AI enriched press releases balance buzzwords and information way better than marketing of large companies did before AI.
I remember throwing away the instructions for an electronic toothbrush because - I won't mention the name, but have a look at the upper tier - instead of putting something like "Turn toothbrush on, choose mode by pressing..." it read "Take your super awesome premium masterpiece using patented technology for the first time in human life now available to you by us. Move your finger over to the innovative sensory surface, that uses material from rocket scientists and world leading designers".
No joke. These were text blocks, and they repeated: 30 pages for what could have been one compact page.
The toothbrush is top notch, except for the instructions.
Hahaha I think we might have the same toothbrush.
That makes sense and I like the analogy.
I think Claude Code with Sonnet 4.6 is already at the level of paradigm shift and can change the entire tech industry.
If you're paranoid it doesn't mean you're not being followed. If something is overhyped it doesn't mean it's not game-changing.
Oh I agree with you on that. But that’s partially why the language in the presser falls flat for me.
I mean, as an example: while web app pen testing, I’ve been running and proxying all my traffic through it with instructions to find vulnerabilities, telling it it’s a senior web app security expert looking over my shoulder. It’s already great at that.
I’ve even told it to do recon and run pen tests on lists of subdomains before (please, for the love of god, have the right harnesses and guardrails before you do this) and woken up to paid findings.
So like I’m in a weird place where this was already happening and Mythos is being sold like it wasn’t good before?
End ramble :/
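For what it's worth, the setup described above can be sketched in a few lines. Everything here is hypothetical illustration: the function names are made up, and `ask_model` is a stub standing in for a real LLM API call; the real thing would plug into an intercepting proxy (e.g. a mitmproxy addon) and an actual model endpoint.

```python
# Hypothetical sketch: intercept HTTP exchanges and ask a model to
# triage them for vulnerabilities, with a "security expert" persona.
# ask_model() is a stub; a real setup would call a model API here.

SYSTEM_PROMPT = (
    "You are a senior web app security expert looking over my shoulder. "
    "Review each HTTP exchange and flag likely vulnerabilities."
)

def build_review_prompt(request_line: str, headers: dict, body: str) -> str:
    """Format one captured HTTP exchange for model review."""
    header_text = "\n".join(f"{k}: {v}" for k, v in headers.items())
    return f"{request_line}\n{header_text}\n\n{body}"

def ask_model(prompt: str) -> str:
    # Stub response; replace with a real model call in practice.
    return "no obvious findings"

def review_exchange(request_line: str, headers: dict, body: str) -> str:
    """Send one proxied exchange to the model for a security review."""
    prompt = build_review_prompt(request_line, headers, body)
    return ask_model(f"{SYSTEM_PROMPT}\n\n{prompt}")

print(review_exchange(
    "GET /search?q=<script>alert(1)</script> HTTP/1.1",
    {"Host": "example.test", "Cookie": "session=abc"},
    "",
))  # prints the stub response
```

The interesting design choices are all outside this sketch: rate-limiting what you forward, scoping the model to targets you're authorized to test, and reviewing findings before acting on them (the "harnesses and guardrails" mentioned above).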
To me it makes absolutely zero sense that they would decide not to release the model to the public because of the effects it would have due to its exploitation capabilities. Previous models were also capable of providing harmful information, yet that wasn't a problem, because models can actually be effectively censored using RLHF. So what is preventing Anthropic from simply forbidding the model from letting people vibe-code exploits?
A lot of times, people cry wolf a couple of times before the wolf actually comes.
I feel like there's a good chance that this is the actual wolf coming here, 'cause I was using Opus for a lot and it's really good.
Everybody remembers the fable of the boy who cried wolf and how he died at the end. Left out of the story are the multiple other villagers who died of starvation because their flocks of sheep were eaten, all because they didn't want to feel like suckers. Tuning out completely because of the existence of false positives is not a good choice.
This looks more like another lobby group (quite a bad one) than something primarily focused on security.
The "urgency" is very likely there mostly to drive policy.
Remember OpenAI decided GPT-2 was far too dangerous to unleash upon the world when they first trained it!
That's an editorialized headline. What they actually wrote was that it could be used to "generate misleading news articles, impersonate others online, automate the production of abusive or faked content to post on social media, [or] automate the production of spam/phishing content" and that they are aware other researchers have the ability to reproduce and open source their results, but this would give the community some time to decide how to proceed.
They were correct.
I came across this article just this morning saying AMD researchers, who hitherto have relied on Claude Code heavily, have noticed degraded performance in the recent update: https://www.theregister.com/2026/04/06/anthropic_claude_code...
Claude Code and Glasswing are not the same, but presumably they have a lot of overlap under the hood. I feel like while AI is certainly advancing in major ways, there will always be the ups and downs of new software releases.
Hasn't almost every model created a paradigm shift lately? Maybe it's you who has moved the needle on what a paradigm shift means?
https://news.ycombinator.com/item?id=47682262
I’ve lost trust in anything they say.
The fear marketing is clearly intentional at this point.
And the complicit, click-thirsty tech media falls for it every time.
It feels to me like it's full of marketing in the guise of trying to save the world from their own making: "we have a model so strong we can't release it; here are all the details of why it's so good, but don't ask for access, you can't get it, it's too risky for your own good".
Something smells really really weird:
1. Per the blog post[0]: "This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across those thousand runs, the total cost was under $20,000, and we found several dozen more findings"
Since they said it was patched, I tried to find the CVE. It looks like Mythos indeed found a 27-year-old OpenBSD bug (fantastic), but it didn’t get a CVE; OpenBSD patched it and marked it as a reliability fix. Am I missing something? [1]
2. From the same post, the Anthropic red team decided to do a preview of their future responsible disclosure (is this a common practice?): "As we discuss below, we’re limited in what we can report here. Over 99% of the vulnerabilities we’ve found have not yet been patched" [0]. So this is great; can't wait to see the actual CVEs, exploitability, likelihood, peer review, reproducibility: the kind of things the appsec community has been doing for at least the last 27 years, since the CVE concept was introduced [2].
3. On the same day, there was an actual responsible disclosure, with actual RCEs and actual CVEs, in Claude Code, discovered mostly because of the source code leak, and I don't see anyone talking about it (you probably should upgrade your Claude Code, though).
CVE-2026-35020 [3] CVE-2026-35021 [4] CVE-2026-35022 [5]
Do with this information as you may...
[0] https://red.anthropic.com/2026/mythos-preview/
[1] https://www.openbsd.org/errata78.html (look for 025)
[2] https://www.cve.org/Resources/General/Towards-a-Common-Enume...
[3] https://www.cve.org/CVERecord?id=CVE-2026-35020
[4] https://www.cve.org/CVERecord?id=CVE-2026-35021
[5] https://www.cve.org/CVERecord?id=CVE-2026-35022
Well Opus 4.5/4.6 kinda was right?
I mean software development has changed more since then than it has in my 30 year software development career.
I agree, I can’t open any social media anymore.
> I can’t be the only person who’s getting tired of hearing about how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.
There's a little bit of a grading your own homework aspect to companies being able to declare their new models revolutionary.
It doesn't mean they're wrong, but there is a clear conflict of interest.
At launch, a technology is considered dangerous for being too powerful.
3 months later, you are an absolute idiot to still be using that useless model. Are you not using glasswing 2-01 high? Oh, yeah, glasswing from 3 months ago is absolutely worthless, every viber knows, it's your fault for holding it wrong.
For once, you should not get too excited about new model releases and words and adjectives promising things. Honestly, it's your fault humanity lost its humanity and we just have words, words, words and mass schizophrenia.
> I would honestly go so far as to say the overhype is detrimental to actual measured adoption.
I think you are a bit dishonest about how objectively you are measuring. From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago. The question is no longer if they are using AI for coding but how much they are still coding manually. I myself barely use IDEs at this point. I won't be renewing my Intellij license. I haven't touched it in weeks. It doesn't do anything I need anymore.
As for security, I think enough serious people have confirmed that AI reported issues by the likes of Anthropic and OpenAI are real enough despite the massive amounts of AI slop that they also have to deal with in issue trackers. You can ignore that all you like. But I hope people that maintain this software take it a bit more seriously when people point out exploitable issues in their code bases.
The good news of course is that we can now find and fix a lot of these issues at scale and also get rid of whole categories of bugs by accelerating the project of replacing a lot of this software with inherently safer versions not written in C/C++. That was previously going to take decades. But I think we can realistically get a lot of that done in the years ahead.
I think some smart people are probably already plotting a few early moves here. I'd be curious to find out what e.g. Linus Torvalds thinks about this. I would not be surprised to learn he is more open to this than some people might suspect. He has made approving noises about AI before. I don't expect him to jump on the bandwagon. But I do expect he might be open to some AI-assisted code replacements and refactoring, provided there are enough grown-ups involved to supervise the whole thing. We'll see. I expect a level of conservatism but also a level of realism there.
> From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago.
You don't know a lot of developers then.
I do. The good ones use AI.
> I think you are a bit dishonest about how objectively you are measuring
As someone who has made a sizable amount of money in security research while using Claude, you might be right, but not in the way you think.
And every single time what they release is underwhelming.
Remember how Sam spent like a year talking about how scarily close GPT-5 was to AGI, and then when it did finally come out... it was kinda meh.
Do you think they're lying about the vulnerabilities they claim Mythos has found? Seems like a very short-term play, if so.
Agreed. Do we have any information on what these "vulnerabilities" actually are? Every vulnerability is typically immediately reported to CVE or NIST... are these "so destructive" they have to be kept behind closed doors? Give me a break...
[dead]
It’s great marketing to lead with how the n+1 model is so amazing that you can’t have it yet.
yeah, they gotta find a way to build hype on every new model release