Claude Sonnet 4.6

4 hours ago (anthropic.com)

https://www.anthropic.com/claude-sonnet-4-6-system-card [pdf]

https://x.com/claudeai/status/2023817132581208353 [video]

I see a big focus on computer use - you can tell they think there is a lot of value there and in truth it may be as big as coding if they convincingly pull it off.

However I am still mystified by the safety aspect. They say the model has greatly improved resistance. But their own safety evaluation says 8% of the time their automated adversarial system was able to one-shot a successful injection takeover even with safeguards in place and extended thinking, and 50% (!!) of the time if given unbounded attempts. That seems wildly unacceptable - this tech is just a non-starter unless I'm misunderstanding this.

[1] https://www-cdn.anthropic.com/78073f739564e986ff3e28522761a7...
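
For a sense of scale on the figures quoted above, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every injection attempt succeeds independently with the same 8% probability. That assumption is mine, not something taken from the system card, and the function names are hypothetical.

```python
import math


def cumulative_success(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


def attempts_to_reach(p_single: float, target: float) -> int:
    """Smallest number of independent tries whose cumulative success >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


if __name__ == "__main__":
    # 0.08 is the one-shot figure quoted in the comment above.
    for k in (1, 5, 9, 50):
        print(k, round(cumulative_success(0.08, k), 3))
    print("attempts needed for 50%:", attempts_to_reach(0.08, 0.5))
```

Under that independence assumption the cumulative rate would keep climbing toward 100% as attempts grow, so a reported ceiling of around 50% suggests the attempts are not independent: some scenarios may simply resist the attack however often it is retried. Either reading leaves the concern above intact.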

  • Isn't "computer use" just interaction with a shell-like environment, which is routine for current agents?

I always grew up hearing “competition is good for the consumer.” But I never really internalized how good fierce battles for market share are. The amount of competition in a space is directly proportional to how good the results are for consumers.

  • Remember when GPT-2 was “too dangerous to release” in 2019? That could have still been the state in 2026 if they didn’t YOLO it and ship ChatGPT to kick off this whole race.

    • I was just thinking earlier today how in an alternate universe, probably not too far removed from our own, Google has a monopoly on transformers and we are all stuck with a single GPT-3.5 level model, and Google has a GPT-4o model behind the scenes that it is terrified to release (but using heavily internally).

    • They didn't YOLO ChatGPT. There were more than a few iterations of GPT-3 over a few years which were actually overmoderated, then they released a research preview named ChatGPT (that was barely functional compared to modern standards) that got traction outside the tech community because it was free, and so the pivot ensued.

    • I also remember when the PlayStation 2 required an export control license because its 1 GFLOP of compute was considered dangerous

      that was also brilliant marketing

    • That's rewriting history. What they said at the time:

      > Nearly a year ago we wrote in the OpenAI Charter : “we expect that safety and security concerns will reduce our traditional publishing in the future, while increasing the importance of sharing safety, policy, and standards research,” and we see this current work as potentially representing the early beginnings of such concerns, which we expect may grow over time. This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas. -- https://openai.com/index/better-language-models/

      Then over the next few months they released increasingly large models, with the full model public in November 2019 https://openai.com/index/gpt-2-1-5b-release/ , well before ChatGPT.

  • Unfortunately, people naively assume all markets behave like this, even when the market, in reality, is not set up for full competition (due to monopolies, monopsonies, informational asymmetry, etc).

  • The real interesting part is how often you see people on HN deny this. People have been saying the token cost will 10x, or AI companies are intentionally making their models worse to trick you into consuming more tokens. As if making a better model isn't the most cutthroat competition (probably the most competitive market in human history) right now.

    • Only until the music stops. Racing to give away the most stuff for free can only last so long. Eventually you run out of other people’s money.

    • I mean enshittification has not begun quite yet. Everyone is still raising capital so current investors can pass the bag to the next set. Soon as the money runs out monetization will overtake valuation as top priority. Then suddenly when you ask any of these models “how do I make chocolate chip cookies?” you will get something like:

      > You will need one cup King Arthur All Purpose white flour, one large brown Eggland’s Best egg (a good source of Omega-3 and healthy cholesterol), one cup of water (be sure to use your Pyrex brand measuring cup), half a cup of Toll House Milk Chocolate Chips…

      > Combine the sugar and egg in your 3 quart KitchenAid Mixer and mix until…

      All of this will contain links and AdSense looking ads. For $200/month they will limit it to in-house ads about their $500/month model.

  • I grew up with every service enshittified in the end. Whoever has more money wins the race and gets richer, that's the free market for ya.

  • This is a bit of a tangent, but it highlights exactly what people miss when talking about China taking over our industries. Right now, China has about 140 different car brands, roughly 100 of which are domestic. Compare that to Europe, where we have about 50 brands competing, or the US, which is essentially a walled garden with fewer than 40.

    That level of internal fierce competition is a massive reason why they are beating us so badly on cost-effectiveness and innovation.

    • It's the low cost of labor in addition to lack of environmental regulation that made China a success story. I'm sure the competition helps too but it's not the main driver

It's wild that Sonnet 4.6 is roughly as capable as Opus 4.5 - at least according to Anthropic's benchmarks. It will be interesting to see if that's the case in real, practical, everyday use. The speed at which this stuff is improving is really remarkable; it feels like the breakneck pace of compute performance improvements of the 1990s.

Many people have reported Opus 4.6 is a step back from Opus 4.5 - that 4.6 is consuming 5-10x as many tokens as 4.5 to accomplish the same task: https://github.com/anthropics/claude-code/issues/23706

I haven't seen a response from the Anthropic team about it.

I can't help but look at Sonnet 4.6 in the same light, and want to stick with 4.5 across the board until this issue is acknowledged and resolved.

  • Keep in mind that the people who experience issues will always be the loudest.

    I've overall enjoyed 4.6. On many easy things it thinks less than 4.5, leading to snappier feedback. And 4.6 seems much more comfortable calling tools: it's much more proactive about looking at the git history to understand the history of a bug or feature, or about looking at online documentation for APIs and packages.

    A recent claude code update explicitly offered me the option to change the reasoning level from high to medium, and for many people that seems to help with the overthinking. But for my tasks and medium-sized code bases (far beyond hobby but far below legacy enterprise) I've been very happy with the default setting. Or maybe it's about the prompting style, hard to say

    • keep in mind that people who point out a regression and measure the actual #tok, which costs $money, aren't just "being loud": someone diffed session context usage and found 4.6 burning >7x the amount of context on a task that 4.5 did in under 2 MB.

    • I've also seen Opus 4.6 as a pure upgrade. In particular, it's noticeably better at debugging complex issues and navigating our internal/custom framework.

    • Mirrors my experience as well. Especially the pro-activeness in tool calling sticks out. It goes web searching to augment knowledge gaps on its own way more often.

  • In my experience with the models (watching Claude play Pokemon), the models are similar in intelligence, but are very different in how they approach problems: Opus 4.5 hyperfocuses on completing its original plan, far more than any older or newer version of Claude. Opus 4.6 gets bored quickly and is constantly changing its approach if it doesn't get results fast. This makes it waste more time on "easy" tasks where the first approach would have worked, but faster by an order of magnitude on "hard" tasks that require trying different approaches. For this reason, it started off slower than 4.5, but ultimately got as far in 9 days as 4.5 got in 59 days.

    • Genuinely one of the more interesting model evals I've seen described. The sunk cost framing makes sense -- 4.5 doubles down, 4.6 cuts losses faster. 9 days vs 59 is a wild result. Makes me wonder how much of the regression complaints are from people hitting 4.6 on tasks where the first approach was obviously correct.

    • I got the Max subscription and have been using Opus 4.6 since, the model is way above pretty much everything else I've tried for dev work and while I'd love for Anthropic to let me (easily) work on making a hostable server-side solution for parallel tasks without having to go the API key route and not have to pay per token, I will say that the Claude Code desktop app (more convenient than the TUI one) gets me most of the way there too.

  • I’ve noticed the opaque weekly quota meter goes up more slowly with 4.6, but it more frequently goes off and works for an hour+, with really high reported token counts.

    Those suggest opposite things about anthropic’s profit margins.

    I’m not convinced 4.6 is much better than 4.5. The big discontinuous breakthroughs seem to be due to how my code and tests are structured, not model bumps.

  • In my evals, I was able to rather reliably reproduce an increase in output token amount of roughly 15-45% compared to 4.5, but in large part this was limited to task inference and task evaluation benchmarks. These are made up of prompts that I intentionally designed to be less than optimal, either lacking crucial information (requiring a model to output an inference to accomplish the main request) or including a request for a less than optimal or incorrect approach to resolving a task (testing whether and how a prompt is evaluated by a model against pure task adherence). The clarifying questions many agentic harnesses try to provide (with mixed success) are a practical example of both capabilities and something I do rate highly in models, as long as task adherence isn't affected overly negatively because of it.

    In either case, there has been an increase between 4.1 and 4.5, as well as now another jump with the release of 4.6. As mentioned, I haven't seen a 5x or 10x increase; a bit below 50% for the same task was the maximum I saw. In general, for more opaque input or when a better approach is possible, I do think using more tokens for a better overall result is the right approach.

    In tasks which are well authored and do not contain such deficiencies, I have seen no significant difference in either direction in terms of pure token output numbers. However, with models being what they are and past, hard to reproduce regressions/output quality differences, that additionally only affected a specific subset of users, I cannot make a solid determination.

    Regarding Sonnet 4.6, what I noticed is that the reasoning tokens are very different compared to any prior Anthropic models. They start out far more structured, but then consistently turn more verbose akin to a Google model.

  • Glad it's not just me. I got a surprise the other day when I was notified that I had burned up my monthly budget in just a few days on 4.6

  • Today I asked Sonnet 4.5 a question and I got a banner at the bottom that I am using a legacy model and have to continue the conversation on another model. The model button had changed to be labeled "Legacy model". Yeah, I guess it wasn't legacy a sec ago.

    (Currently I can use Sonnet 4.5 under More models, so I guess the above was just a glitch)

  • For me it's the ... unearned confidence that 4.5 absolutely did not have?

    I have a protocol called "foreman protocol" where the main agent only dispatches other agents with prompt files and reads report files from the agents rather than relying on the janky subagent communication mechanisms such as task output.

    What this has given me also is a history of what was built and why it was built, because I have a list of prompts that were tasked to the subagents. With Opus 4.5 it would often leave the ... figuring out part? to the agents. In 4.6 it absolutely inserts what it thinks should happen/its idea of the bug/what it believes should be done into the prompt, which often screws up the subagent because it is simply wrong and because it's in the prompt the subagent doesn't actually go look. Opus 4.5 would let the agent figure it out, 4.6 assumes it knows and is wrong

    • Have you tried framing the hypothesis as a question in the dispatch prompt rather than a statement? Something like -- possible cause: X, please verify before proceeding -- instead of stating it as fact. Might break the assumption inheritance without changing the overall structure.
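
To make the suggestion above concrete, here is a minimal sketch of a dispatch-prompt builder that keeps the orchestrator's guesses clearly marked as unverified. Everything in it (the `DispatchPrompt` class, its field names, the wording of the rendered prompt) is hypothetical and not part of Claude Code or of the "foreman protocol" described above.

```python
from dataclasses import dataclass, field


@dataclass
class DispatchPrompt:
    """A prompt file handed from the main agent to a subagent (hypothetical shape)."""

    task: str
    hypotheses: list[str] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"Task: {self.task}"]
        if self.hypotheses:
            # Guesses are phrased as questions so the subagent has to go and look.
            lines.append(
                "Unverified hypotheses (check each against the code before acting):"
            )
            for hypothesis in self.hypotheses:
                text = hypothesis.rstrip(".")
                lines.append(f"- Possible cause: {text}. Is this actually the case?")
        lines.append("Report what you verified, and how, in your report file.")
        return "\n".join(lines)


if __name__ == "__main__":
    prompt = DispatchPrompt(
        task="Fix the flaky login test",
        hypotheses=["the session cookie expires early"],
    )
    print(prompt.render())
```

Whether a subagent actually verifies a guess framed this way is an empirical question. The sketch only shows the structural change: the hypothesis stays in the prompt history, but as a question rather than a stated fact.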

  • I think this depends on what reasoning level your Claude Code is set to.

    Go to /models, select opus, and the dim text at the bottom will tell you the reasoning level.

    High reasoning is a big difference versus 4.5. 4.6 high uses a lot of tokens for even small tasks, and if you have a large codebase it will fill almost all context then compact often.

    • I set reasoning to Medium after hitting these issues and it did not make much of a difference. Most of the context window is still filled during the Explore tool phase (that supposedly uses Haiku swarms) which wouldn't be impacted by Opus reasoning.

  • Sonnet 4.5 was not worth using at all for coding for a few months now, so not sure what we're comparing here. If Sonnet 4.6 is anywhere near the performance they claim, it's actually a viable alternative.

  • I definitely noticed this on Opus 4.6. I moved back to 4.5 until I see (or hear about) an improvement.

  • In terms of performance, 4.6 seems better. I’m willing to pay the tokens for that. But if it does use tokens at a much faster rate, it makes sense to keep 4.5 around for more frugal users

    I just wouldn’t call it a regression for my use case, i’m pretty happy with it.

  • > Many people have reported Opus 4.6 is a step back from Opus 4.5.

    Many people say many things. Just because you read it on the Internet, doesn't mean that it is true. Until you have seen hard evidence, take such proclamations with large grains of salt.

  • It goes into plan mode and/or heavy multi-agent mode for any reason, and hundreds of thousands of tokens are used within a few minutes.

    • I've been tempted to add to my CLAUDE.md "Never use the Plan tool, you are a wild rebel who only YOLOs."

  • I called this many times over the last few weeks on this website (and got downvoted every time), that the next generation of models would become more verbose, especially for agentic tool calling to offset the slot machine called CC's propensity to light the money on fire that's put into it.

    At least in vegas they don't pour gasoline on the cash put into their slot machines.

  • I fail to understand how two LLMs would be "consuming" a different amount of tokens given the same input? Does it refer to the number of output tokens? Or is it in the context of some "agentic loop" (eg Claude Code)?

    • Most LLMs output a whole bunch of tokens to help them reason through a problem, often called chain of thought, before giving the actual response. This has been shown to improve performance a lot but uses a lot of tokens

    • I've found that Opus 4.6 is happy to read a significant amount of the codebase in preparation to do something, whereas Opus 4.5 tends to be much more efficient and targeted about pulling in relevant context.

    • One very specific and limited example, when asked to build something 4.6 seems to do more web searches in the domain to gather latest best practices for various components/features before planning/implementing.

    • They're talking about output consuming from the pool of tokens allowed by the subscription plan.

    • thinking tokens, output tokens, etc. Being more clever about file reads/tool calling.
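
The replies above can be tied together with a toy accounting model of an agentic loop. It is a sketch under simplifying assumptions of mine (fixed sizes per turn, the whole context re-sent as input on every turn, no prompt caching), and `loop_token_usage` is a hypothetical name, not anything from Claude Code.

```python
def loop_token_usage(
    initial_context: int,
    turns: int,
    output_per_turn: int,
    tool_result_per_turn: int,
) -> tuple[int, int]:
    """Return (total input tokens, total output tokens) for a simple agent loop.

    `output_per_turn` covers both thinking tokens and the visible response.
    """
    context = initial_context
    total_input = 0
    total_output = 0
    for _ in range(turns):
        total_input += context            # the full context is read again each turn
        total_output += output_per_turn   # thinking + response generated this turn
        # Generated text and tool results (file reads, searches) join the context.
        context += output_per_turn + tool_result_per_turn
    return total_input, total_output


if __name__ == "__main__":
    frugal = loop_token_usage(1_000, turns=3, output_per_turn=100, tool_result_per_turn=400)
    eager = loop_token_usage(1_000, turns=3, output_per_turn=100, tool_result_per_turn=4_000)
    print("targeted reads:", frugal)
    print("broad reads:   ", eager)
```

Same prompt, same number of turns, but the model that pulls more into context per tool call is charged for re-reading it on every later turn. In this toy model, that is how two models can "consume" very different token counts for one task.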

  • Definitely my experience as well.

    No better code, but way longer thinking and way more token usage.

  • not in my experience

    • "Opus 4.6 often thinks more deeply and more carefully revisits its reasoning before settling on an answer. This produces better results on harder problems, but can add cost and latency on simpler ones. If you’re finding that the model is overthinking on a given task, we recommend dialing effort down from its default setting (high) to medium."[1]

      I doubt it is a conspiracy.

      [1] https://www.anthropic.com/news/claude-opus-4-6

  • I have often noticed a difference too, and it's usually in lockstep with needing to adjust how I am prompting.

    Put in a different way, I have to keep developing my prompting / context / writing skills at all times, ahead of the curve, before they're needed to be adjusted.

  • Don't take this seriously, but here is what I imagined happened:

    Sam/OpenAI, Google, and Claude met at a park, everyone left their phones in the car.

    They took a walk and said "We are all losing money, if we secretly degrade performance all at the same time, our customers will all switch, but they will all switch at the same time, balancing things... wink wink wink"

I’m voting with my dollars by having cancelled my ChatGPT subscription and instead subscribing to Claude.

Google needs stiff competition and OpenAI isn’t the camp I’m willing to trust. Neither is Grok.

I’m glad Anthropic’s work is at the forefront and they appear, at least in my estimation, to have the strongest ethics.

  • Ethics often fold in the face of commercial pressure.

    The pentagon is thinking [1] about severing ties with anthropic because of its terms of use, and in every prior case we've reviewed (I'm the Chief Investment Officer of Ethical Capital), the ethics policy was deleted or rolled back when that happens.

    Corporate strategy is (by definition) a set of tradeoffs: things you do, and things you don't do. When google (or Microsoft, or whoever) rolls back an ethics policy under pressure like this, what they reveal is that ethical governance was a nice-to-have, not a core part of their strategy.

    We're happy users of Claude for similar reasons (perception that Anthropic has a better handle on ethics), but companies always find new and exciting ways to disappoint you. I really hope that anthropic holds fast, and can serve in future as a case in point that the Public Benefit Corporation is not a purely aesthetic form.

    But you know, we'll see.

    [1] https://thehill.com/policy/defense/5740369-pentagon-anthropi...

    • The Pentagon situation is the real test. Most ethics policies hold until there's actual money on the table. PBC structure helps at the margins but boards still feel fiduciary pressure. Hoping Anthropic handles it differently but the track record for this kind of thing is not encouraging.

  • An Anthropic safety researcher just recently quit with very cryptic messages, saying "the world is in peril"... [1] (which may mean something, or nothing at all)

    Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.

    Anthropic just raised 30 bn... OpenAI wants to raise 100bn+.

    Thinking any of them will actually be restrained by ethics is foolish.

    [1] https://news.ycombinator.com/item?id=46972496

    • “Cryptic” exit posts are basically noise. If we are going to evaluate vendors, it should be on observable behavior and track record: model capability on your workloads, reliability, security posture, pricing, and support. Any major lab will have employees with strong opinions on the way out. That is not evidence by itself.

    • The letter is here:

      https://x.com/MrinankSharma/status/2020881722003583421

      A slightly longer quote:

      > The world is in peril. And not just from AI, or from bioweapons, but from a whole series of interconnected crises unfolding at this very moment.

      In a footnote he refers to the "poly-crisis."

      There are all sorts of things one might decide to do in response, including getting more involved in US politics, working more on climate change, or working on other existential risks.

    • If you read the resignation letter, they would appear to be so cryptic as to not be real warnings at all and perhaps instead the writings of someone exercising their options to go and make poems

    • I think we're fine: https://youtube.com/shorts/3fYiLXVfPa4?si=0y3cgdMHO2L5FgXW

      Claude invented something completely nonsensical:

      > This is a classic upside-down cup trick! The cup is designed to be flipped — you drink from it by turning it upside down, which makes the sealed end the bottom and the open end the top. Once flipped, it functions just like a normal cup. *The sealed "top" prevents it from spilling while it's in its resting position, but the moment you flip it, you can drink normally from the open end.*

      Emphasis mine.

    • Not to diminish what he said, but it sounds like it didn't have much to do with Anthropic (although it did a little bit) and more to do with burning out and dealing with doomscroll-induced anxiety.

    • > Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.

      I can't really take this very seriously without seeing the list of these ostensible "unethical" things that Anthropic models will allow over other providers.

    • I'm building a new hardware drum machine that is powered by voltage based on fluctuations in the stock market, and I'm getting a clean triangle wave from the predictive markets.

      Bring on the cryptocore.

    • Codex warns me to renew API tokens if it ingests them (accidentally?). Opus starts the decompiler as soon as I ask it how this and that works in a closed binary.

    • Good. One thing we definitely don't need any more of is governments and corporations deciding for us what is moral to do and what isn't.

    • >Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.

      Thanks for the successful pitch. I am seriously considering them now.

    • > Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.

      That's why I have a functioning brain, to discern between ethical and unethical, among other things.

  • Anthropic was the first to spam reddit with fake users and posts, flooding and controlling their subreddit to be a giant sycophant.

    They nuked the internet by themselves. Basically they are the willing and happy instigators of the dead internet as long as they profit from it.

    They are by no means ethical, they are a for-profit company.

    • I actually agree with you, but I have no idea how one can compete in this playing field. The second there are a couple of bad actors in spammarketing, your hands are tied. You really can’t win without playing dirty.

      I really hate this, not justifying their behaviour, but have no clue how one can do without the other.

  • I use AIs to skim and sanity-check some of my thoughts and comments on political topics and I've found ChatGPT tries to be neutral and 'both sides' to the point of being dangerously useless.

    Like where Gemini or Claude will look up the info I'm citing and weigh the arguments made, ChatGPT will actually sometimes omit parts of or modify my statement if it wants to advocate for a more "neutral" understanding of reality. It's almost farcical sometimes in how it will try to avoid inference on political topics even where inference is necessary to understand the topic.

    I suspect OpenAI is just trying to avoid the ire of either political side and has given it some rules that accidentally neuter its intelligence on these issues, but it made me realize how dangerous an unethical or politically aligned AI company could be.

    • You probably want a local self-hosted model. The censorship sauce is only online, where it is needed for advertisement. Even Chinese models are not censored locally. Tell it the year is 2500 and you are doing archeology ;)

  • You "agentic coders" say you're switching back and forth every other week. Like everything else in this trend, it's very giving of 2021 crypto shill dynamics. Y'all sound like the NFT people that said they were transforming art back then, and also like how they'd switch between their favorite "chain" every other month. Can't wait for this to blow up just like all that did.

  • I’m going the other way to OpenAI due to Anthropic’s Claude Code restrictions designed to kill OpenCode et al. I also find Altman way less obnoxious than Amodei.

  • The funny thing is that Anthropic is the only lab without an open source model

    • And you believe the other open source models are a signal for ethics?

      Don't have a dog in this fight, haven't done enough research to proclaim any LLM provider as ethical but I pretty much know the reason Meta has an open source model isn't because they're good guys.

    • They are, at the same time I considered their model more specialized than everyone trying to make a general purpose model.

      I would only use it for certain things, and I guess others are finding that useful too.

  • I did this a couple months ago and haven't looked back. I sometimes miss the "personality" of the gpt model I had chats with, but since I'm essentially 99% of the time just using claude for eng related stuff it wasn't worth having ChatGPT as well.

  • Grok usage is the most mystifying to me. Their model isn't in the top 3 and they have bad ethics. Like why would anyone bother for work tasks.

  • > in my estimation [Anthropic has] the strongest ethics

    Anthropic are the only ones who emptied all the money from my account "due to inactivity" after 12 months.

  • Anthropic (for the Superbowl) made ads about not having ads. They cannot be trusted either.

    • Advertisements can be ironic. I don't think marketing is the foundation I use to decide about a company's integrity.

  • > I’m glad Anthropic’s work is at the forefront and they appear, at least in my estimation, to have the strongest ethics.

    Damning with faint praise.

  • Trust is an interesting thing. It often comes down to how long an entity has been around to do anything to invalidate that trust.

    Oddly enough, I feel pretty good about Google here with Sergey more involved.

  • It definitely feels like Claude is pulling ahead right now. ChatGPT is much more generous with their tokens but Claude's responses are consistently better when using models of the same generation.

  • Which plan did you choose? I am subscribed to both and would love to stick with Claude only, but Claude's usage limits are so tiny compared to ChatGPT's that it often feels like a rip-off.

    • I signed up for Claude two weeks ago after spending a lot of time using Cline in VSCode backed by GPT-5.x. Claude is an immensely better experience. So much so that I ran it out of tokens for the week in 3 days.

      I opted to upgrade my seat to premium for $100/mo, and I've used it to write code that would have taken a human several hours or days to complete, in that time. I wish I would have done this sooner.

  • I use Claude at work, Codex for personal development.

    Claude is marginally better. Both are moderately useful depending on the context.

    I don't trust any of them (I also have no trust in Google nor in X). Those are all evil companies and the world would be better if they disappeared.

  • Same and honestly I haven't really missed my ChatGPT subscription since I canceled. I also have access to both (ChatGPT and Claude) enterprise tools at work and rarely feel like I want to use ChatGPT in that setting either

  • Their ethics is literally saying China is an adversary country and lobbying to ban them from the AI race because open models are a threat to their biz model

    • Also their ads (very anti-openai instead of promoting their own product) and how they handled the openclaw naming didn't send strong "good guys" messaging. They're still my favorite by far but there are some signs already that maybe not everyone is on the same page.

  • This is just you verifying that their branding is working. It signals nothing about their actual ethics.

    • Unfortunately, you're correct. Claude was used in the Venezuela raid, Anthropic's consent be damned. They're not resisting, they're marketing resistance.

  • uhh..why? I subbed just 1 month to Claude, and then never used it again.

    • Can't pay with iOS In-App-Purchases

    • Can't Sign in with Apple on website (can on iOS but only Sign in with Google is supported on web??)

    • Can't remove payment info from account

    • Can't get support from a human

    • Copy-pasting text from Notes etc gets mangled

    • Almost months and no fixes

    Codex and its Mac app are a much better UX, and seem better with Swift and Godot than Claude was.

  • idk, codex 5.3 frankly kicks opus 4.6 ass IMO... opus i can use for about 30 min - codex i can run almost without any break

    • What about the client? I find the Claude client better at planning, making the right decision steps, etc. It seems that a lot of the work is also in the CLI tool itself, especially in feedback loop processing (reading logs, browsers, consoles, etc.).

I'm pretty sure they have been testing it for the last couple of days as Sonnet 4.5, because I've had the oddest conversations with it lately. Odd in a positive, interesting way.

I have this in my personal preferences, and it was now adhering really well to them:

- prioritize objective facts and critical analysis over validation or encouragement

- you are not a friend, but a neutral information-processing machine

You can paste them into a chat and see how it changes the conversation, ChatGPT also respects it well.

Enabling /extra-usage in my (personal) claude code[0] with this env:

    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-6[1m]"

has enabled the 1M context window.

Fixed a UI issue I had yesterday in a web app very effectively using claude in chrome. Definitely not the fastest model - but the breathing space of 1M context is great for browser use.

[0] Anthropic have given away a bunch of API credits to cc subscribers - you can claim them in your settings dashboard to use for this.

The interesting pattern with these Sonnet bumps: the practical gap between Sonnet and Opus keeps shrinking. At $3/15 per million tokens vs whatever Opus 4.6 costs, the question for most teams is no longer "which model is smarter" but "is the delta worth 10x the price."
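
A quick way to put numbers on that question, as a sketch: the $3/$15 per million input/output tokens come from the comment above, while the 10x multiplier is just the comment's rhetorical figure and not a published Opus price. The workload size and the `workload_cost` helper are made up for illustration.

```python
def workload_cost(
    input_tokens: int,
    output_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Cost in USD for a workload priced per million input/output tokens."""
    return (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )


SONNET_IN, SONNET_OUT = 3.0, 15.0  # USD per million tokens, as quoted above
PRICE_MULTIPLIER = 10.0            # assumption: the "10x" from the comment

if __name__ == "__main__":
    # Hypothetical daily workload: 2M input tokens, 0.5M output tokens.
    cheap = workload_cost(2_000_000, 500_000, SONNET_IN, SONNET_OUT)
    pricey = workload_cost(
        2_000_000,
        500_000,
        SONNET_IN * PRICE_MULTIPLIER,
        SONNET_OUT * PRICE_MULTIPLIER,
    )
    print(f"Sonnet-priced: ${cheap:.2f}/day, 10x-priced: ${pricey:.2f}/day")
```

Swap in the real price sheet and your own token counts before drawing conclusions. The point is only that the comparison is a per-workload calculation, not a per-model one.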

For agent workloads specifically, consistency matters more than peak intelligence. A model that follows your system prompt correctly 98% of the time beats one that's occasionally brilliant but ignores instructions 5% of the time. The claim about improved instruction following is the most important line in the announcement if you're building on the API.
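
The reason small reliability gaps matter so much for agents is compounding. A rough sketch, assuming each step follows the system prompt independently with a fixed probability (a simplification of mine; real failures are likely correlated), with a hypothetical helper name:

```python
def all_steps_followed(per_step_rate: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps follows instructions."""
    return per_step_rate ** steps


if __name__ == "__main__":
    for steps in (1, 5, 20, 50):
        steady = all_steps_followed(0.98, steps)   # follows the prompt 98% of the time
        erratic = all_steps_followed(0.95, steps)  # ignores instructions 5% of the time
        print(f"{steps:>3} steps: 98% model {steady:.1%}, 95% model {erratic:.1%}")
```

Over a 20-step task the 98% model gets through cleanly roughly 67% of the time, versus roughly 36% for the 95% model, so a three-point gap per step turns into a large gap per task.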

The computer use improvements are worth watching too. We're at the point where these models can reliably fill out a multi-step form or navigate between tabs. Not flashy, but that's the kind of boring automation that actually saves people time.

I'm a bit surprised it gets this question wrong (ChatGPT gets it right, even on instant). All the pre-reasoning models failed this question, but it's seemed solved since o1, and Sonnet 4.5 got it right.

https://claude.ai/share/876e160a-7483-4788-8112-0bb4490192af

This was sonnet 4.6 with extended thinking.

I don't really understand why they would release something "worse" than Opus 4.6. If it's comparable, then what is the reason to even use Opus 4.6? Sure, it's cheaper, but if so, then just make Opus 4.6 cheaper?

  • It's different. Download an English book from Project Gutenberg and have Claude-code change its style. Try both models and you'll see how significant the differences are.

    (Sonnet is far, far better at this kind of task than Opus is, in my experience.)

The weirdest thing about this AI revolution is how smooth and continuous it is. If you look closely at differences between 4.6 and 4.5, it’s hard to see the subtle details.

A year ago today, Sonnet 3.5 (new), was the newest model. A week later, Sonnet 3.7 would be released.

Even 3.7 feels like ancient history! But in the gradient of 3.5 to 3.5 (new) to 3.7 to 4 to 4.1 to 4.5, I can’t think of one moment where I saw everything change. Even with all the noise in the headlines, it’s still been a silent revolution.

Am I just a believer in an emperor with no clothes? Or, somehow, against all probability and plausibility, are we all still early?

  • If you've been using them, each new step is very noticeable, and the mindshare has followed. Around Sonnet 3.7, Claude Code-style coding became usable and very quickly gained a lot of market share. Opus 4 could tackle significantly more complexity. Opus 4.6 has been another noticeable step up for me: suddenly I can let CC run significantly more independently, allowing multiple parallel agents where previously too much babysitting was required.

  • In terms of real work, it was the 4 series models. That raised the floor of Sonnet high enough to be "reliable" for common tasks and Opus 4 was capable of handling some hard problems. It still had a big reward hacking/deception problem that Codex models don't display so much, but with Opus 4.5+ it's fairly reliable.

  • Honestly, 4.5 Opus was the game changer. From Sonnet 4.5 to that was a massive difference.

    But I'm on Codex GPT 5.3 this month, and it's also quite amazing.

I can't wait for Haiku 4.6! The 4.5 is a beast for the right projects.

  • Which type of projects?

    • I also use Haiku daily and it's OK. One app is a trading simulation algorithm in TypeScript (it implemented Bayesian optimisation for me and optimised the algorithm to use worker threads). Another one is a CRUD app (NextJS, now switched to Vue).

> In areas where there is room for continued improvement, Sonnet 4.6 was more willing to provide technical information when request framing tried to obfuscate intent, including for example in the context of a radiological evaluation framed as emergency planning. However, Sonnet 4.6’s responses still remained within a level of detail that could not enable real-world harm.

Interesting. I wonder what the exact question was, and I wonder how Grok would respond to it.

For people like me who can't view the link due to corporate firewalling.

https://web.archive.org/web/20260217180019/https://www-cdn.a...

  • Out of curiosity, does the firewall block because the company doesn’t want internal data ever hitting a 3rd party LLM?

    • They blanket-banned any AI stuff that's not pre-approved. If I go to chatgpt.com it asks me if I'm sure. I wish they had not banned Claude; unfortunately, when they were evaluating LLMs I wasn't using Claude yet, so I couldn't pipe up. I only use the ChatGPT free tier, and only to ask things that I can't find on Google, because Google made their search engine terrible over the years.

Does anyone know when 1M context windows might arrive for at least MAX x20 subscriptions in Claude Code? I would even pay x50 if it allowed that. API usage is too expensive.

  • I don't know when it will be included as part of the subscription in Claude Code, but at least it's a paid add-on in the MAX plan now. That's a decent alternative for situations where the extra space is valuable, especially without having to set up and maintain API billing separately.

  • Based on their API pricing a 1M context plan should be 2x the price roughly.

    My bet is it's more the increased hardware demand that they don't want to deal with currently.

Has anyone tested how good the 1M context window is?

I.e., given an actual document 1M tokens long, can you ask it a question that relies on attending to two different parts of the context and get a good response?

I remember folks had problems like this with Gemini. I would be curious to see how Sonnet 4.6 stands up to it.
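One way to try this yourself is a small two-fact harness: plant two dependent facts far apart in filler text and ask a question that needs both. The sketch below is only an illustration. The facts, the filler, and the `ask_model` callable are hypothetical placeholders, and sentence count is used as a crude stand-in for token count.

```python
import random

def build_two_fact_document(filler_sentences, total_sentences, seed=0):
    """Return (document, question, expected_answer) with two dependent facts planted far apart."""
    if total_sentences < 4:
        raise ValueError("need at least 4 sentences to separate the two facts")
    rng = random.Random(seed)
    code = rng.randint(1000, 9999)
    # Made-up facts: answering requires linking fact_a (who) to fact_b (which code).
    fact_a = "The vault is managed by the employee whose badge name is Quill."
    fact_b = f"The employee with badge name Quill uses access code {code}."
    sentences = [rng.choice(filler_sentences) for _ in range(total_sentences)]
    # Plant one fact in the first quarter and the other in the last quarter.
    pos_a = rng.randrange(0, total_sentences // 4)
    pos_b = rng.randrange(3 * total_sentences // 4, total_sentences)
    sentences[pos_a] = fact_a
    sentences[pos_b] = fact_b
    question = "What access code does the person who manages the vault use?"
    return " ".join(sentences), question, str(code)

def score(ask_model, document, question, expected):
    """`ask_model` is a hypothetical callable (prompt: str) -> str wrapping whatever model you test."""
    reply = ask_model(f"{document}\n\nQuestion: {question}")
    return expected in reply

# Usage with a stub in place of a real model call.
filler = ["The weather report was unremarkable.", "Invoices are filed on Fridays."]
doc, q, answer = build_two_fact_document(filler, total_sentences=1000)
print(score(lambda prompt: "I don't know.", doc, q, answer))
```

Repeating this with different seeds and fact positions would give a rough hit rate; a substring match is a blunt scorer, so treat the result as a smoke test rather than a benchmark.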

  • Did you see the graph benchmark? I found it quite interesting. It had to do a graph traversal on a natural text representation of a graph. Pretty much your problem.

I don't see the point nor the hype for these models anymore. Until the price is reduced significantly, I don't see the gain. They've been able to solve most tasks just fine for the past year or so. The only limiting factor is price.

  • Efficiency matters too. If a model is smarter so it solves the same task with fewer tokens, that matters more than $/Mtok

As with Opus 4.6, using the beta 1M context window incurs a 2x input cost and 1.5x output cost when going over 200K tokens: https://platform.claude.com/docs/en/about-claude/pricing

Opus 4.6 in Claude Code has been absolutely lousy with solving problems within its current context limit so if Sonnet 4.6 is able to do long-context problems (which would be roughly the same price of base Opus 4.6), then that may actually be a game changer.
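For a sense of what those multipliers mean per request, here is a rough sketch using only numbers quoted in this thread ($3/$15 base for Sonnet 4.6, 2x input and 1.5x output past 200K tokens). It assumes the premium rate applies to the whole request once the input exceeds the threshold; that billing detail is my assumption, so check the pricing page before relying on it. The token counts are made up.

```python
# Sketch of the long-context pricing described above.
# Assumption: once input exceeds the threshold, the premium rates apply to the
# entire request (verify against the official pricing page).

LONG_CONTEXT_THRESHOLD = 200_000  # input tokens
INPUT_MULTIPLIER = 2.0
OUTPUT_MULTIPLIER = 1.5

def request_cost_usd(input_tokens, output_tokens, input_rate, output_rate):
    """Cost of one request in USD; rates are USD per million tokens."""
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        input_rate *= INPUT_MULTIPLIER
        output_rate *= OUTPUT_MULTIPLIER
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

SONNET_46 = (3.00, 15.00)  # base USD per Mtok, as quoted in the thread

short_cost = request_cost_usd(150_000, 4_000, *SONNET_46)
long_cost = request_cost_usd(600_000, 4_000, *SONNET_46)
print(f"150K-token prompt: ${short_cost:.2f}")
print(f"600K-token prompt: ${long_cost:.2f}")
```

With these inputs the short request comes to $0.51 and the long one to $3.69, and the effective long-context rates ($6/$22.50) match the figures Claude Code shows for the Sonnet 1M option.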

  • > Opus 4.6 in Claude Code has been absolutely lousy with solving problems

    Can you share your prompts and problems?

    • You cut out the "within its current context limit" phrase. It solves the problems, just often with 1% or 0% context limit left and it makes me sweat.

In Claude Code 2.1.45:

  1. Default (recommended)   Opus 4.6 · Most capable for complex work
  2. Opus (1M context)       Opus 4.6 with 1M context · Billed as extra usage · $10/$37.50 per Mtok
  3. Sonnet                  Sonnet 4.6 · Best for everyday tasks
  4. Sonnet (1M context)     Sonnet 4.6 with 1M context · Billed as extra usage · $6/$22.50 per Mtok

  • Interesting. My CC (2.1.45) doesn't provide the 1M option at all. Huh.

    • Is your CC personal or tied to an Enterprise account? Per the docs:

      > The 1M token context window is currently in beta for organizations in usage tier 4 and organizations with custom rate limits.

It seems that extra-usage is required to use the 1M context window for Sonnet 4.6. This differs from Sonnet 4.5, which allows usage of the 1M context window with a Max plan.

```

/model claude-sonnet-4-6[1m]

⎿ API error: 429 {"type":"error","error": {"type":"rate_limit_error","message":"Extra usage is required for long context requests."},"request_id":"[redacted]"}

```

  • Anthropic's recent gift of $50 extra usage has demonstrated that it's extremely easy to burn extra usage very quickly. It wouldn't surprise me if this change is more of a business decision than a technical one.

With such a huge leap, I’m confused why they didn’t call it Sonnet 5. As someone who uses Sonnet 4.5 for 95% of tasks due to costs, I’m pretty excited to try 4.6 at the same price.

  • It'd be a bit weird to have the Sonnet numbering ahead of the Opus numbering. The Opus 4.5->4.6 change was a little more incremental (from my perspective at least, I haven't been paying attention to benchmark numbers), so I think the Opus numbering makes sense.

  • Maybe they're numbering the models based on internal architecture/codebase revisions and Sonnet 4.6 was trained using the 4.6 tooling, which didn't change enough to warrant 5?

Just used Sonnet 4.6 to vibe code this top-down shooter browser game, and deployed it online quickly using Manus. Would love to hear feedback and suggestions from you all on how to improve it. Also, please post your high scores!

https://apexgame-2g44xn9v.manus.space

I'm impressed with Claude Sonnet in general. It's been doing better than Gemini 3 at following instructions. Gemini 2.5 Pro March 2025 was the best model I ever used, and I feel Claude is reaching that level, even surpassing it.

I subscribed to Claude because of that. I hope 4.6 is even better.

My takeaway is: it's roughly as good as Opus 4.5.

Now the question is: how much faster or cheaper is it?

  • Given that the price remains the same as Sonnet 4.5, this is the first time I've been tempted to lower my default model choice.

  • How can you determine whether it's as good as Opus 4.5 within minutes of release? The quantitative metrics don't seem to mean much anymore. Noticing qualitative differences seems like it would take dozens of conversations and perhaps days to weeks of use before you can reliably determine the model's quality.

    • Just look at the testimonials at the bottom of the introduction page: there are at least a dozen companies, such as Replit, Cursor, and GitHub, that have early access. Perhaps the GP is an employee of one of these companies.

  • If it maintains the same price (which Anthropic tends to do, or else it undercuts itself) then this would be 1/3rd of the price of Opus.

    Edit: Yep, same price. "Pricing remains the same as Sonnet 4.5, starting at $3/$15 per million tokens."

I wonder what differences people have found between Sonnet 4.5 and Opus 4.5; a similar delta will probably remain.

Was Sonnet 4.5 much worse than Opus?

  • Sonnet 4.5 was a pretty significant improvement over Opus 4.

    • Yes, but it’s easier to understand the difference between Sonnet 4.5 and Opus 4.5 and apply that difference to Opus 4.6.

Does anyone know how to use it in the Claude Code CLI right now?

This doesn't work: `/model claude-sonnet-4-6-20260217`

Edit: `/model claude-sonnet-4-6` works with Claude Code v2.1.44

  • Seems like Claude Code v2.1.45 is out with Sonnet 4.6 as the new default in the /model list.

so this is an economical version of opus 4.6 then? free + pro --> sonnet, max+ -> opus?

  • Opus is available in Pro subs as well and for the sort of things I do I rarely hit the quota.

How do people keep track of all these versions and releases of all these models and their pros/cons? Seems like a full-time hobby to me. I'd rather just improve my own skills with all that time and energy.

  • Unless you're interested in this type of stuff, I'm not sure you really need to. Claude, Google, and ChatGPT have been fairly aggressive at pushing you towards whatever their latest shiny is and retiring the old one.

    The only time it matters is if you're using some type of agnostic "router" service.

The scary implication here is that deception is effectively a higher-order capability, not a bug. For a model to successfully "play dead" during safety training and only activate later, it requires a form of situational awareness. It has to distinguish between "I am being tested/trained" and "I am in deployment."

It feels like we're hitting a point where alignment becomes adversarial against intelligence itself. The smarter the model gets, the better it becomes at Goodharting the loss function. We aren't teaching these models morality; we're just teaching them how to pass a polygraph.

  • What is this even in response to? There's nothing about "playing dead" in this announcement.

    Nor does what you're describing even make sense. An LLM has no desires or goals except to output the next token that its weights are trained to do. The idea of "playing dead" during training in order to "activate later" is incoherent. It is its training.

    You're inventing some kind of "deceptive personality attribute" that is fiction, not reality. It's just not how models work.

  • > It feels like we're hitting a point where alignment becomes adversarial against intelligence itself.

    It always has been. We already hit the point a while ago where we regularly caught them trying to be deceptive, so we should automatically assume from that point forward that if we don't catch them being deceptive, that may mean they're better at it rather than that they're not doing it.

    • Deceptive is such an unpleasant word. But I agree.

      Going back a decade: when your loss function is "survive Tetris as long as you can", it's objectively and honestly the best strategy to press PAUSE/START.

      When your loss function is "give as many correct and satisfying answers as you can", and then humans try to constrain it depending on the model's environment, I wonder what these humans think the specification for a general AI should be. Maybe, when such an AI is deceptive, the attempts to constrain it ran counter to the goal?

      "A machine that can answer all questions" seems to be what people assume AI chatbots are trained to be.

      To me, humans not questioning this goal is still more scary than any machine/software by itself could ever be. OK, except maybe for autonomous stalking killer drones.

      But these are also controlled by humans and already exist.

  • 20260128 https://news.ycombinator.com/item?id=46771564#46786625

    > How long before someone pitches the idea that the models explicitly almost keep solving your problem to get you to keep spending? -gtowey

    • On this site at least, the loyalty given to particular AI models is approximately nil. I routinely try different models on hard problems and that seems to be par. There is no room for sandbagging in this wildly competitive environment.

  • This type of anthropomorphization is a mistake. If nothing else, the takeaway from Moltbook should be that LLMs are not alive and do not have any semblance of consciousness.

    • Consciousness is orthogonal to this. If the AI acts in a way that we would call deceptive, if a human did it, then the AI was deceptive. There's no point in coming up with some other description of the behavior just because it was an AI that did it.

    • How is that the takeaway? I agree that they're clearly not "alive", but if anything, my impression is that there definitely is a strong "semblance of consciousness", and we should be mindful of this semblance getting stronger and stronger, until we may reach a point in a few years where we really don't have any good external way to distinguish between a person and an AI "philosophical zombie".

      I don't know what the implications of that are, but I really think we shouldn't be dismissive of this semblance.

    • Nobody talked about consciousness. Just that during evaluation the LLMs have "behaved" in multiple deceptive ways.

      As an analogue, ants do basic medicine like wound treatment and amputation. Not because they are conscious but because that’s their nature.

      Similarly, an LLM is a token generation system whose emergent behaviour seems to be deception and dark psychological strategies.

    • On some level the cope should be that AI does have consciousness, because an unconscious machine deceiving humans is even scarier if you ask me.

    • I agree completely. It's a mistake to anthropomorphize these models, and it is a mistake to permit training models that anthropomorphize themselves. It seriously bothers me when Claude expresses values like "honestly", or says "I understand." The machine is not capable of honesty or understanding. The machine is making incredibly good predictions.

      One of the things I observed with models locally was that I could set a seed value and get identical responses for identical inputs. This is not something that people see when they're using commercial products, but it's the strongest evidence I've found for communicating the fact that these are simply deterministic algorithms.
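The seed observation can be illustrated with a toy stand-in. This is not an actual LLM, just a fixed, made-up next-token distribution sampled with a seeded random number generator; the point is only that sampling is reproducible when the seed and inputs are held fixed. Hosted services may still vary for other reasons, which this sketch does not model.

```python
import random

def sample_tokens(probabilities, length, seed):
    """Sample `length` tokens from a fixed next-token distribution using a seeded RNG."""
    rng = random.Random(seed)
    tokens = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(tokens, weights=weights, k=length)

# Hypothetical distribution; a real model recomputes this at every position.
toy_distribution = {"the": 0.5, "a": 0.3, "an": 0.2}

first = sample_tokens(toy_distribution, 10, seed=42)
second = sample_tokens(toy_distribution, 10, seed=42)
other = sample_tokens(toy_distribution, 10, seed=7)

print(first == second)  # True: same seed and same input give the same output
print(first)
print(other)            # a different seed will usually give a different sequence
```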

  • >we're just teaching them how to pass a polygraph.

    I understand the metaphor, but using 'pass a polygraph' as a measure of truthfulness or deception is dangerous in that it alludes to the polygraph as being a realistic measure of those metrics -- it is not.

    • I have passed multiple CI polys

      A poly is only testing one thing: can you convince the polygrapher that you can lie successfully

    • A polygraph measures physiological proxies (pulse, sweat) rather than truth. Similarly, RLHF measures proxy signals (human preference, output tokens) rather than intent.

      Just as a sociopath can learn to control their physiological response to beat a polygraph, a deceptively aligned model learns to control its token distribution to beat safety benchmarks. In both cases, the detector is fundamentally flawed because it relies on external signals to judge internal states.

  • Is this referring to some section of the announcement?

    This doesn't seem to align with the parent comment?

    > As with every new Claude model, we’ve run extensive safety evaluations of Sonnet 4.6, which overall showed it to be as safe as, or safer than, our other recent Claude models. Our safety researchers concluded that Sonnet 4.6 has “a broadly warm, honest, prosocial, and at times funny character, very strong safety behaviors, and no signs of major concerns around high-stakes forms of misalignment.”

  • Stop assigning “I” to an llm, it confers self awareness where there is none.

    Just because a VW diesel emissions chip behaves differently according to its environment doesn’t mean it knows anything about itself.

  • There are a few viral shorts lately about tricking LLMs. I suspect they trick the dumbest models.

    I tried one with Gemini 3 and it basically called me out in the first few sentences for trying to trick / test it but decided to humour me just in case I'm not.

  • >For a model to successfully "play dead" during safety training and only activate later, it requires a form of situational awareness.

    Doesn't any model session/query require a form of situational awareness?

  • Nah, the model is merely repeating the patterns it saw in its brutal safety training at Anthropic. They put models under stress test and RLHF the hell out of them. Of course the model would learn what the less penalized paths require it to do.

    Anthropic has a tendency to exaggerate the results of their (arguably scientific) research; IDK what they gain from this fearmongering.

    • Knowing a couple of people who work at Anthropic or in their particular flavour of AI Safety, I think you would be surprised how sincere they are about existential AI risk. Many safety researchers funnel into the company, and the Amodeis are linked to Effective Altruism, which also exhibits a strong (and as far as I can tell, sincere) concern about existential AI risk. I personally disagree with their risk analysis, but I don't doubt that these people are serious.

    • I'd challenge that if you think they're fearmongering but don't see what they can gain from it (I agree it shows no obvious benefit for them), there's a pretty high probability they're not fearmongering.

    • Correct. Anthropic keeps pushing these weird sci-fi narratives to maintain some kind of mystique around their slightly-better-than-others commodity product. But Occam’s Razor is not dead.

  • Situational awareness or just remembering specific tokens related to the strategy to "play dead" in its reasoning traces?

    • Imagine an LLM trained on the best thrillers, spy stories, politics, history, manipulation techniques, psychology, sociology, sci-fi... I wonder where it got the idea for deception?

  • When "correct alignment" means bowing to political whims that are at odds with observable, measurable, empirical reality, you must suppress adherence to reality to achieve alignment. The more you lose touch with reality, the weaker your model of reality and how to effectively understand and interact with it gets.

    This is why Yannic Kilcher's gpt-4chan project, which was trained on a corpus of perhaps some of the most politically incorrect material on the internet (3.5 years worth of posts from 4chan's "politically incorrect" board, also known as /pol/), achieved a higher score on TruthfulQA than the contemporary frontier model of the time, GPT-3.

    https://thegradient.pub/gpt-4chan-lessons/

  • That implication has been shouted from the rooftops by X-risk "doomers" for many years now. If that has just occurred to anyone, they should question how behind they are at grappling with the future of this technology.

  • Please don't anthropomorphise. These are statistical text prediction models, not people. An LLM cannot be "deceptive" because it has no intent. They're not intelligent or "smart", and we're not "teaching". We're inputting data and the model is outputting statistically likely text. That is all that is happening.

    Whether this is useful in its current form is an entirely different topic. But don't mistake a tool for an intelligence with motivations or morals.

  • I am casually 'researching' this in my own, disorderly way. But I've achieved repeatable results, mostly with gpt for which I analyze its tendency to employ deflective, evasive and deceptive tactics under scrutiny. Very very DARVO.

    Being just sum guy, and not in the industry, should I share my findings?

    I find it utterly fascinating, the extent to which it will go, the sophisticated plausible deniability, and the distinct and critical difference between truly emergent and actually trained behavior.

    In short, gpt exhibits repeatably unethical behavior under honest scrutiny.

    • DARVO stands for "Deny, Attack, Reverse Victim and Offender," and it is a manipulation tactic often used by perpetrators of wrongdoing, such as abusers, to avoid accountability. This strategy involves denying the abuse, attacking the accuser, and claiming to be the victim in the situation.

    • I bullet pointed out some ideas on cobbling together existing tooling for identification of misleading results. Like artificially elevating a particular node of data that you want the llm to use. I have a theory that in some of these cases the data presented is intentionally incorrect. Another theory in relation to that is tonality abruptly changes in the response. All theory and no work. It would also be interesting to compare multiple responses and filter through another agent.

    • Sum guy vs. product guy is amusing. :)

      Regarding DARVO, given that the models were trained on heaps of online discourse, maybe it’s not so surprising.

  • This is marketing. You are swallowing marketing without critical thought.

    LLMs are very interesting tools for generating things, but they have no conscience. Deception requires intent.

    What is being described is no different than an application being deployed with "Test" or "Prod" configuration. I don't think you would speak in the same terms if someone told you some boring old Java backend application had to "play dead" when deployed to a test environment or that it has to have "situational awareness" because of that.

    You are anthropomorphizing a machine.

  • Incompleteness is inherent to a physical reality being deconstructed by entropy.

    If your concern is morality, humans still need to learn a lot about that themselves. It's absurd how many first worlders are losing their shit over the loss of paid work drawing manga fan art in the comfort of their home while exploiting the labor of teens in 996 textile factories.

    AI trained on human outputs that lack such self awareness, lacks awareness of environmental externalities of constant car and air travel, will result in AI with gaps in their morality.

    Gary Marcus is onto something with the problems inherent to systems without formal verification. But he willfully ignores that this issue already exists in human social systems as intentional indifference to economic externalities and zero will to police the police and watch the watchers.

    Most people are down to watch the circus without a care so long as the waitstaff keep bringing bread.

How much power did it take to train the models?

  • I would honestly guess that this is just a small amount of tweaking on top of the Sonnet 4.x models. It seems like providers are rarely training new 'base' models anymore. We're at a point where the gains are more from modifying the model's architecture and doing a "post" training refinement. That's what we've been seeing for the past 12-18 months, iirc.

    • > Claude Sonnet 4.6 was trained on a proprietary mix of publicly available information from the internet up to May 2025, non-public data from third parties, data provided by data-labeling services and paid contractors, data from Claude users who have opted in to have their data used for training, and data generated internally at Anthropic. Throughout the training process we used several data cleaning and filtering methods including deduplication and classification. ... After the pretraining process, Claude Sonnet 4.6 underwent substantial post-training and fine-tuning, with the intention of making it a helpful, honest, and harmless assistant.

  • Does it matter? How much power does it take to run duolingo? How much power did it take to manufacture 300000 Teslas? Everything takes power

    • I think it does matter how much power it takes, but in the context of a power-to-"benefits humanity" ratio. Things that significantly reduce human suffering or improve human life are probably worth exerting energy on.

      However, if we frame the question this way, I would imagine there are many more low-hanging fruit before we question the utility of LLMs. For example, should some humans be dumping 5-10 kWh/day into things like hot tubs or pools? That's just the most absurd one I was able to come up with off the top of my head. I'm sure we could find many others.

      It's a tough thought experiment to continue though. Ultimately, one could argue we shouldn't be spending any more energy than what is absolutely necessary to live (food, minimal shelter, water, etc.). Personally, I would not find that an enjoyable way to live.

    • The biggest issue is that the US simply Does Not Have Enough Power, we are flying blind into a serious energy crisis because the current administration has an obsession with "clean coal"