← Back to context

Comment by throwaw12

2 days ago

are we cooked yet?

Benchmarks look very impressive! even if they're flawed, it still translates to real world improvements

People say we're cooked every single day. The only response is to continue life as if we aren't. When we are, you won't have to ask that question.

Yep, I think the lede might be buried here and we're probably cooked (assuming you mean SWEs, but the writing has been on the wall for 4 months.)

I guess I'm still excited. What's my new profession going to be? Longer term, are we going to solve diseases and aging? Or are the ranks going to thin from 10B to 10000 trillionaires and world-scale con-artist misanthropes plus their concubines?

  • Your new profession will be attempting to find enough gig work to eat. You will also be competing with self-driving taxis, so there's that as well.

    • I need to start SaaS for getting people to start doing lunges and squats so they can carry others around on their back, I need a founding engineer, a founding marketer, and 100m hard currency.

  • If wealth becomes too captured at the top, the working class become unable to be profitably exploited - squeezing blood from a stone.

    When that happens, the ultra wealthy dynasties begin turning on each other. Happens frequently throughout history - WWI the last example.

    Your options become choosing a trillionaire to swear fealty to and fight in their wars hoping your side wins, or I guess trying to walk away and scrape out a living somewhere not worth paying attention to.

    Or, I suppose, revolution, but the last one with persistent success was led by Mao and required throwing literally millions of peasants against walls of rifles. Not sure it'd work against drones.

There is an entire section on crafting chemical/bio weapons so yeah I think we are cooked.

  • There's been a section on this in nearly every system card anthropic has published so this isn't a new thing - and, this model doesn't have particularly higher risk than past models either:

    > 2.1.3.2 On chemical and biological risks

    > We believe that Mythos Preview does not pass this threshold due to its noted limitations in open-ended scientific reasoning, strategic judgment, and hypothesis triage. As such, we consider the uplift of threat actors without the ability to develop such weapons to be limited (with uncertainty about the extent to which weapons development by threat actors with existing expertise may be accelerated), even if we were to release the model for general availability. The overall picture is similar to the one from our most recent Risk Report.

  • LLMs are useless for this type of thing for the same reason that the Anarchist Cookbook has always been. The skills required to convert text into complicated reactions completing as intended (without killing yourself) is an art that's never actually written down anywhere, merely passed orally from generation to generation. Impossible for LLMs to learn stuff that's not written down.

    This is the same reason why LLMs are not doing well at science in general - the tricky part of doing scientific research (indeed almost all of the process) never gets written down, so LLMs cannot learn it.

    Imagine if we never preserved source code, just preserved the compiled output and started from scratch every time we wrote a new version of a program. No Github, just marketing fluff webpages describing what software actually did. Libraries only available as object code with terse API descriptions. Imagine how shit LLMs would be at SWE if that was the training corpus...