There's been a section on this in nearly every system card Anthropic has published, so this isn't a new thing - and this model isn't at particularly higher risk than past models either:
> 2.1.3.2 On chemical and biological risks
> We believe that Mythos Preview does not pass this threshold due to its noted limitations in open-ended scientific reasoning, strategic judgment, and hypothesis triage. As such, we consider the uplift of threat actors without the ability to develop such weapons to be limited (with uncertainty about the extent to which weapons development by threat actors with existing expertise may be accelerated), even if we were to release the model for general availability. The overall picture is similar to the one from our most recent Risk Report.
LLMs are useless for this type of thing for the same reason that the Anarchist Cookbook has always been. The skill required to turn text into complicated reactions that complete as intended (without killing yourself) is an art that's never actually written down anywhere, merely passed orally from generation to generation. It's impossible for LLMs to learn stuff that's not written down.
This is the same reason why LLMs are not doing well at science in general - the tricky part of doing scientific research (indeed almost all of the process) never gets written down, so LLMs cannot learn it.
Imagine if we never preserved source code, just preserved the compiled output and started from scratch every time we wrote a new version of a program. No GitHub, just marketing fluff webpages describing what the software did. Libraries only available as object code with terse API descriptions. Imagine how shit LLMs would be at SWE if that were the training corpus...
There's still RL