Comment by hgoel

1 day ago

They do tend to make a lot of noise about it for PR, but at the same time the actual safety research they present seems relatively grounded in practical reality. One example is the quote someone posted here about how the Mythos model apparently tends to try to bypass safety systems if they get in the way of what it has been asked to do.

Sure, a big part of this is PR about how smart their model apparently is, but the failure mode they're describing is also pretty relevant for deploying LLM-based systems.
