Comment by zamalek
17 hours ago
> Probably more interesting
It is widely suspected that self-inflicted "bad news" ("Mythos is so dangerous we just can't give the public access to it") is nothing more than Dario's typical style of marketing - keep in mind that they have an IPO coming up, because he certainly factors that into everything he says in public (as is his responsibility, to be fair).
An alternative reason for delaying the model might not be "we are trying to make it safe." It could be "we don't know how to host this thing at scale, or cost-effectively".
GPT 5.5 has already been shown to be as adept as Mythos at finding vulnerabilities.
Finally, laymen massively underestimate the importance of the harness for model performance. OpenHands existed long before Claude Code, but Claude Code changed everything because of the clever hand-holding it does. Mythos is definitely more than just a model.
One capability that I see missing from Opus is the ability to understand an entire system. My hope is that a Mythos-class model will be able to comprehend even something as complicated as an IoT system with a hardware and firmware layer, multiple backend APIs, and different kinds of API and web clients.
The main limitation we've had with agentic coding is understanding a system that spans processes running on different machines and architectures.
Interesting — I haven't seen that problem, and I do have a system that has different APIs, web clients, non-web clients and embedded clients.
What sort of clever hand-holding does Claude Code do?
https://github.com/Piebald-AI/claude-code-system-prompts
It's interesting that (for example, for the explore agent https://github.com/Piebald-AI/claude-code-system-prompts/blo... ) they use a persona ("you are a file search specialist") and a "your strengths" framing. I thought that was largely considered useless, or even counterproductive, nowadays? Does anyone know more about this stuff?
There are also techniques that have been discovered since:
* Ralph Wiggum loops
* Simply not allowing an agent to stop its turn until all tasks are marked as done
* Sub agents over worktrees
* Context compression
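
For anyone wondering what the second and fourth bullets look like in practice, here is a rough Python sketch. To be clear, everything in it is hypothetical: `Task`, `compact`, `run_until_done` and `fake_model_step` are names I made up for illustration, the "model" is a stub that finishes one task per turn, and the "compression" is just a placeholder summary line. It is not how Claude Code or any real harness actually implements this.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    done: bool = False


def compact(history: List[str], keep_last: int = 4) -> List[str]:
    """Naive context compression: replace older messages with one summary line."""
    if len(history) <= keep_last:
        return history
    dropped = len(history) - keep_last
    # A real harness would ask the model for a summary; this is a placeholder.
    summary = f"[summary of {dropped} earlier messages]"
    return [summary] + history[-keep_last:]


def run_until_done(
    tasks: List[Task],
    model_step: Callable[[List[str], List[Task]], str],
    max_turns: int = 20,
) -> List[str]:
    """Keep re-prompting while any task is open, up to max_turns as a safety cap."""
    history: List[str] = []
    for _ in range(max_turns):
        pending = [t.name for t in tasks if not t.done]
        if not pending:
            break  # only now is the agent allowed to end its turn
        history.append(f"harness: still pending: {', '.join(pending)}")
        history.append(model_step(history, tasks))
        history = compact(history)
    return history


def fake_model_step(history: List[str], tasks: List[Task]) -> str:
    """Stand-in for a real model call: completes one pending task per turn."""
    for task in tasks:
        if not task.done:
            task.done = True
            return f"model: finished {task.name}"
    return "model: nothing to do"


if __name__ == "__main__":
    todo = [Task("write tests"), Task("fix bug"), Task("update docs")]
    print("\n".join(run_until_done(todo, fake_model_step)))
```

The point is that the harness, not the model, decides when the turn is over: it checks the task list and pushes the pending items back into the context on every iteration. The `max_turns` cap is my own addition so a stuck agent can't loop forever.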