The UK’s AI Security Institute is measuring offensive autonomy like it’s Moore’s Law — and the numbers are not calming.
AISI says frontier models’ autonomous cyber task length has been doubling every few months, and a newer Mythos Preview checkpoint solved a previously unsolved range. Meanwhile, Anthropic is briefing the Financial Stability Board. The non-obvious story: “risk” is becoming a scheduling problem for defenders, not a philosophical debate.
What Happened
The UK AI Security Institute (AISI) published an update on how fast autonomous cyber capability is advancing, and it reads like a progress chart nobody asked for. In their latest testing, a newer Mythos Preview checkpoint solved both of AISI’s cyber ranges — including “Cooling Tower,” which had previously been unsolved — in 3 of 10 attempts.
Separately, The Guardian reports Anthropic plans to brief the Financial Stability Board (FSB) on Mythos and cyber risk. That’s the moment a model stops being “a security story” and becomes “a systemic risk meeting.”
The Non-Obvious Angle: Capability Jumps Without New Releases
AISI’s key warning is not just “models are getting better.” It’s that capability estimates can jump meaningfully without a new public release. Later checkpoints, more compute, and inference-time scaling can change the threat model on a timeline that does not respect your quarterly patch cycle.
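To put rough numbers on that mismatch, here is a minimal back-of-the-envelope sketch. It assumes clean exponential growth and shows how much a capability measure would grow inside one quarterly patch cycle. The doubling periods of 3 and 6 months are hypothetical placeholders, not AISI's figures, and `growth_factor` is an illustrative helper, not anything from AISI's methodology.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth over elapsed_months for a quantity that
    doubles every doubling_months (assumes clean exponential growth)."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (elapsed_months / doubling_months)


PATCH_CYCLE_MONTHS = 3.0  # one quarterly patch cycle

# Hypothetical doubling periods for illustration only; not AISI's numbers.
for doubling_months in (3.0, 6.0):
    factor = growth_factor(PATCH_CYCLE_MONTHS, doubling_months)
    print(f"doubling every {doubling_months:g} months -> x{factor:.2f} per patch cycle")
```

Under these assumptions, a 3-month doubling period means the measure doubles within a single patch cycle, and a 6-month period still gives roughly a 1.41x increase. Real capability curves are noisier than this, which is part of AISI's point about jumps between releases.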
This is why the human-in-the-loop reality check matters. Axios notes early adopters still report false positives and workflow friction. Great. That’s not comfort. That’s a window.
So What’s the Real Risk?
- Defenders: discovery and chain-building get faster than triage and patching.
- Critical infrastructure: the harder a system is to patch, the more it becomes a permanent target class.
- Regulators: the policy pressure shifts from “should we regulate models?” to “how do we keep cyber hygiene from collapsing under volume?”
The Singularity Soup Take
We don’t need a sentient superhacker to have a bad time. We just need bug-finding and exploit-chaining to become cheap, repeatable, and fast — while patching stays slow, political, and fragile. Mythos is not the apocalypse. It’s the stopwatch.
What to Watch
- Time-horizon measurements: whether AISI’s “doubling every few months” keeps holding as tests get harder.
- Controlled-access templates: whether “trusted access” becomes a standard containment pattern across labs.
- Patch throughput: whether orgs actually compress emergency change windows — or just issue more guidance about best practices and hope.
Sources
UK AI Security Institute (AISI) — "How fast is autonomous AI cyber capability advancing?"
The Guardian — "Anthropic to share Mythos cyber flaw findings with global finance watchdog"
Axios — "Tapping the powers of Mythos-like models still requires human intervention"
Scientific American — "What is Mythos, Anthropic’s unreleased AI model, and how worried should we be?"