Cast your mind way back to March of 2024, almost a century in AI years. That was when Microsoft announced plans for an astounding $100 billion investment in AI data centers. (The projected cost for Stargate has since been bumped up to $500 billion.)
At that time I was asked “what could possibly make AI worth $100 billion to Microsoft?” My answer: imagine if Microsoft could find and fix ALL of the vulnerabilities in its software? That would easily be worth $100 billion. Instead of being the root cause of all of our security issues Microsoft could be the secure option for software products. The total annual cost imposed on the world for Patch Tuesday is estimated to be as high as $225 billion according to a model I had ChatGPT construct. Should Microsoft spend $100 billion to save its customers $225 billion a year? Yes.
An AI with those capabilities is here. Only it is not an OpenAI model, it is Claude Mythos Preview from Anthropic. Download the 244 page system card here.
It’s April 9. The latest models from OpenAI and Anthropic were released in February. They were created using earlier versions of themselves. The Intelligence Explosion is happening.
Mythos is in “Preview” because in developing the model to be the next generation of general purpose LLMs Anthropic realized it was very good at discovering vulnerabilities in code and chaining together multiple vulns to create sophisticated exploits.
A couple of examples:
Somehow take a blind SQL injection vulnerability, one that executes arbitrary SQL commands but returns nothing, and use it to gain control of an account. Watch Nicholas Carlini from Anthropic describe this at the [un]prompted conference in San Francisco. He is probably talking about Mythos but it was still under wraps at the time.
From Anthropic’s announcement of Project Glasswing.
Mythos Preview found a 27-year-old vulnerability in OpenBSD—which has a reputation as one of the most security-hardened operating systems in the world and is used to run firewalls and other critical infrastructure. The vulnerability allowed an attacker to remotely crash any machine running the operating system just by connecting to it;
Project Glasswing is Anthropic’s stopgap solution to releasing a model so powerful that it can own any software anywhere. Only certain partners are allowed to play with it.
Over the past few weeks, we have used Claude Mythos Preview to identify thousands of zero-day vulnerabilities
Thousands.
Does anyone have the infrastructure to deal with thousands of new zero-days? Can the scanners keep up? Can the vulnerability enrichment solutions keep up? Imagine you are going to be patching every single app in your org each and every day. That is where this is going.
This feels like 2010. Virus signatures had exploded from a handful a week to 1,000 a day to 30,000 to 60,000 per day. The AV vendors were pushing signature updates six times a day. The model broke and the industry collapsed (read up on the demise of Symantec and McAfee and the consolidation of AV vendors.)
We already know that bug research has been turned on its head by the use of AI. What happens when thousands of researchers use models like Mythos to discover new vulnerabilities? Do software companies (think Oracle, SAP, SFDC) even employ enough people to address all the disclosed vulnerabilities coming their way? Can they create and push patches fast enough?
Let’s say “over the past few weeks” means seven weeks and “thousands of zero-days” means 2,000. That is 285 new zero-days a week or roughly 15,000 a year. In 2025 there were 48,000 new CVEs cataloged. One team can now increase the total annual CVEs by 30%. What happens when thousands of researchers are discovering thousands of zero-days every few weeks? There are currently 360K CVEs. How can all of the systems scale to 3.6 million CVEs?
A whole bunch of things are going to break. Vulnerability Management writ large is going to break.
Anthropic has pushed ahead of the other AI labs for now. They are responsibly withholding Mythos Preview from general availability. But even the current models from all the labs are great at finding and exploiting vulnerabilities. The next models will be upon us shortly and its time to think about what this means for cybersecurity.
If you want to dig into both sides of the reaction to Mythos Preview watch these two pundits.
First the OMG this changes everything reaction:
Then the “ho-hum AI is just a stochastic prediction of the next token” view.
Enjoy.
Update: Nimitt Jhaveri pointed out in the comments an Axios scoop that reveals OpenAI is rolling out their own security tool. https://www.axios.com/2026/04/09/openai-new-model-cyber-mythos-anthopic
Update 2. AISLE™ tested the Mythos vuln discoveries on older models. 8/8 found the BSD vuln:
We tested Anthropic Mythos’s showcase vulnerabilities on small, cheap, open-weights models. They recovered much of the same analysis. AI cybersecurity capability is very jagged: it doesn’t scale smoothly with model size, and the moat is the system into which deep security expertise is built, not the model itself. Mythos validates the approach but it does not settle it yet.
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
Phoenix Security also uses existing models to discover and chain vulns. https://phoenix.security/claude-code-leak-to-vulnerability-three-cves-in-claude-code-cli-and-the-chain-that-connects-them/
Update 4-25-2026
XBOW ran benchmarks for vulnerability discovery against the latest models. GPT 5.5 (OpenAI’s just released model) blows the roof off their testing. Black box is fuzzing against compiled code (DAST). White box is against source code (SAST).
Bottom line: GPT-5.5 raises the floor in black box testing and blows past the ceiling in white box testing.



