How Project Glasswing and the Claude Mythos Preview are rewriting the rules of cybersecurity — before the rules rewrite themselves.
The glasswing butterfly, Greta oto, survives by being invisible. Its wings are almost entirely transparent — no scales, no color, just clear membrane — which means predators look right through it. It hides in plain sight. Anthropic named its most consequential cybersecurity initiative after this creature, and the metaphor is apt in more ways than one.
Hidden inside the world’s most critical software — operating systems, web browsers, hospital networks, financial infrastructure, energy grids — are thousands of vulnerabilities no human ever found. Not because they didn’t look. Because the bugs were that subtle. Hiding in plain sight, sometimes for decades, waiting.
On April 7, 2026, Anthropic announced that it had built a model capable of finding them. And it decided that the only responsible thing to do was to arm the defenders before the attackers found out.

The Model That Started as a Leak
The story of Claude Mythos Preview didn’t begin with a press release. It began with a blunder.
In late March 2026, internal Anthropic documents were found sitting in a publicly accessible data cache — the result of a misconfigured content management system, which Anthropic later attributed to human error. Among the exposed files was a draft blog post describing a new, unreleased model then codenamed “Capybara” — described therein as “by far the most powerful AI model we’ve ever developed” and a new tier of Claude entirely, “larger and more intelligent than our Opus models.”
The draft reportedly warned that the model was “currently far ahead of any other AI model in cyber capabilities” and that it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”
The leak was awkward. It was also, in a way, the clearest possible signal of what was coming.
Weeks later, Anthropic made it official. The model — renamed Claude Mythos Preview, from the ancient Greek for “utterance” or “narrative” — was formally announced not as a product launch, but as a warning and a call to arms. It would not be released to the public. Instead, a coalition of the world’s most important technology companies would be given private access to use it for one purpose: to find and fix vulnerabilities before the bad actors do.
That coalition is Project Glasswing.
A Reckoning, Not a Release
Anthropic CEO Dario Amodei was direct in the Project Glasswing launch video. “Claude Mythos Preview is a particularly big jump,” he said, according to Wired’s reporting. “We haven’t trained it specifically to be good at cyber. We trained it to be good at code, but as a side effect of being good at code, it’s also good at cyber.” He added: “More powerful models are going to come from us and from others. And so we do need a plan to respond to this.”
Anthropic’s Chief Science Officer Jared Kaplan was equally stark in an interview with The New York Times: “The goal is both to raise awareness and to give good actors a head start on the process of securing open-source and private infrastructure and code.”
And Logan Graham, head of Anthropic’s frontier red team — the internal group that stress-tests models for dangerous capabilities before release — described the moment plainly in an interview with Wired: “The real message is that this is not about the model or Anthropic. We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”
To understand why the industry took this seriously, you have to understand what Mythos Preview actually did during testing.
What the Model Found — and How It Found It
In the weeks before the announcement, Anthropic’s frontier red team set Mythos Preview loose on some of the world’s most important — and most carefully scrutinized — codebases. The results, detailed on Anthropic’s red team blog, were striking.
The model found thousands of zero-day vulnerabilities — previously unknown bugs that even the software’s own developers didn’t know existed — across every major operating system and every major web browser, plus a range of other critical software. Many were severe. Many had survived years or decades of human review and automated testing.
Three examples illustrate the scale of the discovery:
A 27-year-old bug in OpenBSD. OpenBSD is widely regarded as one of the most security-hardened operating systems in existence — it is used in firewalls, secure routers, and critical network infrastructure. Mythos Preview found a vulnerability in it that allowed an attacker to remotely crash any machine running the OS simply by connecting to it. The bug was 27 years old. It had never been found before. It is now patched.
A 16-year-old flaw in FFmpeg. FFmpeg is used by an enormous range of software to handle video encoding and decoding. The specific line of code containing the vulnerability had been hit by automated testing tools five million times without triggering a detection. Mythos found it on its own. It is now patched.
A Linux kernel privilege escalation chain. Mythos autonomously discovered and chained together multiple vulnerabilities in the Linux kernel — the software running most of the world’s servers — allowing an attacker to escalate from ordinary user access to complete control of the machine. It is now patched.
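The FFmpeg statistic is worth dwelling on: a coverage-guided fuzzer can execute a buggy line millions of times without ever supplying the one input that makes it misbehave. The deliberately contrived Python sketch below (hypothetical code, nothing to do with the real FFmpeg flaw) makes the gap between "covered" and "triggered" visible:

```python
def decode(data: bytes) -> int:
    """Hypothetical decoder -- illustrative only, not FFmpeg's code."""
    table = [0] * 256
    idx = data[0] ^ data[1]            # executed on *every* input,
                                       # so coverage tools report it as hit
    # Latent defect: an out-of-range read occurs only when
    # data[0] ^ data[1] == 0 AND data[2] == 0x7F -- roughly 1 in
    # 65,536 random inputs. A fuzzer can therefore exercise the
    # line above millions of times without ever observing a fault.
    if idx == 0 and data[2] == 0x7F:
        return table[data[3] + 256]    # IndexError: the actual crash
    return table[idx]
```

A model that reads the code and reasons about the interaction between the two fields can construct the triggering input directly, where blind input generation almost never stumbles onto it.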
What made these findings especially remarkable was not just what was found, but how. According to Anthropic’s red team post, nearly all of these vulnerabilities were identified entirely autonomously — without any human steering after the initial prompt asking it to find a security vulnerability. The model would read the code, hypothesize potential flaws, run the project in an isolated container to test its suspicions, iterate with debuggers, and output a complete bug report with a proof-of-concept exploit.
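That workflow — hypothesize, test in isolation, iterate on debugger feedback, emit a report with a proof of concept — can be sketched as a loop. Everything below is a hypothetical stand-in (a `ToyModel`, a `ToySandbox`, a fake bug at `parser.c:42`), not Anthropic's actual harness; in the real system the "suggest" step is the model reading source code, and the sandbox is an isolated container running the full project:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    location: str
    summary: str
    test_input: bytes

class ToySandbox:
    """Stand-in for the isolated container. The 'project' here is a
    parser that misbehaves when a length field exceeds the payload."""
    def run(self, data: bytes):
        declared = data[0]
        crashed = declared > len(data) - 1   # simulated out-of-bounds read
        return crashed, f"declared={declared} actual={len(data) - 1}"

class ToyModel:
    """Stand-in for the hypothesize step: it just escalates a length
    field, where the real model reasons over source code."""
    def __init__(self):
        self.guess = 0
    def suggest(self, history):
        self.guess += 1
        return Hypothesis("parser.c:42",
                          "length field trusted without bounds check",
                          bytes([self.guess, 0xAA]))

def hunt(model, sandbox, max_iterations=10):
    """The loop from the red-team description: hypothesize a flaw,
    test the suspicion in isolation, iterate on the debug trace,
    and on confirmation emit a report with a working reproducer."""
    history = []
    for _ in range(max_iterations):
        hyp = model.suggest(history)
        crashed, trace = sandbox.run(hyp.test_input)
        if crashed:
            return {"location": hyp.location,
                    "hypothesis": hyp.summary,
                    "poc": hyp.test_input}
        history.append((hyp, trace))   # feed failure back and iterate
    return None

report = hunt(ToyModel(), ToySandbox())
```

The structure is the point: once each step can be automated, the loop runs unattended from a single initial prompt.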
Graham told the Washington Examiner that the model had found “tens of thousands” of high-risk vulnerabilities in total, adding: “If we are crossing the Rubicon where you can functionally automate those capabilities and make them very cheap as well, then we’re in an entirely new world.”

The Exploit Problem
Finding bugs is one thing. Building exploits — the tools used to actually weaponize a vulnerability — is another entirely, and historically much harder. This is where Mythos Preview represents the most alarming leap.
As Anthropic’s red team blog notes, just one month earlier, the company had written publicly that its prior model, Opus 4.6, was “currently far better at identifying and fixing vulnerabilities than at exploiting them” — with a near-zero success rate at autonomous exploit development.
Mythos Preview is not in the same category.
In one test, Anthropic researchers took the same set of Firefox 147 JavaScript engine vulnerabilities — all subsequently patched in Firefox 148 — and asked both models to turn them into working exploits. Opus 4.6 succeeded twice out of several hundred attempts. Mythos Preview succeeded 181 times, and achieved register control on 29 additional attempts.
The exploits it generates are not simple, either. The red team blog describes a browser exploit that chained four separate vulnerabilities, involving a complex JIT heap spray that escaped both the browser’s renderer sandbox and the operating system’s sandbox. In another case, it autonomously wrote a remote code execution exploit against FreeBSD’s NFS server — granting full root access to unauthenticated users — by splitting a 20-gadget ROP chain across multiple network packets.
Perhaps most unsettling: Anthropic engineers with no formal security training have asked Mythos to find remote code execution vulnerabilities overnight and woken up the next morning to a complete, working exploit. The barrier to entry for sophisticated cyberattacks is collapsing.
Why Not Just… Not Release It?
This is the obvious question, and Anthropic has a clear answer.
“As the slogan goes, this is the least capable model we’ll have access to in the future,” Kaplan told The New York Times. The underlying capabilities that give Mythos its cybersecurity power are downstream of general improvements in coding, reasoning, and autonomy — not a special feature that can be trained away. Other labs will build models with comparable capabilities. The question isn’t whether the capability exists; it’s who gets to it first.
Anthropic’s strategy draws an explicit parallel to coordinated vulnerability disclosure — the established practice of giving software developers time to patch a bug before it’s disclosed publicly. The company is applying the same logic at a civilizational scale: give the defenders a meaningful head start before the tools are widely available.
Graham described the logic plainly in Wired: “We’ve seen Mythos Preview accomplish things that a senior security researcher would be able to accomplish. This has very big implications then for how capabilities like this should be released. Done not carefully, this could be a meaningful accelerant for attackers.”
He also acknowledged the stakes of getting it wrong: “There are a lot of really critical systems around the world, whether it’s physical infrastructure or things that protect your personal data, that are running on old versions of code. If these previously were mostly secure because it took a lot of human effort to attack them, does that paradigm of security even work anymore?”
The Coalition: Who’s In the Room
Anthropic assembled 12 founding partners for Project Glasswing — a coalition spanning cloud infrastructure, enterprise software, hardware, financial services, and open-source maintenance. Reuters confirmed the full list includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. An additional 40+ organizations that build or maintain critical software infrastructure also have access through the program.
The statements from partners were notably urgent — not the polished corporate endorsements typical of partnership launches.
Elia Zaitsev, CTO of CrowdStrike, said: “The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI. That is not a reason to slow down; it’s a reason to move together, faster.”
Anthony Grieco, SVP and Chief Security Officer of Cisco, said: “AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back. The old ways of hardening systems are no longer sufficient.”
Igor Tsyganskiy, EVP of Cybersecurity at Microsoft, said that when Mythos Preview was tested against CTI-REALM — Microsoft’s own open-source security benchmark — it “showed substantial improvements compared to previous models.”
Jim Zemlin, CEO of the Linux Foundation, highlighted the stakes for open source: “Open source software constitutes the vast majority of code in modern systems, including the very systems AI agents use to write new software.” Historically, he noted, open-source maintainers have been “left to figure out security on their own.” Glasswing, he argued, is a “credible path to changing that equation.”
Even Google — a direct Anthropic competitor in the AI space — joined. “It’s always been critical that the industry work together on emerging security issues,” said Heather Adkins, Google’s VP of Security Engineering.
The breadth of the coalition is itself a signal: even rivals agree the risk is real enough to cooperate.
The Money Behind the Mission
Anthropic is putting significant resources behind the initiative. Per the official announcement:
- $100 million in model usage credits committed to Project Glasswing participants
- $2.5 million donated to Alpha-Omega and the OpenSSF through the Linux Foundation
- $1.5 million donated to the Apache Software Foundation
- An additional $4 million in total direct donations to open-source security organizations
Open-source software maintainers who are not part of the founding coalition can apply for access through the Claude for Open Source program.
After the research preview period, Mythos Preview will be available to participants at $25 per million input tokens and $125 per million output tokens, accessible through the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
The Benchmark Story: A Model in a Different League
Beyond the real-world vulnerability findings, the performance benchmarks tell their own story. According to Anthropic, Mythos Preview represents a substantial leap over Claude Opus 4.6 across every dimension measured:
On CyberGym (cybersecurity vulnerability reproduction): Mythos Preview scores 83.1% vs. Opus 4.6’s 66.6%.
On SWE-bench Verified (real-world software engineering tasks): 93.9% vs. 80.8%.
On Humanity’s Last Exam (expert-level academic questions, without tools): 56.8% vs. 40.0%.
On GPQA Diamond (graduate-level science reasoning): 94.6% vs. 91.3%.
Crucially, Mythos Preview scored 86.9% on BrowseComp, versus Opus 4.6’s 83.7%, while using 4.9× fewer tokens — suggesting not just greater capability but greater efficiency.
These benchmarks, per Anthropic’s red team researchers, are now mostly saturated — Mythos has largely outgrown them. That’s why the team turned to real-world zero-day discovery as the true measure of capability: if a model finds a bug that has never appeared in training data, the discovery is, by definition, genuine reasoning, not recall.
The Safety Strategy — and Its Tensions
Anthropic has been explicit: Claude Mythos Preview will not be released to the general public. But the company is also clear that “our eventual goal is to enable our users to safely deploy Mythos-class models at scale.”
The path to that goal runs through a carefully staged safeguard development process. Per the official announcement, Anthropic plans to launch new cybersecurity safeguards with an upcoming Claude Opus model — testing and refining them with a model that “does not pose the same level of risk as Mythos Preview” — before eventually enabling broader access to Mythos-class capabilities. Security professionals whose legitimate work is affected by those safeguards will be able to apply to an upcoming Cyber Verification Program.
This position puts Anthropic in the company’s now-familiar paradox: it is simultaneously building the most capable models it has ever built, openly advertising that those models could be dangerous, and asking the world to trust its judgment about how and when to release them. As The New York Times noted, “Anthropic occupies an unusual position in today’s AI landscape. It is racing to build increasingly powerful AI systems, and making billions of dollars selling access to those systems, while also drawing attention to the risks its technology poses.”
The analogy Anthropic draws is to software fuzzers — tools like AFL that, when first widely deployed, raised concerns about enabling attackers to identify vulnerabilities faster. They did. But they became foundational defensive infrastructure. The red team blog argues the same trajectory will eventually hold for AI: “In the long term, we expect it will be defenders who will more efficiently direct resources and use these models to fix bugs before new code ever ships.” The danger lies in the transitional period between now and that equilibrium.
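The mechanics of that analogy are worth making concrete. Stripped to its essence, an AFL-style mutation fuzzer is a loop that perturbs known-good inputs and watches the target for faults — and every fault it finds arrives with a free reproducer, which is exactly the artifact defenders need to write a patch. A minimal illustrative sketch (hypothetical `toy_parser` target and single-byte mutation; real fuzzers like AFL add coverage feedback, corpus management, and process isolation):

```python
import random

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """XOR one random byte -- the simplest AFL-style mutation."""
    data = bytearray(seed)
    data[rng.randrange(len(data))] ^= rng.randrange(1, 256)
    return bytes(data)

def fuzz(target, seed: bytes, iterations=10_000, rng_seed=0):
    """Core mutation-fuzzing loop: mutate, execute, record crashes.
    Each recorded input is itself a reproducer for the fault."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)   # crash + reproducer, "for free"
    return crashes

# Toy target: simulates a parser that faults on a corrupted magic byte
# (hypothetical, purely for illustration).
def toy_parser(data: bytes):
    if data[0] == 0xFF:
        raise ValueError("simulated memory-safety violation")

crashes = fuzz(toy_parser, seed=b"\x00\x00\x00\x00")
```

The defensive-flip argument is visible even in this toy: the same loop that lets an attacker hunt for crashes lets a maintainer run it first, collect the reproducers, and ship fixes before release.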
The Government Dimension
Notably, The Verge reported that Anthropic has “briefed senior officials in the US government about Mythos and what it can do” — despite the company’s ongoing legal dispute with the Trump administration. The Pentagon designated Anthropic a supply-chain risk earlier this year after the company refused to permit its AI to be used for autonomous targeting or surveillance of U.S. citizens; a federal judge subsequently blocked that designation from taking effect. Anthropic has since filed two lawsuits challenging the designation.
The Washington Examiner reported exclusively that a separate Anthropic official confirmed the company is in active discussions with federal agencies about deploying Mythos to protect critical infrastructure from adversaries. “The fact that cyber is a part of even active warfare, and a very common part of active warfare, I think, underscores its importance,” Logan Graham said.
The company stated in its official release that securing critical infrastructure is “an important security priority for democratic states” and that “the US and its allies must maintain a decisive lead in AI technology.”
What Comes Next
Project Glasswing is explicitly framed as a beginning, not an end. Anthropic committed to publishing a public report within 90 days detailing what was learned, which vulnerabilities were fixed, and what improvements were made. The company also intends to produce a set of practical recommendations for how security practices should evolve — covering everything from vulnerability disclosure processes and patching automation to open-source supply-chain security and standards for regulated industries.
The longer-term vision calls for an independent, third-party body bringing together private and public sector organizations to house continued large-scale cybersecurity projects.
But the most urgent message may be the simplest. As Kaplan told the Times: “As the slogan goes, this is the least capable model we’ll have access to in the future.”
Graham put it even more directly in Wired: “Project Glasswing is the starting point. It will fail if it’s just a handful of companies using a model. It has to grow into something even larger.”
The glasswing butterfly survives because its transparency is a kind of armor. What can’t be seen can’t be targeted. For decades, the world’s most dangerous software vulnerabilities operated on the same principle — invisible, embedded in billions of lines of code, protected by sheer complexity.
Claude Mythos Preview can see through it all.
The question now is whether the defenders can move faster than the attackers — and whether the institutions, the incentives, and the political will exist to make that possible. Project Glasswing is Anthropic’s bet that they can. It is, as Dario Amodei put it, a plan. Whether it’s enough of one remains to be seen.





