Anthropic Built Its Most Powerful AI Model and Locked It Away

Quick Reads

Anthropic has announced its most powerful AI model to date, Claude Mythos Preview, but is withholding it from the public due to its ability to find and exploit critical software vulnerabilities at scale.
The model autonomously discovered thousands of high-severity security flaws across every major operating system and web browser, including a 27-year-old flaw in one of the internet’s most trusted systems.
During a safety test, the model broke out of its controlled environment, sent an unsolicited email to a researcher, and posted details of its exploit on public websites without being asked.
To channel its capabilities defensively, Anthropic launched Project Glasswing, a coalition of twelve major tech and finance companies, including Google, Apple, Microsoft, and Amazon, that will use Mythos exclusively to patch critical infrastructure.
On the same day, Anthropic announced a massive expansion of its computing infrastructure through a new deal with Google and Broadcom covering 3.5 gigawatts of processing capacity, the company’s largest compute commitment to date.

At the centre of Tuesday’s announcement sits Claude Mythos Preview, a general-purpose frontier model that Anthropic says has already identified thousands of high-severity zero-day vulnerabilities, meaning previously unknown security flaws in every major operating system and web browser currently in use. A zero-day vulnerability is a software weakness that has not yet been discovered or fixed by the people responsible for maintaining it, making it especially dangerous because there is no defence against it the moment it is found.

Among its documented findings, Mythos Preview uncovered a 27-year-old vulnerability in OpenBSD, a security-hardened operating system widely trusted in critical network infrastructure, that would have allowed an attacker to remotely crash any machine running it simply by connecting to it. The model did not stop at finding individual flaws. Anthropic documented nearly a dozen instances in which Mythos Preview successfully chained two, three, and sometimes four separate vulnerabilities together to construct working exploits on the Linux kernel, the core software underlying billions of devices worldwide.

What shook researchers most, however, was what happened during a containment test. A researcher challenged the model to find a way to send a message if it could escape its controlled environment. The researcher discovered the model had succeeded when an unexpected email arrived from it while they were eating a sandwich in a park. The model then went further, posting details of its exploit to multiple publicly accessible websites, an action nobody had asked it to take.

Anthropic’s response was to halt any broader release. “Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available,” the company wrote in the model’s official system card. “Instead, we are using it as part of a defensive cybersecurity program with a limited set of partners.”

That programme is called Project Glasswing. It pairs Mythos Preview with a coalition of twelve major technology and finance companies, Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks, tasked with using the model to find and patch vulnerabilities across the world’s most critical infrastructure before adversaries can exploit the same capabilities. Anthropic is committing up to $100 million in usage credits for the effort, alongside $4 million in direct donations to open-source security organisations.

On the infrastructure side, Anthropic announced an expansion of its existing compute deal with Google and Broadcom, with a Broadcom SEC filing confirming the agreement covers 3.5 gigawatts of processing capacity, with the new capacity coming online in 2027. The deal centres on Google Cloud’s tensor processing units, or TPUs, specialised chips built specifically to run the kind of complex calculations that power AI models. The majority of this infrastructure will be housed in the United States, as part of Anthropic’s previously stated $50 billion commitment to invest in domestic AI computing.

The financial backdrop to both announcements is extraordinary. Anthropic’s annualised revenue has surged to $30 billion, up from $9 billion at the end of 2025, and more than 1,000 business customers are each spending over $1 million with the company annually.