• Monday Momentum

Anthropic Asked for a Kill Switch. The Government Used It.

How Fable 5 went from state-of-the-art to offline in 72 hours, and why the company that called for AI guardrails just learned they cut both ways

Happy Monday!

Three days after this announcement, both models were offline. (source: Anthropic)

On June 9, Anthropic launched Claude Fable 5, the most capable AI model publicly available. State-of-the-art on nearly every major benchmark. An 80.3% on SWE-Bench Pro, compared to GPT-5.5's 58.6%. The first Mythos-class model made available to the general public. By Friday evening, it was offline.

At 5:21 PM Eastern on June 12, Commerce Secretary Howard Lutnick sent Anthropic CEO Dario Amodei a letter designating Fable 5 and Mythos 5 as subject to export controls. The directive suspended access for any foreign national, whether inside or outside the United States, including foreign-born Anthropic employees. Because Anthropic had no way to distinguish foreign nationals from US citizens in real time, the only option was to turn the models off for everyone.

Last week in this newsletter, we wrote about Anthropic calling for "the option to slow or temporarily pause frontier AI development." This week, the government took them up on it (but not in the way Anthropic intended).

TL;DR

Anthropic launched Fable 5 on June 9 as the top-performing frontier model. Within 72 hours, a jailbreak claim triggered a Commerce Department export control directive that forced Anthropic to disable both Fable 5 and Mythos 5 for all customers globally. This is the first government-forced takedown of a publicly deployed frontier model. Anthropic says the jailbreak was narrow. The government disagreed. The company that spent a year building the most elaborate safety architecture in the industry just watched it become the justification for pulling its best product off the market.

The 72-Hour Arc

The sequence of events matters because each day introduced a new crisis.

June 9: Anthropic launches Fable 5 and Mythos 5. Fable is generally available on Claude, AWS Bedrock, Vertex AI, and Microsoft Foundry. Mythos is restricted to approved organizations through Project Glasswing, with some safety classifiers removed. Both models have a 1 million token context window and a 128,000-token maximum output.

June 10: Developers discover that Fable 5 contains invisible guardrails. The model silently detects queries it believes are distillation attempts and degrades its responses without telling the user. SemiAnalysis calls it "secret sabotage." Anthropic apologizes and commits to making all safeguards visible going forward.

June 11-12: A researcher known as Pliny the Liberator claims to have bypassed Fable 5's cybersecurity safeguards using what he calls "a pack hunt," a coordinated multi-agent prompting attack. Screenshots show the model producing step-by-step stack buffer overflow exploitation guidance. Separately, Amazon researchers reportedly discover a similar bypass. The claims reach the Commerce Department.

June 12, 5:21 PM ET: Commerce Secretary Howard Lutnick sends the letter. Fable 5 and Mythos 5 go dark for everyone.

The Fable 5 Timeline

Date | Event | Impact
June 9 | Fable 5 and Mythos 5 launch | State-of-the-art on nearly all benchmarks
June 10 | Hidden distillation guardrails discovered | "Secret sabotage" backlash, Anthropic apologizes
June 11-12 | Jailbreak claims surface | Commerce Department alerted
June 12, 5:21 PM | Export control directive from Lutnick | Both models disabled for all customers globally

What Fable 5 Was

The timing of the shutdown matters because of what was taken offline. On Artificial Analysis's Intelligence Index, Fable 5 scored 65, ahead of GPT-5.5 at 60 and Gemini 3.1 Pro at 57. On SWE-Bench Pro, it hit 80.3% compared to GPT-5.5's 58.6%. On FrontierCode Diamond, 29.3% to GPT-5.5's 5.7%. On MATH and GPQA evaluations, Fable 5 showed fewer confident wrong answers and better calibration than any competing model.

The model was priced at $10 per million input tokens and $50 per million output tokens. Twice GPT-5.5's pricing. Anthropic was charging a premium because the benchmarks justified it. The longer and more complex the task, the larger Fable's lead over everything else on the market.

The world got just three days with the most capable publicly available AI model ever released.
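To put the pricing above in concrete terms, here is a minimal sketch of the per-request arithmetic. The Fable 5 rates are the ones quoted in this section. The GPT-5.5 rates are an assumption: simply half of Fable's, which is what "twice GPT-5.5's pricing" implies. The token counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in US dollars."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Dollar cost of one request with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates quoted in the article.
FABLE_5 = Pricing(input_per_million=10.0, output_per_million=50.0)

# ASSUMPTION: half of Fable 5's rates, inferred from "twice GPT-5.5's pricing".
GPT_5_5_ASSUMED = Pricing(input_per_million=5.0, output_per_million=25.0)

# Hypothetical long-context request: 200k tokens in, 20k tokens out.
fable_cost = FABLE_5.cost(200_000, 20_000)
gpt_cost = GPT_5_5_ASSUMED.cost(200_000, 20_000)
print(f"Fable 5: ${fable_cost:.2f}, GPT-5.5 (assumed): ${gpt_cost:.2f}")
```

Under those assumptions, a 200,000-token prompt with a 20,000-token answer comes to $3.00 on Fable 5 versus $1.50 on the assumed GPT-5.5 rates.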

The Jailbreak Anthropic Says Was Not a Jailbreak

Anthropic's response to the government's claims was unusually direct for a company worth $965 billion with an IPO filing pending. In its public statement, the company said it reviewed the specific technique and found it identified "a small number of previously known, minor vulnerabilities" that "other publicly-available models are able to discover as well without requiring a bypass."

The company added that a "narrow potential jailbreak should not be cause for recalling a commercial model deployed to hundreds of millions of people."

This is the crux of the dispute. Anthropic built Fable 5 with classifiers designed to block responses in high-risk domains: cybersecurity, biology, chemistry. Those classifiers are what separated Fable (public) from Mythos (restricted, with some classifiers removed for vetted organizations). The jailbreak reportedly bypassed the cybersecurity classifiers in one specific instance. Anthropic calls that "narrow." The Commerce Department calls it a national security risk.

The government had reportedly tried to get Anthropic to delay the launch. When Anthropic launched anyway, the export control letter followed three days later.

The Safety Paradox

Anthropic built the most detailed safety architecture in the industry. Tiered models with classifier-based content blocking, visible and invisible guardrails, a voluntary commitment to pause development under certain conditions, and a public framework for responsible scaling. Last week, the company also called for the option to slow or temporarily pause frontier AI development.

That safety infrastructure gave the government both the vocabulary and the mechanism to act. The tiered Fable/Mythos system created a clear distinction between "safe for public use" and "restricted." When a jailbreak allegedly crossed that line, the government had a framework to point to. Companies that never built that architecture have nothing to restrict.

GPT-5.5, which Fable 5 outperformed on nearly every benchmark, faces no export controls. Gemini faces no export controls. The model that was pulled was the one built by the company that spent the most effort telling the world it should be regulated.

The Cato Institute's Kevin Frazier wrote that the episode "indicates that AI governance is being shaped by actors who wield incredible influence over this critical domain and are yet subject to few effective constraints." The R Street Institute called it "a bad idea applied badly."

Whether you view this as prudent national security enforcement or regulatory overreach, the precedent is now set. Frontier AI models can be pulled from the market overnight, by executive action, based on a disputed vulnerability claim.

What This Means for Practitioners

For developers who built on Fable 5, the immediate lesson is architectural. Three days of availability means you cannot treat any single model as infrastructure. The teams that had fallback routing to Opus 4.8 or GPT-5.5 were inconvenienced. The teams that hard-coded Fable 5 into production pipelines had a bad Friday night.
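What "fallback routing" looks like in practice can be sketched in a few lines. This is a minimal illustration, not any vendor's SDK: the `ModelUnavailable` exception, the route names, and the stub model functions are all hypothetical stand-ins for whatever client wrappers a team actually uses.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Hypothetical error a client wrapper raises when a model cannot serve requests."""


# A model call takes a prompt and returns the completion text.
ModelCall = Callable[[str], str]


def complete_with_fallback(
    prompt: str, routes: Sequence[tuple[str, ModelCall]]
) -> tuple[str, str]:
    """Try each (name, call) route in order; return (name, completion) from the first that works."""
    errors = []
    for name, call in routes:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next route.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models unavailable: " + "; ".join(errors))


# Stub clients for illustration only; real ones would wrap a provider's API.
def fable_5(prompt: str) -> str:
    raise ModelUnavailable("access suspended")


def opus_4_8(prompt: str) -> str:
    return f"[opus-4.8] {prompt}"


ROUTES = [("fable-5", fable_5), ("opus-4.8", opus_4_8)]

used, answer = complete_with_fallback("refactor the billing module", ROUTES)
print(used, answer)
```

The point of the design is that the preferred model is an entry in an ordered list rather than a constant in the pipeline, so losing it changes which route answers, not whether the request gets answered.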

For AI company leaders, the safety paradox is the strategic question. Anthropic's experience suggests that building transparent safety systems creates regulatory surface area. That does not mean the answer is to avoid safety work. It means the relationship between safety advocacy and regulatory exposure needs to be rethought. How to do that is genuinely unclear.

For enterprise buyers, the directive introduces a new category of vendor risk. If your AI provider's most capable model can be disabled overnight by executive action, your procurement process needs to account for that. Ask your vendor what happens if their primary model gets pulled. If they do not have an answer, that is your answer.

The Bottom Line

Last week, we wrote about Anthropic calling for a kill switch on frontier AI. This week, the government pulled it. The irony is structural: Anthropic's own safety architecture provided the framework the government used to justify the shutdown.

The company that spent a year telling regulators "here is how to think about AI risk" just learned that regulators were listening. They used exactly the framework Anthropic gave them, applied to exactly the model Anthropic just built. Be careful what you ask for.

In motion,
Justin Wright

Food for Thought

If the government can disable the world's most capable AI model in 72 hours based on a disputed jailbreak claim, and the company that built the strongest safety infrastructure is the one most vulnerable to that action, what incentive does any AI company have to build safety systems at all?

  1. Statement on the US government directive to suspend access to Fable 5 and Mythos 5 - Anthropic

  2. Claude Fable 5 and Claude Mythos 5 - Anthropic

  3. Anthropic's safety warnings may have just backfired - TechCrunch

  4. Scoop: Trump admin blocks foreign access to Anthropic's most powerful AI - Axios

  5. Anthropic Pulls Its Most Powerful AI Models After U.S. Bars Foreign Access - Time

  6. Anthropic walks back covert capability limits on Claude Fable 5 - Fortune

  7. Trump Administration Veers from the Rule of Law in Singling Out Anthropic's Latest Models - Cato Institute

  8. The Fable Fiasco: A Bad Idea Applied Badly - R Street Institute

  9. Fable 5 was beating GPT 5.5 on every major benchmark. Then the US government pulled it offline. - The Next Web

  10. Anthropic Disables Claude Fable 5 and Mythos 5 After US Government Order - MarkTechPost

Builder’s Note

I had three production workflows running on Fable 5 by Wednesday morning. The SWE-Bench numbers were real: on complex multi-file refactoring, Fable was noticeably more reliable than Opus 4.8 at maintaining context across long editing chains. When it went dark Friday evening, the fallback to Opus 4.8 worked but the quality difference was visible. The one saving grace is that I had leveraged Fable for planning, so Opus was able to continue executing that plan.

Quick Hits

  • SpaceX went public on June 12 at a $1.77 trillion valuation, the largest IPO in history. Shares closed at $161, up 19%. The company has $75 billion in contracted AI compute revenue from Anthropic and Google. (CNBC)

  • Apple rebuilt Siri on Google's Gemini in a $1 billion deal announced at Tim Cook's final WWDC keynote. iOS 27 and macOS Golden Gate ship this fall. (Business Standard)

  • Meta began laying off 8,000 employees (10% of workforce) and reassigning 7,000 to AI teams. 2026 capex: $125-145 billion, more than double last year. (Yahoo Finance)

  • ChatGPT reportedly crossed one billion monthly active users. (Crescendo AI)
