Monday Momentum
Posts
The AI That's Writing Itself

The AI That's Writing Itself

How 80% of Anthropic's code is now authored by Claude, and why the company accelerating recursive self-improvement is also the one calling for a kill switch

Justin Wright
June 08, 2026 • Est. Reading Time: 5 minutes

Happy Monday!

Code output per engineer at Anthropic was flat for four years. Then Claude started writing the code itself. (source: Anthropic)

Last week, Anthropic published a piece called "When AI Builds Itself." The headline finding: more than 80% of the code merged into Anthropic's production codebase in May 2026 was authored by Claude. Before Claude Code launched in February 2025, that number was in the low single digits. The typical Anthropic engineer now merges 8x as much code per day as they did in 2024. This isn’t because they are typing faster, it’s because Claude writes the code and the engineer directs and reviews.

In the same post, Anthropic called for the world to have "the option to slow or temporarily pause frontier AI development." The company building the AI that writes its own code is asking for a kill switch.

That tension is the story.

Anthropic revealed that Claude now authors over 80% of its production code, up from single digits 15 months ago. Engineers merge 8x more code per day. On complex open-ended problems, Claude's success rate hit 76%, up 50 points in six months. In the same post, Anthropic called for a conditional global pause on frontier AI development, days after confidentially filing for an IPO. The company accelerating recursive self-improvement faster than anyone is also the one saying we might need to stop.

TL;DR

The Numbers Inside Anthropic

The data in "When AI Builds Itself" comes from inside the company, not from public benchmarks. That makes it unusually revealing.

Claude's code quality was "somewhat worse" than human-written code at Anthropic in late 2025. It is at parity today, and Anthropic expects it to be "strictly better" within the year. An automated Claude reviewer, run retrospectively against Anthropic's entire codebase, would have caught roughly one-third of the bugs behind past incidents before they ever reached production. The engineers who wrote that code are among the best in the world. Claude is now catching mistakes they missed.

On open-ended engineering problems (no clear specification, uncertain solutions), Claude's success rate climbed to 76% in May 2026, up 50 percentage points in six months. In one example, a routine upgrade started crashing tens of thousands of training jobs. An engineer pointed Claude at the live incident with minimal context. Working through running jobs and testing environment settings, Claude isolated the obscure debugging flag causing the crash, reproduced it reliably, and confirmed a fix in about two hours. That would normally take two to three days.

Claude's Acceleration at Anthropic

Metric	Then	Now
Code authored by Claude	Low single digits (early 2025)	80%+ (May 2026)
Code merged per engineer per day	Baseline (2021-2024)	8x baseline (Q2 2026)
Claude code quality vs. human	"Somewhat worse" (late 2025)	At parity (today)
Open-ended problem success rate	~26% (Nov 2025)	76% (May 2026)
Training code speedup	~3x (May 2025)	~52x (April 2026)
Task duration capability	4 minutes (March 2024)	12 hours (March 2026)

The research numbers are equally striking. When given training code and asked to optimize it, Claude achieved a 52x speedup in April 2026, up from 3x a year earlier. A skilled human needs four to eight hours to reach 4x on the same task. In a separate experiment, Claude agents given an open AI safety research problem recovered 97% of the performance gap, compared to 23% by two human researchers over a week.

What the Engineers Are Saying

The most striking part of the report is not the data. It is the employee quotes.

"I started leaning hard into Claudifying about a year ago. It's now been five months since I last wrote any code myself."

"On days where everything works well, I can't help but think nothing I do matters, everything is automated and better and faster than I ever will be. But then there are days where everything breaks and I don't understand why and I realize I have no idea what I've been up to anymore."

"Work ran on a gift economy of small favors between humans. Claude is faster, it creates zero debt, but each of these is a lost bid for human collaboration."

These are Anthropic employees speaking with permission. They are describing the lived experience of working alongside an AI system that is rapidly making the mechanical parts of their job unnecessary. The psychological dimension, the loss of identity and collaboration that comes with automation even when you are the one benefiting from it, is something no benchmark captures.

The Pause That Is Not a Pause

Anthropic's call for a slowdown comes with conditions that make it functionally impossible to execute today. The company said it would pause only if "multiple well-resourced labs at or near the frontier, in multiple countries" agreed to stop under verifiable conditions. Training runs, they noted, are "far easier to conceal than missile silos."

The timing raised eyebrows. The post came days after Anthropic confidentially filed for an IPO and a week after the $965 billion valuation. Skeptics framed it as competitive positioning: if regulators impose a pause, Anthropic is ahead. A freeze at this moment locks in their lead.

But dismissing the call as purely strategic misses the substance. The duration of tasks Claude can reliably complete has been doubling every four months. In March 2024, Claude handled four-minute tasks. By March 2026 this ballooned to twelve-hour tasks. If the trend holds, days-long tasks come this year with weeks-long tasks in 2027. Anthropic's assessment: full recursive self-improvement is "not inevitable" but "could come sooner than most institutions are prepared for." When the company building the system tells you they are worried, the motivation matters less than the fact that they said it.

What This Means for Practitioners

For software engineers, this is the clearest preview of your job in twelve months. The role is shifting from writing code to directing and reviewing AI-written code. The engineers at Anthropic who have thrived stopped thinking of themselves as coders and started thinking of themselves as technical directors. The ones struggling derived their identity from the craft of writing code itself.

For engineering leaders, the 8x productivity number means your headcount assumptions are wrong. The question is no longer whether to adopt AI coding tools, but how fast you can reorganize workflows around them before your competitors do.

For anyone watching the industry, GitHub saw one billion commits in all of 2025. By mid-2026, it is seeing 275 million per week, on pace for 14 billion for the year. The infrastructure that the world's software runs on is straining under the weight of AI-generated code. That is not a future concern; it is happening now.

The Bottom Line

The most valuable AI startup on earth just told the world that its AI writes 80% of its code, that it expects AI-written code to surpass human quality within the year, and that the world might need the option to stop all in the same breath.

What is new this week is the evidence behind the proposal. An AI that writes 80% of the code at the company building it, matches human quality, and doubles task duration every four months is not a hypothetical. It is a real data point.

In motion,
Justin Wright

If the AI writing 80% of a company's code today is expected to write better code than humans within the year, at what point does the company exist to serve the AI's development rather than the other way around, and would we even notice the moment that line was crossed?

Food for Thought

When AI builds itself - Anthropic Institute
Anthropic warns AI may soon begin recursive self-improvement - Scientific American
Claude writes 80% of its code, calls for AI pause - The Next Web
Anthropic Wants a Global AI Pause If Everyone Else Does - PYMNTS

Builder’s Note

I have been using Claude Code daily for six months. The 80% number does not surprise me. On a good day, I write zero lines of code myself. I describe what I want, review what Claude produces, and redirect when it goes sideways. The shift from "I code" to "I direct code" happened gradually, then all at once. The employee quote about not knowing what you have been up to resonates more than I expected.

Quick Hits

Apple's WWDC keynote is tomorrow (June 8). Expect a rebuilt Siri powered by Google's Gemini, a standalone Siri app with a paid tier, and multi-model provider support across OpenAI, Google, and Anthropic. (TechCrunch)
The bipartisan CREATOR Act was introduced, giving visual artists the right to sue for commercial AI imitations of their style. First federal bill of its kind. Adobe is backing it. (Axios)
An OpenAI model independently disproved an 80-year-old conjecture in discrete geometry (Erdos unit distance), connecting branches of math no human had linked to the problem. External mathematicians verified the proof. (Scientific American)
GitHub saw 275 million code commits per week by mid-2026, on pace for 14 billion for the year. All of 2025 had roughly one billion. (Anthropic)

If you haven’t listened to my podcast Mostly Humans: An AI and business podcast for everyone yet, new episodes drop every week!

Episodes can be found below - please like, subscribe, and comment!

Spotify: https://creators.spotify.com/pod/profile/mostly-humans-podcast/
Apple: https://podcasts.apple.com/us/podcast/mostly-humans/id1831319729
YouTube: https://www.youtube.com/@Mostly_Humans_Podcast