The Vertical Integration Playbook

Why Sam Altman is worried and why Google's chip stack is the real story

Happy Monday!

Here's what nobody's saying about Google's AI comeback: Gemini 3's benchmarks are impressive, but the chips those models train on might be more impressive.

Last week, a leaked internal memo revealed Sam Altman warning OpenAI employees about "rough vibes" and "economic headwinds" from Google's resurgence. He admitted Google is "doing excellent work in every aspect," particularly in pre-training. That's a stunning concession from someone who spent 2023 watching Google embarrass itself with Bard's botched launch.

But here's what Altman's really worried about: OpenAI is burning billions renting compute from Microsoft, while Google manufactures its own chips at cost. For OpenAI, that's a structural disadvantage.

Google quietly built the only viable alternative to Nvidia's GPU monopoly and used it to train models that now lead nearly every major benchmark. Gemini 3 Pro scored 1501 on LMArena, crushing GPT-5.1. It roughly tripled the best competing score on ARC-AGI-2, jumping from the mid-teens to 45%. And it leads on reasoning, coding, multimodal understanding, and factual accuracy.

The technical achievement matters, but the economic model matters more.

TL;DR

Google went from AI laughingstock in early 2023 to industry leader in November 2025 by doubling down on pre-training fundamentals and building custom silicon to support it. While OpenAI projects $74B in operating losses by 2028 renting compute, Google owns Trillium TPUs, Ironwood TPUs, and Axion CPUs, giving it full-stack control from chip to model. The AI race isn't determined by who ships the best model this month; it's determined by who can afford to keep shipping. Custom silicon is the new dividing line: companies that own their compute stack (Google, Apple, Amazon) vs. those renting it (OpenAI, Anthropic, everyone else). Distribution compounds the advantage, with Gemini 3 deployed instantly to 2 billion AI Overviews users and 650 million Gemini app users.

The Pre-Training Comeback Nobody Saw Coming

In mid-2024, the consensus was clear: pre-training was exhausted. Gains from additional scale were shrinking. The future was post-training techniques, reasoning models, RLHF, and inference-time compute.

Google ignored the consensus.

While competitors pursued post-training optimizations, Google went back to first principles. They bet that pre-training still had room to run if you had the infrastructure to support it.

Gemini 3's success validates that bet. Sam Altman's memo explicitly credits Google's pre-training breakthroughs. "Google has been doing excellent work recently in every aspect," he wrote, singling out their pre-training methodology.

What changed? Google stopped treating pre-training as solved and started treating it as engineering-constrained. The bottleneck, in Google's view, was the hardware running it.

That's where the chip stack enters the story.

Vertical Integration: The Advantage Hiding in Plain Sight

Google's been building custom AI chips for a decade. TPU v1 launched in 2015, years before anyone thought specialized AI accelerators were necessary. The bet seemed questionable at the time. Nvidia GPUs were the standard. Why reinvent the wheel?

Ten years later, that "questionable bet" looks like the most important strategic decision in AI infrastructure.

Here's the current Google silicon stack:

Trillium (TPU v6): 4.7x peak compute performance per chip vs. v5e. Double the HBM capacity and bandwidth. Scales to 256 chips per pod and tens of thousands per supercomputer. 67% more energy-efficient than v5e. Generally available now.

Ironwood (TPU v7): 10x peak performance over v5p and 4x better per-chip performance than Trillium. Scales to 9,216 chips in a single superpod with double Trillium's performance-per-watt. Now generally available.

Axion CPUs: Google's first custom Arm-based processor. Up to 65% better price-performance than comparable x86 alternatives. Handles the general-purpose workloads (data prep, microservices, web serving) that complement AI training and inference.

When you own the chip design, you can optimize it specifically for your models. Google designed Trillium with Gemini's architecture in mind, doubling HBM capacity because they knew their models would need it. That's impossible when you're buying off-the-shelf GPUs designed for everyone's workloads.
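
To make that concrete, here's what hardware-aware model code looks like from the software side. This is a minimal illustrative sketch, not Google's actual training code: it assumes a TPU-backed runtime with JAX installed, and the matrix shapes are placeholders.

```python
# Minimal sketch: what "model code tuned to the chip" looks like in practice.
# Assumes a TPU VM with JAX installed; shapes and dtypes are illustrative.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU VM this lists TpuDevice entries

@jax.jit  # compile once via XLA, which targets the TPU's matrix units
def layer(x, w):
    # bfloat16 is the TPU's native matmul precision: half the memory
    # traffic of float32 with the same exponent range
    return jnp.dot(x.astype(jnp.bfloat16), w.astype(jnp.bfloat16))

x = jnp.ones((8, 4096))
w = jnp.ones((4096, 4096))
y = layer(x, w)
print(y.shape, y.dtype)
```

The coupling is the point: Google controls both the XLA compiler and the TPU's matrix units, so hardware choices like native bfloat16 support show up directly as training throughput.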

Cost matters even more. OpenAI pays Microsoft's markup on Azure compute, and Microsoft in turn pays Nvidia's markup on H100s. Google manufactures TPUs at cost. That margin advantage compounds over millions of training runs.
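
A toy calculation shows the shape of the problem. Every number below is a hypothetical placeholder (the real margins and chip-hour counts aren't public), but the structure holds for any stacked markup:

```python
# Back-of-the-envelope sketch of how a hardware margin compounds across
# training runs. All figures are hypothetical placeholders, not reported
# numbers from Google, Microsoft, Nvidia, or OpenAI.
BASE_COST_PER_CHIP_HOUR = 1.00   # hypothetical manufacturing cost, normalized
NVIDIA_MARKUP = 1.75             # assumed hardware-vendor margin
CLOUD_MARKUP = 1.30              # assumed cloud-provider margin
CHIP_HOURS_PER_RUN = 5_000_000   # assumed scale of one large training run

owned = BASE_COST_PER_CHIP_HOUR * CHIP_HOURS_PER_RUN
rented = BASE_COST_PER_CHIP_HOUR * NVIDIA_MARKUP * CLOUD_MARKUP * CHIP_HOURS_PER_RUN

print(f"owned-silicon run:  ${owned:,.0f}")
print(f"rented-compute run: ${rented:,.0f}")
print(f"premium per run:    {rented / owned:.2f}x")
# The per-run premium is constant, but the absolute dollar gap grows with
# every additional run, so the disadvantage compounds rather than washes out.
```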

And distribution? Google deployed Gemini 3 to 2 billion monthly AI Overviews users and 650 million Gemini app users instantly. No API partnerships. No developer evangelism. Just a flip of the switch for massive distribution.

Compare that to OpenAI's strategy: rent expensive compute, build models, convince developers to integrate via API, and hope enterprise adoption scales faster than burn rate.

The Economic Reality No One's Discussing

Sam Altman's memo wasn't motivated by benchmark anxiety. It was motivated by financial reality.

OpenAI projects $74 billion in operating losses by 2028. That's not a typo. Seventy-four billion dollars. Anthropic, by comparison, is on a conservative path to break even by 2028 with a focus on enterprise customers.

Meanwhile, Google's AI business is profitable. They manufacture chips, train models on owned infrastructure, and deploy instantly to products with billions of users. The economic model is categorically different.

Altman knows this. That's why the memo focuses on "staying focused through short-term competitive pressure" and betting on "very ambitious" long-term projects even if it means falling "temporarily behind in the current regime."

Translation: We can't match Google's cost structure, so we're betting everything on superintelligence arriving before we run out of runway. Maybe that works, but it's not a strategy you'd choose if you had alternatives.

Here's the other shoe: Google isn't stopping at Ironwood. TPUs are now on a rapid annual cadence, and each generation compounds the advantage. Trillium made pre-training breakthroughs economically viable. Ironwood makes them cheaper and faster. TPU v8 will make them faster still.

OpenAI doesn't control this variable. They're along for the ride on whatever Microsoft negotiates with Nvidia (or increasingly, on whatever Microsoft's own Maia chips can deliver). That's not a terrible position, but it's reactive instead of proactive.

And here's the kicker: Google's chip advantage is widening, not shrinking. The performance jump from v5e to Trillium was larger than previous generation-over-generation gains, and Ironwood's 10x improvement over v5p is unprecedented. Google's chip progress is accelerating.

What This Means If You're Building in AI

The vertical integration divide is the new fault line in AI.

Companies with custom silicon like Google, Apple, Amazon, and Meta can iterate faster, train cheaper, and deploy at scale without negotiating with intermediaries. Companies renting compute face structural cost disadvantages that worsen as models scale.

If you're building on OpenAI's API, you're exposed to their burn rate. If they need to raise prices to hit profitability targets, you're along for the ride. If they can't compete on cost with Google's integrated offering, your product's margin compresses.

If you're building on Google's stack (Vertex AI, Gemini APIs, TPU access), you're tapping into a cost structure that compounds in your favor as Google ships new chip generations.
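
For a sense of what building on that stack looks like, here's a minimal sketch of calling a Gemini model. It assumes the google-genai Python SDK and an API key; the model ID is a placeholder, so check the current docs for exact names.

```python
# Minimal sketch of calling a Gemini model through Google's stack.
# Assumes the google-genai SDK (pip install google-genai) and an API key.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-3-pro-preview",  # placeholder model ID; check current docs
    contents="Summarize the trade-offs of owning vs. renting AI compute.",
)
print(response.text)
```

The same SDK also supports a Vertex AI mode (genai.Client(vertexai=True, project=..., location=...)) for workloads running inside Google Cloud, which is where Google's own-silicon economics reach your bill.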

This doesn't mean OpenAI loses or Google wins permanently. It means the playing field isn't level. The companies that own their infrastructure can subsidize AI products with profits from other businesses while optimizing the entire stack. The pure-play AI companies can't.

The Bottom Line

Two years ago, Google issued a "code red" over ChatGPT. Bard's launch was a disaster. The company seemed hopelessly behind, paralyzed by bureaucracy while OpenAI moved fast and broke things.

Today, Gemini 3 leads every major benchmark. Google deploys to billions of users instantly. And Sam Altman is warning employees about rough vibes ahead. What changed was the realization that renting compute creates a structural ceiling.

Google spent a decade building TPUs when the ROI seemed questionable. That bet is now paying off in ways that go beyond benchmarks. It's about who can afford to keep training models as they scale to trillions of parameters. It's about who can iterate on chip design and model architecture simultaneously. It's about who can deploy instantly to captive distribution.

Vertical integration isn't sexy. Custom silicon takes years to develop. But in a capital-intensive arms race, owning the infrastructure beats renting it every time. Just as Apple's switch to its own silicon delivered massive leaps in personal computing performance, Google is now reaping similar gains from its own chips.

The AI wars aren't won by the best model this month. They're won by whoever can keep shipping the best model, every month, without going bankrupt.

Right now? That's Google.

In motion,
Justin Wright

Food for Thought

If pre-training breakthroughs require custom silicon that only a handful of companies can afford to build, and those same companies control distribution to billions of users, are we watching the AI industry consolidate before it even matures?

Sources
  1. Altman Memo Forecasts 'Rough Vibes' Due to Resurgent Google - The Information/The Decoder

  2. Google Announces Gemini 3 Surpassing OpenAI's GPT-5.1 Across Key AI Benchmarks - Neowin

  3. Gemini 3 vs. GPT-5.1 vs. Claude 4.5: Benchmarks Reveal Google's New AI Leads - Vertu

  4. Google Unveils Gemini 3 Claiming the Lead in Math, Science, Multimodal, and Agentic AI Benchmarks - VentureBeat

  5. Trillium Sixth-Generation TPU Is in Preview - Google Cloud Blog

  6. Ironwood TPUs and New Axion-Based VMs for Your AI Workloads - Google Cloud Blog

  7. TPU Transformation: A Look Back at 10 Years of Our AI-Specialized Chips - Google Cloud Blog

  8. Why Google Keeps Building Custom Silicon: The Story Behind Axion - Google Cloud Blog

  9. Google Announces Sixth-Generation AI Chip, a TPU Called Trillium - HPCwire

  10. AI Chips: Google's Ironwood TPU and New Axion Chips Are Now Generally Available - WinBuzzer

I am excited to officially announce the launch of my podcast Mostly Humans: An AI and business podcast for everyone!

Episodes can be found below - please like, subscribe, and comment!