The Great AI Reality Check

A16z's top 100 gen AI apps, and why Benedict Evans thinks AI is more like the iPhone than the Industrial Revolution

Happy Monday!

Last week, I explored how Meta's strategic pivot from talent wars to partnerships signals a new phase in AI development. But while we analyze corporate strategies, a more fundamental question emerges: after two and a half years of generative AI, what's actually working in the real world?

A16z's fifth edition of the Top 100 Gen AI Consumer Apps provides the answer, and it's not what the hype suggests. Only 14 companies have maintained top rankings for over two years, daily AI usage remains around 10% despite universal enthusiasm, and Chinese apps dominate categories where US companies were supposed to lead.

Former A16z partner Benedict Evans predicted this reality. In his framing, GenAI is a classic platform shift: core tech will feel commoditized, and real moats come from distribution, integration, UX, and where the money flows, not just who has the flashiest base model.

The data validates his framework. While everyone obsessed over model benchmarks, value quietly migrated to exactly where Evans said it would: applications, distribution, and integration rather than foundation models.

TL;DR

A16z's Top 100 Gen AI Apps data reveals the gap between AI hype and reality: only 14 companies sustained top rankings for 2+ years, daily usage remains low despite universal excitement, and value is accruing to distribution and applications rather than foundation models. This validates Benedict Evans' prediction that AI follows classic platform shift patterns where "commodity core, differentiated layers" dynamics determine winners, not model superiority.

The Ecosystem Reality Check

Two and a half years into the generative AI revolution, the data tells a sobering story. A16z's latest ranking shows the ecosystem is "starting to stabilize," with only 11 new entrants in the latest report compared to 17 in the previous edition. This points to maturation and consolidation rather than the explosive growth narrative we've been hearing.

Benedict Evans warned about exactly this pattern: early certainty that "this matters," paired with deep uncertainty about where value is captured. That uncertainty is now resolving, and the winners aren't who we expected.

The most telling statistic: only 14 companies have appeared in every A16z ranking over two years. These "All Stars" represent the true survivors of AI's consumer adoption reality. Even more revealing: of these 14 companies, only 5 have proprietary models, while 7 use API-available models from others, and 2 are model aggregators.

This validates Evans' core thesis that consumer LLMs feel interchangeable. In a blind test across leading models, most users couldn't tell which model had replied. In such cases, brand, distribution, and embedding in workflows matter more than the model itself.

The Meta Trend: From Model Wars to Distribution Wars

While the AI industry fixated on benchmark competitions and parameter counts, the real battle was happening at the distribution layer. A16z's data reveals this starkly: Google managed to place 4 products in the top 100 despite being perceived as "behind" in the AI race, while Meta's advanced models lifted Meta AI only to #46 on web, and it missed the mobile cut entirely.

Google's success validates Evans' prediction about a "default reset," where the risk isn't immediate margin compression so much as users reconsidering which box they type into. Google's Gemini captured the #2 position with 12% of ChatGPT's web visits, while AI Studio debuted in the top 10 and NotebookLM ranked #13 after going viral.

Meanwhile, X's Grok demonstrated the power of integrated distribution, jumping from a "cold start" with no app to 20M monthly active users. By comparison, Meta's supposedly superior models failed to gain traction in consumer markets. According to Evans, distribution matters far more than marginal model wins, so companies should invest in embedding AI where work already happens and leverage brand trust.

Pattern Recognition: Where Value Actually Accrues

Pattern #1: Application Layer Dominance

The A16z "All Stars" reveal where sustained value accumulates: image generation (Midjourney, Leonardo), productivity tools (Photoroom, Gamma, Quillbot), and specialized applications (Eleven Labs for voice, Veed for editing). These companies built defensible businesses by focusing on specific use cases rather than general chat interfaces.

Evans predicted this outcome. His advice: instead of just leveraging chat, ship opinionated workflows inside existing tools. This reduces cognitive load and turns weekly users into daily users.

Pattern #2: The Aggregator Advantage

Model aggregators like Poe and HuggingFace rank among the "All Stars" despite having no proprietary models. This validates Evans' insight that "commodity core, differentiated layers" dynamics would emerge: the models become infrastructure while platforms capture user value.

Pattern #3: Integration Over Innovation

The surprise winner category is "vibe coding" platforms like Lovable and Replit that integrate AI into development workflows. These platforms show "revenue retention upwards of 100% for several months post-signup", proving that workflow integration creates stickier value than standalone chat interfaces. Evans believes the key is to design for workflows where work is already happening.

Contrarian Take: The China Factor Reveals True Competition Dynamics

The most shocking revelation in A16z's data: 22 of the 50 apps on the mobile list were developed in China, and 7 Chinese companies rank in the web top 20. This reveals fundamental competitive advantages that Western companies missed.

Chinese video generation models have dominated not because of superior AI research, but because "there are fewer IP regulations (with likely training on copyrighted data)". Google's Veo 3 was "the first U.S. model to break this trend" by being "partially trained on YouTube data".

Evans also believes that tighter controls slow ecosystems and that advantage shifts to jurisdictions that “get out of the way.” While regulation is important, we must be wary not to stifle innovation in an industry that changes by the day.

The China success story validates that regulatory environment, data access, and go-to-market strategy matter more than pure model capabilities. Companies like Meitu placed 5 apps in the top 100 by focusing on specific user needs (photo editing) rather than trying to build general-purpose AI.

The Bigger Picture: Platform Economics in Action

A16z's data reveals classic platform economics playing out exactly as Evans predicted. The ecosystem is stabilizing around proven applications, newcomers face higher barriers to entry, and value is migrating to companies that control distribution and user workflows.

The Reality of User Adoption: Despite universal hype, daily active consumer use remains around 10%, with another slice using AI weekly and many having "tried and bounced." Evans noted that "blank-chat unstructured use is hard" while integrated features will drive mass adoption.

The Margin Migration: Value is flowing away from foundation models toward applications that solve specific problems. The companies with sustained success focus on "opinionated workflows" rather than general-purpose chat.

The Network Effects Mystery: Evans identified a crucial gap: unlike search/social/OS, there's no proven feedback loop where more users lead to a meaningfully better base model. This explains why foundation model companies struggle to build sustainable moats while application companies with clear user value propositions dominate the rankings.

The Distribution Reality: The contrast between Google's 4 products in the top 100 and Meta's 1 struggling product demonstrates that platform distribution advantages matter more than model quality. Evans’ point about the iPhone applies here: it remains the “best glowing rectangle,” and ecosystem integration can still anchor value.

GenAI is a classic platform shift: core tech will begin to feel commoditized while real moats come from distribution, integration, and user experience.

In motion,
Justin Wright

If only 14 companies have sustained top AI app rankings for over two years, and most successful AI companies use others' models rather than building their own, does this suggest the hundreds of AI startups building foundation models are solving the wrong problem?

Food for Thought
  1. AI stethoscope can detect three heart conditions in 15 seconds (British Heart Foundation)

  2. AI advancement shows promise for technology to assist immobilized individuals (EurekAlert!)

  3. Anthropic raises $13B Series F (Anthropic)

  4. OpenAI acquires Statsig and appoints new CTO of Applications (OpenAI)

  5. Google gets to keep Chrome (The Verge)

  6. Apple’s rumored AI search tool for Siri could rely on Google (The Verge)

I am excited to officially announce the launch of my podcast Mostly Humans: An AI and business podcast for everyone!
