Andrej Karpathy | Dwarkesh Patel | Why AI Progress Feels Slower Than Promised (And Why That's Actually Good News)

OpenAI founding member and former Tesla AI director Andrej Karpathy breaks down why AI agents need a decade, not a year. Learn what really works now and what doesn't.

The decade of agents, not the year. Here's what a leading AI researcher learned from building both self-driving cars and language models.

Who Is Andrej Karpathy?

If you've used AI to write code in the last year, you're using ideas that Andrej Karpathy helped pioneer.
He's the person who coined "vibe coding" - that style of working where you describe what you want in plain English and let AI generate the implementation. The same approach everyone from solo founders to Fortune 500 companies now uses daily.
If you didn't know, "vibe coding" was named Collins Dictionary's 2025 Word of the Year.
And it all began with a simple, offhand tweet from Karpathy in early 2025.
But that's just the most visible piece.
His actual resume:
  • Founding member of OpenAI (the lab behind GPT-2, the model that made people realize LLMs were real)
  • Led Tesla's self-driving AI team for 5 years (2017-2022)
  • Studied at the University of Toronto, where he took Geoffrey Hinton's courses (the godfather of deep learning), before a PhD at Stanford under Fei-Fei Li
  • Created CS231n, Stanford's legendary computer vision course
  • Built educational tools like micrograd and nanoGPT used by thousands of developers
He's also one of the rare people who's been in the trenches for both:
  • The deep learning revolution (training neural networks when it was still niche)
  • The LLM explosion (watching language models go from research toy to ubiquitous tool)
  • Real-world deployment (shipping AI that had to actually work, not just demo well)
Now he's building Eureka Labs - trying to solve education for the AI age.
Why listen to him?
Because he's watched AI predictions fail for 15 years. He's seen demos turn into decade-long slogs. He's lived through multiple "AI is about to change everything" moments that... didn't quite pan out on the predicted timeline.
And he has the scars, stories, and pattern recognition to know what's actually hard versus what just looks hard.
This isn't hype. This is someone who's been building the future, got burned by overoptimism, and came back with frameworks that actually map to reality.

The Demo-to-Product Gap

Karpathy spent years watching demos that looked like magic turn into products that barely worked.
In 2014, he rode in a Waymo that gave a perfect self-driving demo. Perfect. Zero interventions. He thought self-driving was basically done.
It wasn't even close.
Karpathy, who led Tesla's self-driving AI for five years, has a brutal truth about technology progress: Every "nine" of reliability takes the same amount of work.
  • 90% working = First demo (everyone gets excited)
  • 99% working = Useful product (some people adopt)
  • 99.9% working = Scalable solution (actually changes the world)
  • 99.99% working = Safety-critical system (self-driving, medical AI)
The problem? Each improvement feels identical in effort. Going from 90% to 99% takes as long as going from 99% to 99.9%.
This is why self-driving has taken 40 years and counting, not 4.
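A quick back-of-envelope makes the march visible: each nine is just another 10x cut in the failure rate, which is why the effort per nine stays flat.

```python
# Each "nine" of reliability is the same-sized step: a 10x cut in failures.
for reliability in (0.90, 0.99, 0.999, 0.9999):
    failure = 1 - reliability
    print(f"{reliability * 100:g}% working -> 1 failure in {round(1 / failure):,} tries")
```

Going from demo to deployment means buying those 10x cuts over and over, and each one costs roughly as much as the last.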

The Intelligence We're Actually Building

Here's where it gets weird.
We're not building animals. We're building ghosts.
Animals = Evolved through billions of years, hardware baked into DNA
AI = Trained on internet documents, mimicking human output
This distinction matters more than people realize.

What Animals Get For Free:

  • A zebra runs minutes after birth
  • Instincts encoded in DNA
  • Physical bodies that force real-world learning
  • Evolutionary pressure over millions of years

What AI Gets Instead:

  • Perfect memory of training data (actually a bug, not a feature)
  • Zero physical constraints
  • Ability to be copied infinitely
  • Training on human outputs, not human learning processes
Karpathy's insight: these are two completely different kinds of compression.
Evolution compresses 4 billion years of selection into roughly 3 gigabytes of DNA. Pretraining compresses 15 trillion internet tokens into billions of parameters. Both involve massive information loss, but they lose very different things.
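Some rough numbers put the two side by side. This is a back-of-envelope sketch, not figures from the episode: bytes per token, model size, and weight precision are all assumptions.

```python
# Back-of-envelope only: bytes_per_token, params, and bytes_per_param
# are illustrative assumptions, not figures from the episode.
tokens = 15e12           # pretraining tokens (the article's number)
bytes_per_token = 4      # assume ~4 characters ~ 4 bytes per token
params = 70e9            # assume a "billions of parameters" model
bytes_per_param = 2      # assume 16-bit weights

corpus_tb = tokens * bytes_per_token / 1e12
model_gb = params * bytes_per_param / 1e9
print(f"~{corpus_tb:.0f} TB of text -> ~{model_gb:.0f} GB of weights "
      f"(~{corpus_tb * 1000 / model_gb:.0f}x lossy compression)")

base_pairs = 3.1e9       # the human genome
print(f"DNA: ~{base_pairs / 1e9:.1f} GB at a byte per base "
      f"(~{base_pairs * 2 / 8 / 1e9:.1f} GB at 2 bits per base)")
```

Under these assumptions the model is a ~400x lossy squeeze of its corpus, while evolution squeezes billions of years of trial and error into a few gigabytes at most.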

Why Reinforcement Learning Is Terrible (But We Need It Anyway)

Imagine trying to solve a math problem for 10 minutes. You try hundreds of approaches. Finally, one works.
Current RL approach: Upweight every single thing you did in the winning attempt. Even the wrong turns. Even the dead ends. Even the parts where you got lucky.
"You're sucking supervision through a straw," Karpathy explains.
The problems:
  1. Extreme noise - You reward incorrect reasoning if it accidentally led to the right answer (see the toy sketch after this list)
  2. Sparse feedback - One number at the end for 10 minutes of work
  3. No human-like review - Humans analyze what worked and what didn't; AI just upweights everything
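Here's a toy sketch of that failure mode, with everything invented for illustration: a three-step episode where only the final action matters, trained with the naive upweight-the-whole-trajectory rule.

```python
import math
import random

random.seed(1)

# Toy outcome-only RL: a 3-step episode where ONLY the final action
# decides the reward, yet the naive update upweights every step taken
# in a winning episode. All names and numbers are illustrative.
STEPS, ACTIONS = 3, ["left", "right"]
prefs = {(s, a): 0.0 for s in range(STEPS) for a in ACTIONS}

def sample_action(step):
    weights = [math.exp(prefs[(step, a)]) for a in ACTIONS]
    return random.choices(ACTIONS, weights=weights)[0]

for episode in range(2000):
    trajectory = [(s, sample_action(s)) for s in range(STEPS)]
    reward = 1.0 if trajectory[-1][1] == "right" else 0.0  # sparse, end-only
    for step_action in trajectory:
        prefs[step_action] += 0.1 * reward  # upweight everything, noise included

for key in sorted(prefs):
    print(key, round(prefs[key], 2))
```

Step 2 correctly learns to prefer "right", but steps 0 and 1 drift toward whatever happened to co-occur with early wins - supervision sucked through a straw.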
Humans don't use reinforcement learning for intelligence tasks. We use it for motor skills - throwing a basketball, learning to balance. For cognitive work? We do something completely different.
The missing piece: Reflection and review.
Not just generating synthetic problems. Not just getting feedback. Actually thinking through what happened, reconciling it with existing knowledge, generating new understanding.
We have no good equivalent for this in AI yet.

The Cognitive Core vs. The Memory Problem

Here's a counterintuitive prediction from someone who's been in AI for 15 years:
The future of AI might be smaller, not bigger.
Current models:
  • Billions of parameters
  • Trained on 15 trillion tokens
  • 0.07 bits stored per token seen
  • Like a hazy recollection of the internet
What we actually want:
  • The algorithms for thinking (keep this)
  • Without all the memorized facts (delete this)
  • Maybe just 1 billion parameters total
Why? Because memory is holding AI back.
Models are too good at memorization. They recite passages verbatim. They rely on remembering rather than reasoning. They struggle when you ask them to go "off the data manifold" - to think about things not in their training set.
Better solution: Small cognitive core + ability to look things up + real learning mechanisms.
Like how humans work. You don't memorize every fact. You build thinking frameworks and look up details when needed.
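As a loose analogy (entirely made up, not from the episode), the separation looks like this: the procedure stays small and general, and the facts live outside it.

```python
# Toy analogy for "cognitive core + lookup": the reasoning procedure is
# tiny and general; the facts sit in an external store. All names and
# values here are hypothetical.
FACTS = {
    "boiling_point_water_c": 100.0,
    "freezing_point_water_c": 0.0,
}

def lookup(key):
    return FACTS[key]  # the "memory": fetched on demand, never memorized

def state_of_water(temp_c):
    # the "core": a small procedure that composes looked-up facts
    if temp_c >= lookup("boiling_point_water_c"):
        return "steam"
    if temp_c <= lookup("freezing_point_water_c"):
        return "ice"
    return "liquid"

print(state_of_water(120))  # steam
```

Strip the core down to the reasoning and let everything else be a lookup, and the parameter count can shrink dramatically.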

The March of Nines: Why Timelines Matter

Karpathy keeps saying "decade of agents, not year of agents."
This triggers people who want faster timelines. But he's seen this movie before.

Missing Pieces (Each Takes Years):

1. Continual Learning
  • Models can't remember what you tell them
  • Every conversation starts from scratch
  • No equivalent of human sleep/consolidation
2. Real Multimodality
  • Not just "can process images"
  • Actually understanding vision + language + action together
  • Computer use that actually works reliably
3. Context Management
  • Humans build up context during the day
  • Something magical happens during sleep (distillation into weights)
  • AI has no equivalent process yet
4. Culture and Collaboration
  • No AI equivalent of writing books for other AIs
  • No self-play for cognitive tasks (like AlphaGo had for Go)
  • No organizations or shared knowledge building
5. Actual Reasoning
  • Current "thinking" is still pattern matching
  • No second-order thinking about consequences
  • Struggling with anything requiring true novel insight

The Autonomy Slider (Not Binary Replacement)

Stop thinking "AI replaces humans" or "AI doesn't replace humans."
Start thinking: What percentage of this job can AI handle?

Call Center Work (Easy to Automate):

  • ✓ Repetitive tasks
  • ✓ Clear success metrics
  • ✓ Short time horizons (minutes)
  • ✓ Purely digital
  • ✓ Limited context needed
Result: 80% AI, 20% human oversight, humans manage teams of 5 AI agents

Software Engineering (Hard to Automate):

  • ✗ Novel codebases (never seen before)
  • ✗ Security-critical decisions
  • ✗ Long-term consequences
  • ✗ Integration with human teams
Result: Autocomplete works great, full automation fails

Radiologists (Complex Reality):

  • More complicated than expected
  • Not just "computer vision problem"
  • Messy job with patient interaction
  • Context-dependent decisions
  • Wages actually went up (bottleneck effect)
The pattern: AI handles volume, humans handle edge cases.
And if you're the human handling edge cases for a critical bottleneck? Your value skyrockets.

Why Coding Is Different (And Why That Matters)

If AGI is supposed to handle all knowledge work, why is it overwhelmingly just helping with coding?

Why Code Is The Perfect First Target:

1. Text-Native
  • Everything is already text
  • LLMs love text
  • No translation needed
2. Pre-Built Infrastructure
  • IDEs already exist
  • Diff tools show changes
  • Testing frameworks verify correctness
  • Version control tracks everything
3. Instant Verification
  • Code either runs or doesn't
  • Tests pass or fail
  • No ambiguity in feedback (the toy harness below makes this concrete)
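That last point is easy to make concrete. A toy harness (the candidate strings are invented examples) shows how binary the feedback is:

```python
# Toy verification loop: run AI-generated candidates against tests and
# keep only what passes. The candidate sources are invented examples.
candidates = [
    "def add(a, b): return a - b",   # confidently wrong
    "def add(a, b): return a + b",   # correct
]

def passes_tests(src):
    namespace = {}
    try:
        exec(src, namespace)          # define the candidate function
        assert namespace["add"](2, 3) == 5
        assert namespace["add"](-1, 1) == 0
        return True
    except Exception:
        return False

survivors = [src for src in candidates if passes_tests(src)]
print(f"{len(survivors)} of {len(candidates)} candidates pass")  # 1 of 2
```

No judgment calls, no taste required. Try getting that kind of signal on a slide deck.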

Why Other Domains Are Harder:

Slides/Presentations:
  • Visual + spatial reasoning
  • No diff tools
  • Subjective quality metrics
  • Context-dependent effectiveness
Writing/Content:
  • High entropy in valid outputs
  • Subjective quality
  • Requires genuine creativity
  • Easy to spot AI "collapse" (same patterns repeated)
Even Andy Matuschak, after trying 50+ approaches, couldn't get AI to write good spaced-repetition cards. Pure language in, language out. Should be perfect for LLMs. Still didn't work well enough.
The lesson: Even in purely linguistic domains, current AI struggles with nuanced creative work.

Model Collapse and The Entropy Problem

Here's a weird fact: If AI trains on too much of its own output, it gets dumber.
Ask ChatGPT to tell you a joke. It has like three jokes. It's "silently collapsed" - giving you a tiny slice of the possible joke space.

The Problem With Synthetic Data:

  • Any individual AI-generated example looks fine
  • But sample 10 times? They're eerily similar
  • Keep training on this? The model collapses further
  • Eventually: Degenerate "duh duh duh" outputs (the toy simulation below shows the mechanism)
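You can watch this happen in a few lines. A toy simulation (the "jokes" and all numbers are invented): start with a model that knows 100 equally likely jokes, then repeatedly retrain it on a finite sample of its own output.

```python
import math
import random
from collections import Counter

random.seed(0)

def entropy_bits(weights):
    total = sum(weights)
    return -sum(w / total * math.log2(w / total) for w in weights if w)

# 100 equally likely "jokes"; each generation, sample 200 outputs from
# the model and refit it to those samples (training on its own output).
jokes = list(range(100))
weights = [1.0] * 100
for generation in range(51):
    if generation % 10 == 0:
        print(f"generation {generation:2d}: entropy = {entropy_bits(weights):.2f} bits")
    sample = random.choices(jokes, weights=weights, k=200)
    counts = Counter(sample)
    weights = [counts.get(j, 0) for j in jokes]
```

Entropy only drifts one way. Rare jokes fall out of the sample, the refit model can never tell them again, and the distribution narrows generation after generation.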

Why Humans Don't Collapse (As Fast):

  • We maintain entropy through randomness
  • We interact with other humans (fresh perspectives)
  • We encounter genuinely novel situations
  • But even humans collapse over time (older = more rigid thinking)
The insight: Children are "not yet overfit." They haven't collapsed into adult patterns. This is why they seem creative - they haven't learned what "doesn't work" yet.
Maybe dreaming prevents overfitting. Maybe that's why we need genuine novelty and surprise in our lives.

The Education Revolution (Or: Why Humans Won't Become Obsolete)

Karpathy's building Eureka Labs with a specific vision:
"Starfleet Academy for the AI age."
Not because education matters less in an AI world. Because it matters infinitely more.

The Korean Tutor Insight

He was learning Korean three ways:
  1. Self-taught from internet materials
  2. Group class with 10 students
  3. One-on-one tutor
The one-on-one tutor was transcendent. Why?
  • Instantly understood his exact knowledge level
  • Served perfectly-calibrated challenges (not too hard, not too easy)
  • Probed to reveal gaps in understanding
  • Made him the only bottleneck to learning
"I felt like I was the only constraint. The information was perfect."
No AI can do this yet. Not even close.
But when AI can? When everyone has access to a perfect tutor for any subject?

Post-AGI Education

Pre-AGI: Education is useful (helps you make money)
Post-AGI: Education is fun (like going to the gym)
We don't need physical strength to manipulate objects - we have machines. But people still go to the gym. Why?
  • It feels good
  • You look better
  • It's psychologically satisfying
  • Evolutionary programming
Same with learning. Even if AI does all economically valuable work, humans will still want to learn.
Because learning, done properly, feels amazing.

The Geniuses Are Barely Scratching The Surface

Here's the optimistic take:
Current geniuses with access to the best resources are operating at maybe 10% of what a human brain can actually do.
Why so low?

Current Bottlenecks:

  • Bounce off material that's too hard
  • Get bored by material that's too easy
  • Can't find the right on-ramp to knowledge
  • Waste time searching instead of learning
  • Never get appropriate challenge level

With Perfect AI Tutors:

  • Always perfectly challenged
  • Never stuck, never bored
  • Optimal difficulty at all times
  • Learning becomes addictive (like gym)
  • Anyone can speak 5 languages, because why not?
The vision: Not Wall-E humans getting dumber. Superhuman humans with AI-augmented learning.
"I care about what happens to humans. I want humans to be well off in this future."

Why Timelines Are Longer Than You Think

Translation invariance in time: look back 10 years, then project the same kind of change forward.
2015:
  • Convolutional neural networks
  • ResNet just released
  • No transformers
  • No LLMs as we know them
2035 (predicted):
  • Still training giant neural networks
  • Still using gradient descent
  • But everything is bigger
  • And the details are different

What Consistently Improves Together:

  1. Algorithms - New architectures, better training methods
  2. Data - More, cleaner, better curated
  3. Compute - Faster chips, better kernels
  4. Systems - Software stack improvements
None dominates. All improve in parallel. All are necessary.
Karpathy reproduced Yann LeCun's 1989 digit recognition:
  • With 2022 algorithms alone: Halved the error rate
  • Needed 10x more data for further gains
  • Needed much more compute for more gains
  • Needed better regularization for more gains
Progress requires everything improving simultaneously.

The Intelligence Explosion Is Already Happening (You're Living In It)

Controversial take: There won't be a discrete "foom" moment.
We've been in an intelligence explosion for decades.
Look at GDP. It's an exponential that keeps going. You can't find computers in it. You can't find the internet in it. You can't find mobile phones in it.
Why? Because everything diffuses slowly. Even "revolutionary" technologies take decades to fully deploy.

The iPhone Didn't Change GDP

  • Released 2007
  • No App Store until mid-2008
  • Missing many features
  • Slow diffusion across society
  • Averaged into same exponential growth
AI will be the same.

Why There Won't Be A Discontinuity:

  • "AGI in a box" is a fantasy
  • Systems fail at unpredictable things
  • Gradual deployment, gradual learning
  • Society refactors around capabilities
  • Humans stay in the loop longer than expected
The pattern: Automation has been recursive self-improvement since the Industrial Revolution.
Compilers helped engineers write better compilers. Search engines helped engineers build better search engines. IDEs helped engineers build better IDEs.
AI-assisted AI research? Business as usual.
Just faster.

What Actually Works Right Now

Stop believing demos. Start shipping products.

Karpathy's nanochat Lessons:

nanochat is an 8,000-line repository showing the complete pipeline for building a ChatGPT-style model from scratch.
What AI was useless for:
  • Novel code architecture
  • Intellectually intense design
  • Understanding custom implementations
  • Avoiding deprecated APIs
What AI was great for:
  • Boilerplate code
  • Rust translation (from Python he understood)
  • Autocomplete for common patterns
  • Languages/paradigms he wasn't expert in

The Pattern:

  • High-bandwidth communication: Point to code, type 3 letters, get completion
  • Low-bandwidth communication: Type full English description, get bloated mess
  • Best use: Lower accessibility barriers to new languages/tools
  • Worst use: Replacing human architectural thinking
The sweet spot: Autocomplete is amazing. Vibe coding for novel work is still slop.

Three Frameworks For Thinking About AI Progress

1. First Principles on Intelligence

Don't start with "what does the brain do?"
Start with: "What can we actually build?"
We're not running evolution. We're running imitation learning on internet documents. This creates a different kind of intelligence.
Practical question: What works with our technology stack?
Not: What would be theoretically perfect?

2. The Pareto Principle Applied

Look for the first-order terms. What actually matters?
Micrograd: 100 lines of Python that captures ALL of neural network training. Everything else is efficiency. (A taste of it below.)
The transformer: Start with a lookup table (bigram). Add pieces only when you understand why you need them. Every addition solves a specific problem.
Education: Find the simplest demonstration that shows the core concept. Then build complexity.
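To see how little code "all of neural network training" really is, here's a micrograd-flavored sketch - a minimal homage, not Karpathy's actual file: a scalar value that records the compute graph and backpropagates through it.

```python
# A micrograd-style scalar autograd sketch: build the graph in the
# forward pass, apply the chain rule in reverse on backward().
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological sort, then chain rule from the output backwards
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# y = w*x + b, so dy/dw = x, dy/dx = w, dy/db = 1
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = w * x + b
y.backward()
print(w.grad, x.grad, b.grad)  # 2.0 -3.0 1.0
```

Everything past this - tensors, GPUs, attention - is the same idea made efficient.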

3. The March of Nines Framework

For any technology going from demo to product:
  • Identify current reliability (usually ~90%)
  • Each nine of reliability = constant work
  • Count how many nines you need (safety-critical = many)
  • Multiply: That's your timeline
Self-driving: Needed 5+ nines, got 3-4 over 5 years, still needs more
Coding assistants: Need 2-3 nines, mostly there for autocomplete
General agents: Need 4+ nines, currently at ~1.5 nines (numbers plugged into the sketch below)
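The framework's arithmetic fits in a few lines. The years-per-nine figure is a made-up constant for illustration; the article's claim is only that it stays roughly constant.

```python
import math

def nines(reliability):
    # 90% -> 1 nine, 99% -> 2 nines, 99.9% -> 3 nines, ...
    return -math.log10(1.0 - reliability)

def timeline_years(current_nines, needed_nines, years_per_nine=2.5):
    # years_per_nine is an illustrative assumption, not a measured figure
    return (needed_nines - current_nines) * years_per_nine

print(f"{nines(0.99):.2f} nines")                # 2.00 nines
print(f"{timeline_years(1.5, 4.0):.2f} years")   # 6.25 years
```

Plug in the article's numbers for general agents (~1.5 nines now, 4+ needed) and you land in decade territory, not next-year territory - and 4 nines is only the lower bound.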

The Actionable Takeaways

If You're Building With AI:

  1. Use autocomplete religiously - It's the highest signal-to-noise ratio
  2. Save vibe coding for boilerplate - Not novel architecture
  3. Test everything - AI makes confident mistakes
  4. Learn the core technology - Don't just prompt; understand
  5. Expect the march of nines - Demos are 10% of the journey

If You're Learning:

  1. Build things - Reading papers isn't understanding
  2. No copy-paste - Retype everything, reference only
  3. Teach others - Best way to find gaps in understanding
  4. Learn on-demand - Projects before theory
  5. Find the first-order terms - What's the simplest version?

If You're Planning:

  1. Think decades, not years - For fundamental capability improvements
  2. Expect gradual diffusion - Even revolutionary tech deploys slowly
  3. Plan for the autonomy slider - Not binary replacement
  4. Invest in bottleneck skills - Where you're irreplaceable
  5. Stay in the loop - Humans will be relevant longer than predicted

The Bottom Line

AI progress is real. It's just slower and weirder than the hype suggests.
We're building ghosts, not animals. We're training on outputs, not learning processes. We're getting incredible autocomplete, not artificial general intelligence.
The decade of agents, not the year.
And that's actually good news.
It means:
  • More time to adapt
  • More opportunities to learn
  • More ways to add value
  • More space for humans to stay relevant
The geniuses of today are barely scratching the surface of what a human mind can do.
With the right tools, the right education, the right frameworks?
We're just getting started.

One Last Thing: The Physics Insight

Karpathy says everyone should learn physics. Not for the formulas. For the cognitive tools.

What Physics Teaches:

  • Building models and abstractions
  • Understanding first-order vs. second-order effects
  • Approximating complex systems
  • Finding fundamental frequencies in noise
  • The "spherical cow" mindset
A physicist looks at a cow and sees a sphere.
Everyone laughs. But it's brilliant. Because for many problems, a cow IS approximately spherical.
Same with AI. Same with business. Same with life.
Find the first-order terms. Build from there. Add complexity only when needed.
That's how you learn. That's how you build. That's how you win.

Want to build something impossible? Start by making it trivial.
Then work on the march of nines.
Watch the full episode on the Dwarkesh Podcast.

Join The Wisdom Project

Get one new concept, idea, or framework every week to think better, live better, and make better sense of the world around us.

Written by Ayush
