Andrej Karpathy Keynote at YC AI Startup School

Overview

    Karpathy outlined three software evolution paradigms: Software 1.0 (traditional code), Software 2.0 (neural network weights), and Software 3.0 (LLM prompts in English)

    LLMs function like 1960s-era operating systems with expensive compute requiring cloud-based timesharing, but uniquely diffuse to consumers first rather than governments/corporations

    Partial autonomy applications like Cursor and Perplexity represent the most effective current approach: they combine traditional interfaces with LLM integration, custom GUIs for verification, and autonomy sliders for user control

    Vibe coding democratizes programming by enabling anyone to build software using natural language, though deployment infrastructure remains complex

    Building for AI agents requires LLM-friendly documentation formats, direct API access, and tools that make digital information easily ingestible

Software evolution paradigms

    Karpathy identified three distinct software paradigms: after roughly 70 years of relative stability under Software 1.0, two new paradigms have emerged in rapid succession

    Software 1.0 consists of traditional computer code written by humans to program computers

    Software 2.0 represents neural network weights where developers tune datasets and run optimizers rather than writing code directly

    Software 3.0 emerged with large language models where prompts written in English serve as programs that control LLMs

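    A minimal sketch of the contrast, using a hypothetical call_llm() as a stand-in for whichever LLM API is actually used: the Software 1.0 version encodes the rules by hand, while the Software 3.0 "program" is an English prompt

```python
# Software 1.0: a human writes the rules explicitly in code.
def is_positive_1_0(review: str) -> bool:
    positive = {"great", "excellent", "love", "amazing"}
    negative = {"bad", "terrible", "hate", "awful"}
    words = set(review.lower().split())
    return len(words & positive) >= len(words & negative)

# Software 3.0: the "program" is an English prompt; call_llm is a
# hypothetical placeholder, not a real library function.
PROMPT = ("Classify the sentiment of this review as 'positive' or "
          "'negative'. Reply with one word.\n\nReview: {review}")

def is_positive_3_0(review: str, call_llm) -> bool:
    return call_llm(PROMPT.format(review=review)).strip().lower() == "positive"
```
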
    Karpathy observed at Tesla that Software 2.0 (neural networks) progressively "ate through" the autopilot software stack, replacing C++ code with neural network capabilities

    The same pattern is occurring again with Software 3.0, where LLM-based solutions are replacing both traditional code and neural networks

    GitHub now contains significant amounts of English text interspersed with code, reflecting this paradigm shift

LLM ecosystem analogies

    Karpathy compared LLMs to utilities where labs spend capex to train models (like building electrical grids) and opex to serve intelligence via APIs with metered access

    LLMs also resemble semiconductor fabs due to massive capex requirements and centralized R&D secrets, though software's malleability makes them less defensible

    The strongest analogy positions LLMs as operating systems, with closed-source providers (like Windows/macOS) and open-source alternatives (the Llama ecosystem resembling Linux)

    LLMs function as new computers where the model serves as CPU, context windows act as memory, and the system orchestrates compute for problem-solving

    Current LLM computing resembles 1960s-era mainframes with expensive centralized compute requiring timesharing and thin client access over networks

    Personal LLM computing hasn't emerged yet due to economics, though Mac Minis show promise for batch inference workloads

    LLMs flip the usual technology diffusion pattern: consumers adopt first (e.g., for help with cooking) while governments and corporations lag behind, the opposite of how transformative technologies have historically spread

LLM psychology and limitations

    Karpathy described LLMs as "stochastic simulations of people" or "people spirits" created by autoregressive transformers trained on internet text

    LLMs possess encyclopedic knowledge and memory capabilities far exceeding individual humans, similar to the autistic savant character in Rain Man who could memorize phone books

    Key cognitive deficits include hallucination, poor self-knowledge, and "jagged intelligence" where they're superhuman in some domains but make basic errors humans wouldn't

    LLMs suffer from "anterograde amnesia": unlike human coworkers who learn organizational context over time, LLMs don't natively consolidate knowledge or develop expertise

    Context windows function as working memory that must be programmed directly, similar to protagonists in Memento and 50 First Dates whose memory resets daily

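    One concrete reading of "programming the working memory", as a sketch with invented names (none of this is from the talk): treat the context window like RAM with a fixed budget and decide explicitly what stays resident

```python
def fit_context(messages, budget_tokens, count_tokens):
    """Keep the newest messages that fit within the token budget,
    evicting the oldest first. count_tokens is a placeholder for a
    real tokenizer; all names here are illustrative."""
    kept, used = [], 0
    for msg in reversed(messages):          # newest first
        cost = count_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return kept[::-1]                       # restore chronological order
```
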
    Security limitations include gullibility, susceptibility to prompt injection, and potential data leakage

    Users must simultaneously leverage superhuman capabilities while working around significant cognitive deficits

Partial autonomy applications

    Karpathy advocated for partial autonomy apps over direct LLM interaction, using Cursor as the prime example for coding assistance

    Successful LLM apps share common features: extensive context management, orchestration of multiple LLM calls, application-specific GUIs for human verification, and autonomy sliders

    Cursor demonstrates the autonomy slider concept with tab completion (minimal autonomy), Command+K (chunk-level changes), Command+L (file-level changes), and Command+I (repo-level autonomy)

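    A sketch of the autonomy-slider idea with invented names (this is not Cursor's actual implementation): each level hands the model a larger scope, which also means more for the human to verify per step

```python
from enum import Enum

class Autonomy(Enum):
    TAB = "completion"   # suggest the next edit
    CHUNK = "selection"  # rewrite the selected block (Command+K-style)
    FILE = "file"        # rewrite one file (Command+L-style)
    REPO = "repo"        # multi-file change (Command+I-style)

def scope_for(level: Autonomy, editor) -> str:
    """Map the chosen autonomy level to how much context the model
    may rewrite. `editor` and its fields are hypothetical."""
    return {
        Autonomy.TAB: editor.current_line,
        Autonomy.CHUNK: editor.selection,
        Autonomy.FILE: editor.file_text,
        Autonomy.REPO: editor.repo_dump,
    }[level]
```
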
    Perplexity exemplifies similar patterns with quick search, research, and deep research modes representing different autonomy levels

    The human-AI collaboration pattern involves AI generation and human verification, requiring fast verification loops through visual GUIs rather than text-heavy interfaces

    Karpathy emphasized keeping "AI on the leash" to avoid overwhelming users with 1000-line code diffs that become verification bottlenecks

    Drawing from 5 years at Tesla working on autopilot, he noted that even after a perfect 30-minute Waymo demo in 2013, full driving autonomy remains unsolved 12 years later

    The Iron Man suit analogy illustrates the ideal balance: both an augmentation tool and an autonomous agent, with user-controlled autonomy levels

Human-AI collaboration patterns

    The optimal collaboration model positions humans as verifiers and AIs as generators, requiring extremely fast generation-verification loops

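    As a sketch (all names hypothetical), the loop looks something like this: the model generates, the human verifies, and feedback flows back until the result is accepted or the budget runs out

```python
def generate_verify_loop(task, call_llm, human_review, max_rounds=3):
    """call_llm and human_review are placeholders; human_review
    returns (accepted, notes). Keeping max_rounds and the task size
    small is what keeps each verification step fast."""
    notes = ""
    for _ in range(max_rounds):
        draft = call_llm(f"{task}\n\nReviewer feedback so far:\n{notes}")
        accepted, notes = human_review(draft)
        if accepted:
            return draft
    return None  # escalate to the human rather than ship unverified work
```
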
    Two key strategies accelerate this loop: speeding up verification through visual GUIs that leverage human computer vision capabilities, and keeping AI constrained to manageable chunks

    GUIs provide "highways to your brain" since visual processing is effortless compared to reading text, making verification faster and more enjoyable

    Successful prompting requires concrete, specific instructions, which raise the probability that verification succeeds and prevent wasted generation-verification cycles

    Karpathy's education work separates teacher course creation from student course delivery, using intermediate course artifacts to keep AI constrained to specific syllabi and progressions

    Best practices include working in small incremental chunks, focusing on single concrete tasks, and developing techniques to maintain AI focus and prevent "getting lost in the woods"

Vibe coding democratization

    Vibe coding enables anyone to program using natural language, eliminating the traditional 5-10 year learning curve for software development

    Karpathy's viral tweet about "programming computers in English" became a major meme and now has a Wikipedia page, though he didn't anticipate how popular it would become

    Kids vibe coding represents a "gateway drug to software development" and demonstrates the positive potential of democratized programming

    Karpathy successfully built iOS apps despite not knowing Swift, and created MenuGen (menugen.app) to generate restaurant menu images

    MenuGen provides $5 in free credits but operates as a "negative revenue app" due to high AI generation costs

    The key insight: vibe coding makes the actual coding trivial (hours), but deployment infrastructure remains complex (weeks of DevOps work)

    Traditional setup processes like Google login integration require extensive manual clicking and configuration that should be automated for AI agents

Building for AI agents

    AI agents represent a new category of digital information consumers: "people spirits on the Internet" that need software infrastructure designed for them

    llms.txt files (analogous to robots.txt) can describe a domain's content for LLMs in easily readable markdown

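    A minimal illustration of the format (the site name and URLs below are invented): the llms.txt proposal uses an H1 title, a short blockquote summary, and H2 sections listing markdown links

```markdown
# ExampleCo

> ExampleCo provides a JSON API for restaurant menu data.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): create an API key and make a first request
- [API reference](https://example.com/docs/api.md): endpoints, parameters, and error codes
```
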
    Companies like Vercel and Stripe are creating LLM-specific documentation in markdown, replacing human-oriented formatting such as bold text and images with plain, parseable text

    Documentation must eliminate "click" instructions and replace them with equivalent curl commands that LLM agents can execute

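    For example, a setup step written for humans ("click Settings, then Create webhook") might be paired with an equivalent command an agent can run directly; the endpoint and fields below are invented

```bash
curl -X POST https://api.example.com/v1/webhooks \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://myapp.example.com/hooks", "events": ["payment.succeeded"]}'
```
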
    Tools like Gitingest convert a GitHub repo into LLM-friendly concatenated text; users simply change the repo URL from github.com to gitingest.com

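    The URL substitution from the talk, as a one-line sketch:

```python
def to_gitingest(github_url: str) -> str:
    # e.g. https://github.com/karpathy/nanoGPT -> https://gitingest.com/karpathy/nanoGPT
    return github_url.replace("github.com", "gitingest.com", 1)
```
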
    DeepWiki goes further by having AI agents analyze repos and generate comprehensive documentation pages

    Anthropic's Model Context Protocol provides a standardized way for applications to communicate directly with AI agents

    While LLMs can potentially navigate traditional interfaces through clicking, meeting them halfway with agent-friendly formats remains more efficient and cost-effective

    The long tail of software that won't adapt to AI agents will require these conversion tools, but active platforms should build native agent support
