Andrej Karpathy Keynote at YC AI Startup School
Overview
Karpathy outlined three software evolution paradigms: Software 1.0 (traditional code), Software 2.0 (neural network weights), and Software 3.0 (LLM prompts in English)
LLMs function like 1960s-era operating systems: compute is expensive and centralized, so access works like cloud-based timesharing, yet they uniquely diffuse to consumers first rather than to governments and corporations
Partial autonomy applications like Cursor and Perplexity represent the most effective current approach: traditional interfaces combined with LLM integration, custom GUIs for fast verification, and autonomy sliders for user control
Vibe coding democratizes programming by enabling anyone to build software using natural language, though deployment infrastructure remains complex
Building for AI agents requires LLM-friendly documentation formats, direct API access, and tools that make digital information easily ingestible
Software evolution paradigms
Karpathy identified three distinct software paradigms that have emerged rapidly in recent years after 70 years of relative stability
Software 1.0 consists of traditional computer code written by humans to program computers
Software 2.0 represents neural network weights where developers tune datasets and run optimizers rather than writing code directly
Software 3.0 emerged with large language models, where prompts written in English serve as programs that control LLMs (the three paradigms are contrasted in the sketch at the end of this section)
Karpathy observed at Tesla that Software 2.0 (neural networks) progressively "ate through" the Autopilot software stack, replacing C++ code with neural network capabilities
The same pattern is occurring again with Software 3.0, where LLM-based solutions are replacing both traditional code and neural networks
GitHub now contains significant amounts of English text interspersed with code, reflecting this paradigm shift
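To make the contrast concrete, here is a minimal sketch (not from the talk) of one task, sentiment classification, expressed in each paradigm; the `llm` callable is a stand-in for any chat-completion API.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Software 1.0: the program is logic written by hand.
def sentiment_1_0(text: str) -> str:
    positive_words = {"great", "love", "excellent"}
    return "positive" if any(w in text.lower() for w in positive_words) else "negative"

# Software 2.0: the program is learned weights; we curate a dataset and run an optimizer.
texts = ["I love this", "terrible product", "works great", "broke instantly"]
labels = ["positive", "negative", "positive", "negative"]
vectorizer = CountVectorizer()
classifier = LogisticRegression().fit(vectorizer.fit_transform(texts), labels)

def sentiment_2_0(text: str) -> str:
    return classifier.predict(vectorizer.transform([text]))[0]

# Software 3.0: the program is an English prompt; `llm` stands in for any chat API.
def sentiment_3_0(text: str, llm) -> str:
    return llm(f"Classify the sentiment of this review as positive or negative: {text}")
```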
LLM ecosystem analogies
Karpathy compared LLMs to utilities where labs spend capex to train models (like building electrical grids) and opex to serve intelligence via APIs with metered access
LLMs also resemble semiconductor fabs due to massive capex requirements and centralized R&D secrets, though software's malleability makes them less defensible
The strongest analogy positions LLMs as operating systems, with closed-source providers (like Windows/macOS) and open-source alternatives (the Llama ecosystem resembling Linux)
LLMs function as new computers where the model serves as CPU, context windows act as memory, and the system orchestrates compute for problem-solving
Current LLM computing resembles 1960s-era mainframes with expensive centralized compute requiring timesharing and thin client access over networks
Personal LLM computing hasn't emerged yet for economic reasons, though machines like the Mac mini show promise for batch inference workloads
LLMs flip the usual technology diffusion pattern: consumers adopt them first (e.g., for everyday questions like cooking) while governments and corporations lag behind, the opposite of historical technology adoption
LLM psychology and limitations
Karpathy described LLMs as "stochastic simulations of people" or "people spirits" created by autoregressive transformers trained on internet text
LLMs possess encyclopedic knowledge and memory capabilities far exceeding individual humans, similar to the autistic savant character in Rain Man who could memorize phone books
Key cognitive deficits include hallucination, poor self-knowledge, and "jagged intelligence" where they're superhuman in some domains but make basic errors humans wouldn't
LLMs suffer from "anterograde amnesia": unlike human coworkers who accumulate organizational context over time, they don't natively consolidate knowledge or develop expertise
Context windows function as working memory that must be programmed directly, much as the protagonists of Memento and 50 First Dates must work from notes because nothing persists between memory resets (see the sketch after this list)
Security limitations include gullibility, susceptibility to prompt injection, and potential data leakage
Users must simultaneously leverage superhuman capabilities while working around significant cognitive deficits
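A minimal sketch of what "programming the working memory" can look like in practice: because the model consolidates nothing between calls, any context it needs must be re-packed into the prompt every time. The `llm` callable and the note format here are assumptions for illustration.

```python
# Work around "anterograde amnesia": persist knowledge outside the model and
# re-inject it into the context window (the working memory) on every call.
def answer_with_notes(question: str, notes: list[str], llm) -> str:
    recent = "\n".join(notes[-20:])  # keep only what fits in the context window
    prompt = (
        "Notes from earlier sessions (your only memory):\n"
        f"{recent}\n\n"
        f"Question: {question}"
    )
    answer = llm(prompt)
    # Consolidation happens here, outside the model, not in its weights.
    notes.append(f"Q: {question} -> A: {answer}")
    return answer
```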
Partial autonomy applications
Karpathy advocated for partial autonomy apps over direct LLM interaction, using Cursor as the prime example for coding assistance
Successful LLM apps share common features: extensive context management, orchestration of multiple LLM calls, application-specific GUIs for human verification, and autonomy sliders
Cursor demonstrates the autonomy slider concept with tab completion (minimal autonomy), Command+K (chunk-level changes), Command+L (file-level changes), and Command+I (repo-level autonomy)
Perplexity exemplifies similar patterns with quick search, research, and deep research modes representing different autonomy levels
The human-AI collaboration pattern involves AI generation and human verification, requiring fast verification loops through visual GUIs rather than text-heavy interfaces
Karpathy emphasized keeping the "AI on the leash" to avoid overwhelming users with 1,000-line code diffs that become verification bottlenecks (see the toy sketch after this list)
Drawing on five years working on Autopilot at Tesla, he noted that even after a flawless 30-minute Waymo demo ride in 2013, full driving autonomy remains unsolved 12 years later
The Iron Man suit analogy illustrates the ideal balance: part augmentation tool, part autonomous agent, with the autonomy level under the user's control
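A toy sketch of the autonomy-slider idea described above; the enum names, shortcut mapping, and thresholds are invented for illustration, not taken from Cursor.

```python
from enum import Enum

class Autonomy(Enum):
    COMPLETION = "tab"    # complete a few tokens
    CHUNK = "cmd+k"       # change a selected block
    FILE = "cmd+l"        # change one file
    REPO = "cmd+i"        # change across the repository

# Largest diff a human can still verify quickly at each level (illustrative numbers).
MAX_DIFF_LINES = {
    Autonomy.COMPLETION: 3,
    Autonomy.CHUNK: 40,
    Autonomy.FILE: 200,
    Autonomy.REPO: 1000,
}

def accept_proposal(diff_lines: int, level: Autonomy) -> bool:
    """Keep the AI on the leash: reject diffs too large for fast human review."""
    return diff_lines <= MAX_DIFF_LINES[level]
```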
Human-AI collaboration patterns
The optimal collaboration model positions humans as verifiers and AIs as generators, requiring extremely fast generation-verification loops
Two key strategies accelerate this loop: speeding up verification with visual GUIs that exploit the human visual system, and keeping the AI constrained to manageable chunks
GUIs provide "highways to your brain" since visual processing is effortless compared to reading text, making verification faster and more enjoyable
Successful prompting requires concrete, specific instructions, which raise the odds that verification succeeds on the first pass instead of spinning through failed generate-verify cycles
Karpathy's education work separates teacher course creation from student course delivery, using intermediate course artifacts to keep AI constrained to specific syllabi and progressions
Best practices include working in small incremental chunks, focusing on a single concrete task at a time, and developing techniques that keep the AI focused rather than "getting lost in the woods" (sketched below)
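A minimal sketch of the generation-verification loop under these practices: one small, concrete task per iteration, a fast verifier (a human reviewer or an automated test), and a bounded retry budget so the loop cannot spin. Both `llm` and `verify` are stand-ins.

```python
def generation_verification_loop(tasks, llm, verify, max_attempts=3):
    accepted = []
    for task in tasks:  # small incremental chunks, one concrete task at a time
        for _ in range(max_attempts):  # bounded retries prevent spinning
            candidate = llm(f"Do exactly this one task, nothing else: {task}")
            if verify(candidate):  # fast verification: human review or tests
                accepted.append(candidate)
                break
    return accepted
```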
Vibe coding democratization
Vibe coding enables anyone to program using natural language, eliminating the traditional 5-10 year learning curve for software development
Karpathy's viral tweet coining "vibe coding" became a major meme and the term now has a Wikipedia page, though he couldn't have predicted its popularity
Kids vibe coding represents a "gateway drug to software development" and demonstrates the positive potential of democratized programming
Karpathy successfully built iOS apps despite not knowing Swift, and created MenuGen (menugen.app) to generate restaurant menu images
MenuGen provides $5 in free credits but operates as a "negative revenue app" due to high AI generation costs
The key insight: vibe coding makes the actual coding trivial (hours), but deployment infrastructure remains complex (weeks of DevOps work)
Traditional setup processes like Google login integration require extensive manual clicking and configuration that should be automated for AI agents
Building for AI agents
AI agents represent a new category of digital information consumers: "people spirits on the Internet" that need software infrastructure designed for them
llms.txt files (analogous to robots.txt) can describe a domain's content for LLMs in easily readable markdown (see the sample after this list)
Companies like Vercel and Stripe are creating LLM-specific documentation in markdown, replacing human-oriented formatting such as bold text and images
Documentation must eliminate "click" instructions and replace them with equivalent curl commands that LLM agents can execute
Tools like Gitingest convert GitHub repos into LLM-friendly concatenated text; changing a URL from github.com to gitingest.com returns the whole repo as one text file
DeepWiki goes further by having AI agents analyze repos and generate comprehensive documentation pages
Anthropic's Model Context Protocol provides a standardized way for applications to communicate directly with AI agents
While LLMs can potentially navigate traditional interfaces through clicking, meeting them halfway with agent-friendly formats remains more efficient and cost-effective
The long tail of software that won't adapt to AI agents will require these conversion tools, but active platforms should build native agent support
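For concreteness, here is a hypothetical llms.txt for a small site, following the proposed convention of a title, a one-line summary, and markdown links to LLM-readable pages; the paths and descriptions are invented for illustration:

```
# MenuGen

> MenuGen turns a photographed restaurant menu into generated images of each dish.

## Docs

- [Quickstart](https://menugen.app/docs/quickstart.md): upload a menu and get images back
- [API reference](https://menugen.app/docs/api.md): endpoints for programmatic access
```

In the same spirit, documentation aimed at agents would replace "click Settings, then API keys" with a runnable equivalent such as `curl -H "Authorization: Bearer $API_KEY" https://api.example.com/v1/keys` (a hypothetical endpoint).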