Software is changing again: Karpathy on LLMs, agents, and building for the AI era
Executive overview
Software is undergoing its second fundamental shift in a few years. Three distinct paradigms now coexist: hand-written code (1.0), neural network weights (2.0), and natural-language prompts (3.0). LLMs are best understood not as utilities or fabs but as operating systems — complex software ecosystems in their 1960s era, currently available via time-sharing.
The core opportunity is building partial autonomy apps: products where humans stay in the loop, GUIs make AI output auditable, and an autonomy slider lets teams dial up automation incrementally. Agents are a decade-long project, not a 2025 event.
The generation–verification loop is the bottleneck; everything should be built to make it faster.
The three software paradigms
- Software 1.0: explicit code written by humans to program computers
- Software 2.0: neural network weights shaped by data and optimisers; Hugging Face is its GitHub
- Software 3.0: natural-language prompts that program LLMs; English is now a programming language
- Each paradigm eats into the previous one: at Tesla Autopilot, C++ code was deleted as the neural nets grew
- Fluency across all three is increasingly necessary; the right paradigm depends on the task
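One way to see the difference is to express a single task in each paradigm. The sketch below uses sentiment classification, a task chosen here purely for illustration. The word lists, the counting-based `fit_weights` stand-in for a real optimiser, and the prompt wording are all assumptions; no LLM is actually called.

```python
# Software 1.0: a human writes the decision rules explicitly.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"bad", "hate", "terrible"}


def label(score: float) -> str:
    """Map a numeric score onto a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_1_0(text: str) -> str:
    words = text.lower().split()
    return label(sum((w in POSITIVE) - (w in NEGATIVE) for w in words))


# Software 2.0: behaviour lives in weights shaped by data.
# This toy "training" just sums +1/-1 targets per word; a real system
# would fit neural network weights with an optimiser.
def fit_weights(examples: list[tuple[str, int]]) -> dict[str, float]:
    weights: dict[str, float] = {}
    for text, target in examples:
        for w in text.lower().split():
            weights[w] = weights.get(w, 0.0) + target
    return weights


def classify_2_0(text: str, weights: dict[str, float]) -> str:
    return label(sum(weights.get(w, 0.0) for w in text.lower().split()))


# Software 3.0: the program is an English prompt handed to an LLM.
def build_prompt_3_0(text: str) -> str:
    return (
        "Classify the sentiment of the review as positive, negative, or neutral.\n"
        f"Review: {text}\n"
        "Sentiment:"
    )
```

In the first version the logic is in the code, in the second it is in `weights`, and in the third it is in the prompt string, which is why the right paradigm depends on how the task is best specified.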
LLMs as operating systems
- LLM labs spend large capex to train models (building the grid) and opex to serve them through metered access: utility-like behaviour
- But LLMs are not simple commodities; they are complex, expanding software ecosystems
- Closer analogy: operating systems — a few closed-source providers (like Windows/macOS) and an open-source alternative (Llama ≈ Linux)
- Context windows are working memory; the LLM orchestrates memory and compute, like an OS orchestrating a CPU
- We are in the 1960s of this new computing era: centralised, time-shared, no personal computing revolution yet
- LLM outages cause intelligence brownouts — already critical infrastructure
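The "context window as working memory" analogy can be made concrete with a bounded buffer. The sketch below is an illustration only: the `ContextWindow` class is hypothetical, tokens are approximated by whitespace-separated words, and evicting the oldest message first is just one simple policy an application might choose.

```python
from collections import deque


class ContextWindow:
    """Bounded working memory: the oldest messages are evicted first."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages: deque[str] = deque()
        self.used = 0

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude assumption: one whitespace-separated word is one token.
        return len(text.split())

    def add(self, message: str) -> None:
        self.messages.append(message)
        self.used += self.count_tokens(message)
        # Evict from the front until the budget fits, always keeping
        # the newest message.
        while self.used > self.max_tokens and len(self.messages) > 1:
            evicted = self.messages.popleft()
            self.used -= self.count_tokens(evicted)

    def render(self) -> str:
        """Return the text that would be sent to the model."""
        return "\n".join(self.messages)
```

Anything evicted from the buffer is simply gone from the model's point of view, which is the OS-style memory management the application layer has to do on the model's behalf.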
The psychology of LLMs
- LLMs are stochastic simulations of people — autoregressive transformers trained on human text
- Superpowers: encyclopedic memory, cross-domain knowledge retrieval
- Jagged intelligence: superhuman in some domains, yet they can insist that 9.11 > 9.9
- Anterograde amnesia: context windows are wiped, so LLMs don't accumulate organisational knowledge over time
- Gullible: susceptible to prompt injection, data leakage, manipulation
- Program around the deficits; don't pretend they don't exist
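Programming around a deficit often means handing the fragile step to deterministic code. As a minimal sketch, the decimal comparison mentioned above can be checked with Python's standard `decimal` module instead of trusting a model's answer. The function names are hypothetical, and the "claim" is passed in as a plain string rather than coming from a real LLM call.

```python
from decimal import Decimal


def larger_decimal(a: str, b: str) -> str:
    """Return whichever numeric string is larger, compared exactly."""
    return a if Decimal(a) > Decimal(b) else b


def claim_is_correct(claimed_larger: str, a: str, b: str) -> bool:
    """Verify a model's claim about which of two numbers is larger."""
    return Decimal(claimed_larger) == max(Decimal(a), Decimal(b))
```

For example, `claim_is_correct("9.11", "9.11", "9.9")` returns `False`, so an application could reject or retry that answer instead of showing it to the user.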
Partial autonomy apps
- Cursor is the template: custom GUI, orchestrated LLM calls, context management, autonomy slider (autocomplete → full-repo agent)
- Perplexity follows the same pattern: auditable sources, and an autonomy slider ranging from quick search to deep research
- The generation–verification loop must be made as fast as possible
- GUIs accelerate verification by using human visual processing rather than reading text
- Keep the AI on a short leash: small, concrete prompts; incremental diffs; avoid 1,000-line changes
- Self-driving cars took 12 years to go from a perfect demo to near-solved; agents will take a similar amount of time
- Build Iron Man suits (augmentation with optional autonomy), not autonomous robots
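The autonomy slider and the short leash can be combined into a simple gate that runs before a proposed change reaches a human reviewer. The sketch below is hypothetical throughout: the two intermediate levels, the per-level line caps, and the `review_gate` function are illustrative assumptions, not how Cursor or any other product works.

```python
from dataclasses import dataclass
from enum import IntEnum


class Autonomy(IntEnum):
    # End points follow the slider described above; the two middle
    # levels are assumed for illustration.
    AUTOCOMPLETE = 1
    EDIT_SELECTION = 2
    EDIT_FILE = 3
    EDIT_REPO = 4


# Hypothetical caps on changed lines per level, to keep diffs reviewable.
MAX_CHANGED_LINES = {
    Autonomy.AUTOCOMPLETE: 5,
    Autonomy.EDIT_SELECTION: 30,
    Autonomy.EDIT_FILE: 150,
    Autonomy.EDIT_REPO: 400,
}


@dataclass
class ProposedChange:
    files: dict[str, int]  # path -> number of changed lines


def review_gate(change: ProposedChange, level: Autonomy) -> str:
    """Decide whether a proposed change is small enough to show a human."""
    total = sum(change.files.values())
    if level < Autonomy.EDIT_REPO and len(change.files) > 1:
        return "reject: touches multiple files at this autonomy level"
    if total > MAX_CHANGED_LINES[level]:
        return f"reject: {total} changed lines exceeds the cap for this level"
    return "show diff to human"
```

A 1,000-line change is rejected even at the highest setting here, which reflects the design choice that the human verification step, not generation, sets the pace.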
Building for agents as a new class of user
- Agents are a new consumer of digital infrastructure — human-like but not human, not traditional APIs
- llm.txt: simple markdown on your domain telling LLMs what the site is about; far more reliable than HTML parsing
- Stripe and Vercel are early movers: publishing docs in markdown, replacing every "click" instruction with the equivalent curl command
- Agents cannot click; every instruction requiring a click is a dead end for LLMs today
- Tools like GitIngest concatenate repos into LLM-readable text; Deep Wiki adds AI-generated analysis on top
- Meeting LLMs halfway is still worth doing even as browser-use agents improve — it is cheaper and more reliable
- Model Context Protocol (Anthropic) provides a standard interface for agents to interact with services
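As a rough illustration of meeting LLMs halfway, a site could generate its markdown description from data it already has. The sketch below is an assumption-heavy example: the `render_llm_txt` function, the layout (title, one-line summary, list of links), and the example.com URLs are all hypothetical and not taken from any published specification.

```python
def render_llm_txt(site_name: str, summary: str,
                   pages: list[tuple[str, str, str]]) -> str:
    """Render a plain-markdown site description for LLM consumption.

    `pages` holds (title, url, description) tuples. The layout is an
    illustrative assumption, not a formal standard.
    """
    lines = [f"# {site_name}", "", summary, ""]
    if pages:
        lines.append("## Pages")
        for title, url, description in pages:
            lines.append(f"- [{title}]({url}): {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_llm_txt(
        "Example Payments",
        "API for accepting payments; all docs are available as markdown.",
        [
            ("Quickstart", "https://example.com/docs/quickstart.md",
             "Create a test charge with a single curl command"),
            ("API reference", "https://example.com/docs/api.md",
             "Endpoints, parameters, and error codes"),
        ],
    ))
```

The page descriptions are a natural place to point at command-line equivalents of anything that would otherwise require clicking through a dashboard.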
Vibe coding and democratising software
- Natural language as the programming interface means everyone is now a potential programmer
- Karpathy built a working iOS app in Swift without knowing Swift; a functional menu-image generator in hours
- The hard part is not the code — it is DevOps, auth, payments, deployment: still click-heavy and human-oriented
- Vibe coding is a gateway drug to software development, not a replacement for engineering
- Making documentation LLM-readable (e.g. feeding the full Manim docs to an LLM) unlocks out-of-the-box use