Software is changing again: Karpathy on LLMs, agents, and building for the AI era

Executive overview

Software is undergoing its second fundamental shift in a few years. Three distinct paradigms now coexist: hand-written code (1.0), neural network weights (2.0), and natural-language prompts (3.0). LLMs share traits with utilities and fabs, but the closest analogy is the operating system: they are complex software ecosystems in their 1960s era, currently accessed via time-sharing.

The core opportunity is building partial autonomy apps: products where humans stay in the loop, GUIs make AI output auditable, and an autonomy slider lets teams dial up automation incrementally. Agents are a decade-long project, not a 2025 event.

The generation–verification loop is the bottleneck; everything should be built to make it faster.

The three software paradigms

Software 1.0: explicit code written by humans to program computers
Software 2.0: neural network weights shaped by data and optimisers; Hugging Face is its GitHub
Software 3.0: natural-language prompts that program LLMs; English is now a programming language
Each paradigm eats into the previous one: at Tesla Autopilot, C++ code was deleted as the neural nets took over more of the stack
Fluency across all three is increasingly necessary; the right paradigm depends on the task
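
To make the contrast concrete, here is a minimal sketch of one task, sentiment classification, written in each paradigm. The word lists, training sentences, and function names are invented for illustration; the 2.0 version is a toy perceptron rather than a real neural network, and the 3.0 version only builds the prompt, since which model to send it to is left open.

```python
from collections import Counter

# Software 1.0: a human writes the decision rule explicitly.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"bad", "hate", "terrible"}

def classify_v1(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"

# Software 2.0: the "program" is a set of weights fitted to labelled data.
# Made-up training set; labels are +1 (positive) or -1 (negative).
TRAINING_EXAMPLES = [
    ("i love this phone", 1),
    ("terrible battery and bad screen", -1),
    ("excellent camera", 1),
    ("i hate the keyboard", -1),
]

def train_v2(examples: list[tuple[str, int]], epochs: int = 10) -> Counter:
    weights: Counter = Counter()
    for _ in range(epochs):
        for text, label in examples:
            words = text.lower().split()
            prediction = 1 if sum(weights[w] for w in words) >= 0 else -1
            if prediction != label:  # perceptron rule: update on mistakes only
                for w in words:
                    weights[w] += label
    return weights

def classify_v2(weights: Counter, text: str) -> str:
    score = sum(weights[w] for w in text.lower().split())
    return "positive" if score >= 0 else "negative"

# Software 3.0: the "program" is an English prompt handed to an LLM.
def build_prompt_v3(text: str) -> str:
    return (
        "Classify the sentiment of the review below as 'positive' or 'negative'. "
        "Reply with one word.\n\n"
        f"Review: {text}"
    )
```

The same behaviour is specified three ways: as rules, as data plus an optimiser, and as an instruction in English. Which one fits depends on how much labelled data exists and how tightly the behaviour must be pinned down.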

LLMs as operating systems

LLM labs spend large capex to train models (building the grid) and sell metered access as opex: utility-like behaviour
But LLMs are not simple commodities; they are complex, expanding software ecosystems
Closer analogy: operating systems — a few closed-source providers (like Windows/macOS) and an open-source alternative (Llama ≈ Linux)
The context window is working memory; the LLM acts like the CPU, orchestrating memory and compute to solve problems
We are in the 1960s of this new computing era: centralised, time-shared, no personal computing revolution yet
LLM outages cause "intelligence brownouts": the models are already critical infrastructure
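
The working-memory analogy can be sketched in a few lines. This is an illustration only, not how any provider manages context: the class name, the eviction policy (oldest message first), and the use of whitespace-separated words as a stand-in for tokens are all assumptions.

```python
from collections import deque

def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words stand in for tokens.
    # Real tokenisers count differently.
    return len(text.split())

class ContextWindow:
    """A fixed-budget working memory that evicts the oldest messages first."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages = deque()  # oldest message on the left
        self.used = 0

    def add(self, message: str) -> None:
        self.messages.append(message)
        self.used += rough_token_count(message)
        # Evict from the front until the budget is respected,
        # but always keep the newest message.
        while self.used > self.max_tokens and len(self.messages) > 1:
            evicted = self.messages.popleft()
            self.used -= rough_token_count(evicted)

    def render(self) -> str:
        return "\n".join(self.messages)
```

Anything evicted is simply gone, which is why deciding what goes into the window is a design problem in its own right.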

The psychology of LLMs

LLMs are stochastic simulations of people — autoregressive transformers trained on human text
Superpowers: encyclopedic memory, cross-domain knowledge retrieval
Jagged intelligence: superhuman in some domains, yet capable of insisting that 9.11 > 9.9
Anterograde amnesia: the context window is wiped between sessions, so a model does not accumulate organisational knowledge over time the way a colleague would
Gullible: susceptible to prompt injection, data leakage, manipulation
Program around the deficits; don't pretend they don't exist
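
One way to program around a deficit is to hand exact work back to ordinary code. The sketch below checks a model's answer to "which number is larger?" against an exact decimal comparison; the function names are hypothetical and the model's answer is assumed to arrive as a plain numeric string.

```python
from decimal import Decimal

def larger_of(a: str, b: str) -> str:
    """Compare two decimal strings exactly and return the larger one."""
    return a if Decimal(a) > Decimal(b) else b

def check_model_claim(claimed_larger: str, a: str, b: str) -> bool:
    """Return True only if the model's claimed answer matches the exact comparison."""
    return Decimal(claimed_larger) == Decimal(larger_of(a, b))
```

Here larger_of("9.11", "9.9") returns "9.9", so a model that answers "9.11" fails the check and the application can fall back to the computed value.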

Partial autonomy apps

Cursor is the template: custom GUI, orchestrated LLM calls, context management, autonomy slider (autocomplete → full-repo agent)
Perplexity follows the same pattern: auditable sources and an autonomy slider that runs from quick search to deep research
The generation–verification loop must be made as fast as possible
GUIs accelerate verification: a human can audit a visual diff far faster than a wall of generated text
Keep the AI on a short leash: small, concrete prompts; incremental diffs; avoid 1,000-line changes
Self-driving cars have taken 12 years to go from a flawless demo to near-solved; expect agents to take a similar amount of time
Build Iron Man suits (augmentation with optional autonomy), not autonomous robots
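
The autonomy slider and the short leash can be combined into a simple gate: the higher the autonomy level, the larger the change the AI may make before a human has to look. This is a sketch under assumed names and thresholds, not a description of how Cursor or any other product works.

```python
import difflib
from enum import IntEnum

class Autonomy(IntEnum):
    # Hypothetical levels, loosely modelled on the slider described above.
    AUTOCOMPLETE = 1
    EDIT_SELECTION = 2
    EDIT_FILE = 3
    EDIT_REPO = 4

# Assumed per-level cap on changed lines before a human must review.
MAX_CHANGED_LINES = {
    Autonomy.AUTOCOMPLETE: 5,
    Autonomy.EDIT_SELECTION: 30,
    Autonomy.EDIT_FILE: 150,
    Autonomy.EDIT_REPO: 400,
}

def changed_lines(before: str, after: str) -> int:
    """Count lines removed from `before` plus lines added in `after`."""
    a, b = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            total += (i2 - i1) + (j2 - j1)
    return total

def needs_human_review(before: str, after: str, level: Autonomy) -> bool:
    """True when the proposed change is too large for the current autonomy level."""
    return changed_lines(before, after) > MAX_CHANGED_LINES[level]
```

Even at the top setting the cap stays well under 1,000 lines, so a change that large is always routed to a person.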

Building for agents as a new class of user

Agents are a new consumer of digital infrastructure: neither humans using GUIs nor programs calling traditional APIs, but something human-like in between
llms.txt: a simple markdown file on your domain telling LLMs what the site is about, far more reliable than having them parse HTML
Vercel and Stripe are early movers in publishing docs in markdown; Vercel is also replacing every "click" instruction with the equivalent curl command
Agents cannot click; every instruction requiring a click is a dead end for LLMs today
Tools like Gitingest concatenate repos into LLM-readable text; DeepWiki adds AI-generated analysis on top
Meeting LLMs halfway is still worth doing even as browser-use agents improve — it is cheaper and more reliable
Model Context Protocol (Anthropic) provides a standard interface for agents to interact with services
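
As a rough illustration of the llms.txt idea, the helper below renders a site summary as markdown: a title, a one-line description, and sections of annotated links. The layout follows the general shape of the llms.txt proposal as best understood here; the function name, the site, and every URL are hypothetical.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a markdown summary of a site for LLM consumption.

    `sections` maps a heading to (link title, URL, short note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site used purely as an example.
EXAMPLE = render_llms_txt(
    "Example Shop",
    "An online store API for listing products and placing orders.",
    {
        "Docs": [
            ("Quickstart", "https://example.com/docs/quickstart.md",
             "Create an API key and place a test order with curl"),
        ],
    },
)
```

Note that the quickstart entry describes a curl command rather than a sequence of clicks, which is the form an agent can actually follow.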

Vibe coding and democratising software

Natural language as the programming interface means everyone is now a potential programmer
Karpathy built a working iOS app in Swift without knowing Swift, and a functional menu-image generator in a matter of hours
The hard part is not the code — it is DevOps, auth, payments, deployment: still click-heavy and human-oriented
Vibe coding is a gateway drug to software development, not a replacement for engineering
Making documentation LLM-readable (e.g. feeding the full Manim docs to an LLM) unlocks out-of-the-box use
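
Handing an LLM a whole documentation set usually starts with flattening it into one text. The sketch below does this for a local folder, in the spirit of the repo-concatenation tools mentioned earlier; the function name, the header format, and the choice of file suffixes are assumptions, and it makes no attempt to fit the result into a particular context window.

```python
from pathlib import Path

def concatenate_docs(root: str, suffixes: tuple[str, ...] = (".md", ".txt")) -> str:
    """Join all matching files under `root` into one prompt-ready string."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            relative = path.relative_to(root).as_posix()
            body = path.read_text(encoding="utf-8")
            # A header line per file lets the model cite where a passage came from.
            parts.append(f"===== {relative} =====\n{body}")
    return "\n\n".join(parts)
```

The resulting string can be pasted ahead of a question so the model answers from the documentation rather than from memory.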
