The original is one click away. Open original ↗
AI-driven operating systems: architecture, components, and security
Executive overview
Current operating systems use a kernel to bridge applications and hardware. As AI models grow more capable, a new OS architecture is emerging where a large language model replaces the kernel as the central orchestrator.
The framework maps every traditional OS component — CPU, RAM, storage, app store, I/O — to an LLM equivalent. Security must be rearchitected alongside, not bolted on after.
The core insight: an LLM-centric OS mirrors the structure of traditional operating systems almost exactly, down to the open/closed-source ecosystem split.
Traditional OS vs. AI OS: the component mapping
- Kernel / CPU → orchestrator LLM (OpenAI, Anthropic, Gemini, etc.)
- RAM → context window (short-term, fast-access memory)
- Virtual memory → MemGPT-style system swapping data between context window and vector DB
- External storage → vector database (slower, higher capacity)
- Software tools → Python interpreters, calculators, terminals (Software 1.0)
- App store → fine-tuned LLMs pulled in for specialist tasks (e.g. AlphaFold for code)
- Browser / internet → Bing search and equivalent retrieval integrations
- I/O devices → gesture and voice (keyboard/mouse replaced over time)
Device I/O: the shift to gesture and voice
- Apple Vision Pro demonstrates high-quality hand and eye tracking at consumer scale
- OpenAI's ChatGPT voice mode changes human-machine interaction more fundamentally than GPT-4 itself
- Humane AI Pin shows AI embedded directly in wearables, summarising emails and ambient context
- These signals converge: future OS interaction will be primarily voice and gesture, not text input
The kernel: orchestrator LLMs and what comes next
- The orchestrator LLM is the privileged core — equivalent to the kernel's ring-0 access
- Context window capacity today (~128k tokens) is roughly equivalent to Apple's 1983 computer memory — early days
- Autonomous agents: the orchestrator delegates subtasks to specialist LLMs working as a swarm, enabling macro tasks like "build a website" without step-by-step user prompting
- Reinforcement learning: future OS tasks may use reward/punishment functions so the system learns from outputs — e.g. code that compiles earns a reward; code that throws errors earns a penalty
- Version 2–3 of this OS paradigm will likely have autonomous agents handling tasks without constant user prompting
Memory: virtual memory for LLMs
- MemGPT recreates the classic OS virtual memory concept for LLMs
- Context window = RAM; vector database = disk; the system swaps data in real-time based on what the user needs
- Effect: the AI appears to have infinite memory — it can recall a book mentioned three months ago by pulling it from the vector DB into the context window on demand
Software 1.0 tools and their declining role
- Currently needed for Python execution, calculators, terminals — tasks LLMs can't yet do reliably alone
- DeepMind's FunSearch shows LLMs can now derive novel solutions to hard mathematical problems
- Q* rumours suggest LLMs approaching reliable grade-school math — a threshold that matters for tool replacement
- BLIP-2 research: two LLMs communicating directly in higher-dimensional vector space, bypassing human-readable language — faster and potentially more capable than human-mediated handoffs
- Long-run prediction: LLMs will become independent of Software 1.0 tools as reasoning and communication capabilities mature
The open/closed-source ecosystem split
- Closed-source LLMs (OpenAI, Anthropic, Gemini) mirror the role of Windows and macOS
- Open-source LLMs (Llama, Falcon, Mistral) mirror Linux — a foundation others build on
- Caveat: "open source" in LLM terms usually means weights released, not training data or code — critics call this "open model", not true open source
Security: old practices adapted for the AI OS
- Authentication: biometrics (voice, fingerprint) combined with passkeys — patents already filed to detect synthetically generated biometrics
- Software supply chain: LLMs pulled from the app store need provenance verification (analogous to SLSA for traditional software)
- Encryption: vector databases and all data in transit must be encrypted at rest, in transit, and eventually in use via homomorphic encryption
Security: practices with a new twist
- Dual LLM model: a privileged LLM (internet access, agents, storage) handles data retrieval; a quarantined LLM (no external access) performs actions — malicious payloads in retrieved data cannot phone home
- Control tokens: currently theoretical — the idea is to embed privileged tokens in the system prompt so the LLM can distinguish system instructions from user input during tokenisation, preventing prompt injection from overriding system intent
- Special tokens (existing workaround): structured delimiters in system prompts (e.g.
[code]...[/code]) help the LLM recognise where user-supplied data begins and ends
Security: net-new practices
- Input/output evaluation LLMs: separate evaluator models sit before and after the core LLM — the input evaluator screens for malicious prompts; the output evaluator checks for vulnerable code, data leakage, or policy violations
- Output evaluation is harder to bypass than input evaluation — a crafty prompt may fool the input screen but the output is processed post-execution where manipulation is constrained
- Vulnerability scanning via prompt injection red-teaming: tools like Garak and PromptMap fire large databases of known injection techniques at an LLM to identify weaknesses before deployment
More like this — when you're ready for early access.
Join the waitlist for a personal account and content recommendations based on what you're working on.
No spam. Unsubscribe at any time.
You're on the list. We'll be in touch before launch.