Why AI won't become an uncontrollable alien mind
Executive overview
The fear that AI will become smarter than we expect, and slip out of our control, rests on conflating two separate things: the language model itself and the control logic wrapped around it. A large language model is a token generator: a feed-forward network that produces one token (roughly a word or word fragment) at a time. On its own it cannot act on the world.
What makes AI systems capable of acting is the control logic layered on top. That logic is hand-coded by humans. It is fully inspectable, fully constrainable, and fully our responsibility. The risk is not emergent superintelligence — it is developers failing to code the right guardrails.
The language model is inert; the danger and the opportunity both live in the control logic we choose to build.
The alien mind fear and where it comes from
- Popular concern holds that as models scale, they may develop unexpected, uncontrollable intelligence.
- Key sources: the 2023 New York Times op-ed by Yuval Noah Harari, Tristan Harris, and Aza Raskin ("we have summoned an alien intelligence") and the Microsoft Research "Sparks of AGI" paper on GPT-4.
- The extrapolation behind the fear: each generation (GPT-3 → GPT-4 → GPT-5) becomes more capable in ways we don't understand, until safety is at stake.
- This extrapolation conflates the generative model with the full system.
What a language model actually does
- Takes input, passes it forward through layers, outputs a probability distribution over the next token.
- Pattern recognition works like a massive combinatorial checklist: the model identifies properties of the context, then applies rule-books to select the most appropriate next token.
- Can generalise to combinations of properties never seen in training — this is impressive, but still just token production.
- No matter how large the model, it cannot take action, hold state, or pursue goals. It is a "word spitter."
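To make the "token production" point concrete, here is a minimal sketch of the model's only output: a probability distribution over the next token. The vocabulary, the `toy_logits` lookup table, and every score in it are invented for illustration; a real model computes its scores from learned weights, not a hand-written table.

```python
import math
from typing import Dict, List

# Hypothetical toy vocabulary; real vocabularies hold tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "mat"]


def toy_logits(context: List[str]) -> List[float]:
    """Stand-in for the forward pass: one raw score per vocabulary entry.

    The scores are made up and depend only on the last token of the context.
    """
    table = {
        "the": [0.0, 2.0, 0.1, 1.5],
        "cat": [0.2, 0.0, 2.5, 0.1],
        "sat": [1.0, 0.1, 0.0, 0.3],
    }
    last = context[-1] if context else ""
    return table.get(last, [1.0, 1.0, 1.0, 1.0])


def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(context: List[str]) -> Dict[str, float]:
    """Everything the 'model' returns: a probability for each possible next token."""
    return dict(zip(VOCAB, softmax(toy_logits(context))))


dist = next_token_distribution(["the", "cat"])
most_likely = max(dist, key=dist.get)
```

Nothing in this function books a flight, stores memory between calls, or pursues a goal. Whatever happens with `most_likely` is decided by the code that calls it.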
The four layers of control logic
- Layer 0 (autoregression and context management): repeatedly calls the model to build a full response, and appends conversation history to each call to simulate memory. Example: basic ChatGPT.
- Layer 1 (transformation and actuation): rewrites the user's input before it reaches the model (e.g. adding live web results) and takes action based on model output (e.g. booking flights, editing a Word document). Examples: Gemini web search, early OpenAI plugins, Microsoft Copilot.
- Layer 2 (stateful planning): the control logic maintains a complex plan, makes many sequential model calls, tests outputs, and iterates. Examples: Meta's Cicero (a bot that plays the board game Diplomacy), Devin (AI coding agent).
- Layer 3 (hypothetical): a master control logic that orchestrates multiple models with persistent world-state, approaching AGI. Not yet built; no one is close.
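Layer 0 is small enough to sketch in full. The code below is an illustration under stated assumptions, not how any particular product is built: the `Model` type, the `END` marker, and `echo_model` are hypothetical stand-ins, and tokens are plain strings for readability.

```python
from typing import Callable, List

# Hypothetical interface: a model maps the tokens so far to one next token.
Model = Callable[[List[str]], str]
END = "<end>"  # assumed end-of-response marker


def generate(model: Model, prompt: List[str], max_tokens: int = 50) -> List[str]:
    """Autoregression: call the model repeatedly, feeding its own output back in."""
    tokens = list(prompt)
    reply: List[str] = []
    for _ in range(max_tokens):  # hard upper bound chosen by the developer
        token = model(tokens)
        if token == END:
            break
        tokens.append(token)
        reply.append(token)
    return reply


class ChatSession:
    """Context management: 'memory' is just the history re-sent on every turn."""

    def __init__(self, model: Model) -> None:
        self.model = model
        self.history: List[str] = []

    def send(self, user_tokens: List[str]) -> List[str]:
        self.history.extend(user_tokens)
        reply = generate(self.model, self.history)
        self.history.extend(reply)
        return reply


def echo_model(tokens: List[str]) -> str:
    """Toy stand-in model: answers 'ok' once, then signals the end."""
    return END if tokens and tokens[-1] == "ok" else "ok"


session = ChatSession(echo_model)
first_reply = session.send(["hi"])
```

The loop, the token limit, and the history list are ordinary code that a developer wrote and can read. Layers 1 and 2 add more of the same kind of code around the same model call.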
Why the control logic is the key safety lever
- Layers 0–2 are hand-coded by humans. Developers know exactly what they do.
- The language model cannot "break through" and commandeer the control logic — it can only output tokens.
- Cicero example: developers simply excluded lying moves from the simulation. Easy, because the planner is their own code.
- The runaway self-improvement scenario (layer-3 control logic that rewrites itself) requires deliberate, complex engineering that no one is currently building or knows how to build.
- Practical risks are real but mundane: missing guardrails (e.g. no spending cap → $20k Emirates booking; no resource limit → runaway compute loop). These are engineering oversights, not emergent intelligence.
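The two mundane failures named above (no spending cap, no resource limit) are both a few lines of control logic. This is a minimal sketch: `BookingActuator`, `SpendingCapExceeded`, and `run_with_step_limit` are hypothetical names, and no real booking service is called.

```python
from typing import Callable


class SpendingCapExceeded(Exception):
    """Raised when an action would push total spend past the configured cap."""


class BookingActuator:
    """Hypothetical actuator: the only code path through which money can be spent."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def book(self, description: str, price_usd: float) -> str:
        if price_usd < 0:
            raise ValueError("price must be non-negative")
        if self.spent_usd + price_usd > self.cap_usd:
            # Refuse before acting; the model's output cannot override this check.
            raise SpendingCapExceeded(
                f"{description}: ${price_usd:,.2f} would exceed the ${self.cap_usd:,.2f} cap"
            )
        self.spent_usd += price_usd
        return f"booked: {description}"


def run_with_step_limit(step: Callable[[], bool], max_steps: int) -> int:
    """Call step() until it returns True; give up after max_steps calls."""
    for count in range(1, max_steps + 1):
        if step():
            return count
    raise RuntimeError(f"no result after {max_steps} steps; stopping")


actuator = BookingActuator(cap_usd=1500.0)
confirmation = actuator.book("economy seat", 800.0)
```

With these checks in place, a $20,000 booking request raises `SpendingCapExceeded` instead of charging a card, and a planning loop that never converges stops after `max_steps` calls.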
Intentional AI (IAI)
- Cal Newport's framing: the control logic can encode enormous intention and values even when the underlying generative model is uninterpretable.
- Policy implication: developers should be held liable for their system's actuations. Liability shifts attention to the control layer — where it belongs.
- Key constraints to enforce in control logic: spending limits, no autonomous self-modification, restricted access to computational resources, no missile or weapons actuation without human oversight.
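One way the constraints listed above could be written down is as a single review function that every proposed action passes through before it is carried out. The action kinds, the limits in `Policy`, and the three verdicts are assumptions made for this sketch, not a standard or a quoted design.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"
    DENY = "deny"


@dataclass(frozen=True)
class Action:
    kind: str  # hypothetical kinds: "purchase", "compute", "modify_own_code", "launch_weapon"
    cost_usd: float = 0.0
    cpu_hours: float = 0.0


@dataclass(frozen=True)
class Policy:
    max_cost_usd: float = 500.0   # illustrative spending limit
    max_cpu_hours: float = 10.0   # illustrative compute limit


def review(action: Action, policy: Policy) -> Verdict:
    """Decide what the control logic may do with an action the model proposed."""
    if action.kind == "modify_own_code":
        return Verdict.DENY          # no autonomous self-modification
    if action.kind == "launch_weapon":
        return Verdict.NEEDS_HUMAN   # never actuated without human oversight
    if action.cost_usd > policy.max_cost_usd:
        return Verdict.DENY          # spending limit
    if action.cpu_hours > policy.max_cpu_hours:
        return Verdict.DENY          # restricted computational resources
    return Verdict.ALLOW


policy = Policy()
verdict = review(Action("purchase", cost_usd=120.0), policy)
```

Because the rules sit in one short function, they are also the natural place to audit when asking who is liable for an actuation.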
AI disinformation: the real and limited risk
- Disinformation requires two things: a large pool of potentially viral content and a recommendation algorithm to surface the stickiest items.
- LLMs expand the pool of bad information, but for high-profile topics the pool is already full. Adding mediocre AI-generated content does not displace human-crafted sticky content.
- The real risk is hyper-targeted disinformation on niche topics where the pool was previously empty (e.g. a specific county election). A low-skill actor can now produce plausible content where none existed before.
- Mitigation: the same internet literacy that has been needed for 15 years; no fundamentally new solution exists.
AI capabilities and the current form factor
- Measurement of model-to-model improvements is vague today because the "mega-oracle" chatbot is not the end-state product.
- The trajectory is toward smaller, specialised, actuated models integrated into existing tools (GitHub Copilot, Apple Intelligence, voice interfaces).
- As models become task-specific, capabilities will be more clearly enumerable.
- After 18 months, the chat-interface form factor has produced limited real-world disruption. Newport compares it to the Mosaic browser era — impressive but not yet the viral vector for adoption.
Pseudo-productivity and the mouse jiggler problem
- Knowledge work since the 1950s has used pseudo-productivity: visible activity as a proxy for useful effort, because there were no better metrics.
- Mobile connectivity made pseudo-productivity pathological: workers are now expected to demonstrate effort at all hours via email and Slack replies.
- The mouse jiggler (software that simulates mouse movement to keep Slack status "active") is the absurdist endpoint of this dynamic.
- Fix: replace pseudo-productivity with results-oriented management — fewer concurrent projects, sequential focus, quality over busyness. Outlined in Slow Productivity.
Distributed webs of trust vs. recommendation algorithms
- Recommendation algorithms + user-generated content + popularity feedback = a loop that optimises for hyper-palatable, amygdala-targeting content.
- Podcasts and email newsletters already operate on distributed webs of trust: discovery happens through human referrals, not algorithmic push. This model works.
- Breaking discovery and consumption apart: consumption is solved (RSS, podcast apps); discovery should move back toward human-to-human curation.
- Recommendation algorithms are fine in closed contexts (Netflix, Amazon) where they are not coupled with user-generated content and popularity feedback.
Workload management for overloaded teams
- Pull-based, Kanban-style project boards prevent log-jams caused by too many concurrent projects.
- Each person works on one or two things at a time; new work is pulled when a slot opens.
- Projects that sit unpulled for a month are removed, which separates momentary enthusiasm from genuine priorities.
- Transparent workload visibility replaces the hive-mind of constant reactive messaging.
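The pull rules above can be sketched as a small data structure. The `Board` class, its default limit of two active projects per person, and the 30-day cutoff are illustrative choices based on the description, not a reference to any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class Board:
    wip_limit: int = 2                          # "one or two things at a time"
    max_age: timedelta = timedelta(days=30)     # "unpulled for a month"
    backlog: Dict[str, date] = field(default_factory=dict)      # project -> date added
    active: Dict[str, List[str]] = field(default_factory=dict)  # person -> projects

    def add(self, project: str, today: date) -> None:
        """Anyone can propose work; it waits in the shared backlog."""
        self.backlog[project] = today

    def pull(self, person: str, project: str) -> bool:
        """Work starts only when the person has a free slot."""
        mine = self.active.setdefault(person, [])
        if project not in self.backlog or len(mine) >= self.wip_limit:
            return False
        del self.backlog[project]
        mine.append(project)
        return True

    def finish(self, person: str, project: str) -> None:
        """Finishing a project frees a slot for the next pull."""
        self.active[person].remove(project)

    def prune(self, today: date) -> List[str]:
        """Remove backlog items nobody pulled within max_age."""
        stale = [p for p, added in self.backlog.items() if today - added > self.max_age]
        for project in stale:
            del self.backlog[project]
        return stale


board = Board(wip_limit=1)
board.add("redesign onboarding", date(2024, 1, 1))
board.add("quarterly report", date(2024, 1, 1))
started = board.pull("ana", "redesign onboarding")
```

The board's contents are visible to everyone at once, so nobody has to message a colleague to find out what they are working on.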