Why AI won't become an uncontrollable alien mind

Executive overview

The fear that AI will become unexpectedly and uncontrollably intelligent rests on conflating two separate things: the language model itself and the control logic wrapped around it. A large language model is a token generator: a feed-forward network that produces one token at a time. On its own, it cannot act on the world.

What makes AI systems capable of acting is the control logic layered on top. That logic is hand-coded by humans. It is fully inspectable, fully constrainable, and fully our responsibility. The risk is not emergent superintelligence — it is developers failing to code the right guardrails.

The language model is inert; the danger and the opportunity both live in the control logic we choose to build.

The alien mind fear and where it comes from

  • Popular concern holds that as models scale, they may develop unexpected, uncontrollable intelligence.
  • Key sources: the Harari/Harris/Raskin 2023 NYT op-ed ("we have summoned an alien intelligence") and the Microsoft Research "Sparks of AGI" paper on GPT-4.
  • The seemingly rational extrapolation: each generation (GPT-3 → GPT-4 → GPT-5) becomes more capable in ways we don't understand, until safety is at stake.
  • This extrapolation conflates the generative model with the full system.

What a language model actually does

  • Takes input, passes it forward through layers, outputs a probability distribution over the next token.
  • Pattern recognition works like a massive combinatorial checklist: identifies properties of the context, applies rule-books to select the most appropriate next token.
  • Can generalise to combinations of properties never seen in training — this is impressive, but still just token production.
  • No matter how large the model, it cannot take action, hold state, or pursue goals. It is a "word spitter."
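
To make "outputs a probability distribution over the next token" concrete, here is a minimal sketch. The four-token vocabulary and the scores are invented for illustration; a real model computes its scores from the context with billions of parameters, but the final step has the same shape.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and made-up scores for the context "The cat sat on the".
VOCAB = ["mat", "moon", "dog", "."]
LOGITS = [3.0, 1.0, 0.5, -1.0]

def next_token_distribution():
    """Stand-in for one forward pass: returns a token -> probability mapping."""
    return dict(zip(VOCAB, softmax(LOGITS)))

dist = next_token_distribution()
most_likely = max(dist, key=dist.get)
print(most_likely, round(dist[most_likely], 3))
```

Nothing in this function books a flight or remembers a previous call. It returns numbers, and something else has to decide what to do with them.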

The four layers of control logic

  • Layer 0 (autoregression and context management): repeatedly calls the model to build a full response, and includes the conversation history in each new prompt to simulate memory. Example: basic ChatGPT.
  • Layer 1 (transformation and actuation): rewrites the user's input before it reaches the model (e.g. adding live web results) and takes action based on model output (e.g. booking flights, editing a Word document). Examples: Gemini web search, early OpenAI plugins, Microsoft Copilot.
  • Layer 2 (stateful planning): the control logic maintains a complex plan, makes many sequential model calls, tests outputs, and iterates. Examples: Meta's Cicero (a bot that plays the board game Diplomacy), Devin (an AI coding agent).
  • Layer 3 (hypothetical): a master control logic that orchestrates multiple models with persistent world-state, approaching AGI. Not yet built; no one is close.
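
A rough sketch of what layer 0 amounts to in code. The `toy_model` function is a hypothetical stub, not any vendor's API: it returns one canned token per call and has no memory of its own. The loop and the history list around it are ordinary hand-written code, which is the point.

```python
def toy_model(context: str) -> str:
    """Hypothetical stand-in for a model: full text so far in, one token out."""
    if "I am Ada" in context:
        reply = ["Your", "name", "is", "Ada."]
    else:
        reply = ["I", "do", "not", "know."]
    # Count the tokens already emitted after the final "Assistant:" marker.
    already = context.rsplit("Assistant:", 1)[1].split()
    return reply[len(already)] if len(already) < len(reply) else "<eos>"

def generate(prompt: str, max_tokens: int = 20) -> str:
    """Autoregression: call the model repeatedly, feeding its output back in."""
    tokens = []
    for _ in range(max_tokens):  # hard cap, so the loop always terminates
        token = toy_model(prompt + " " + " ".join(tokens))
        if token == "<eos>":
            break
        tokens.append(token)
    return " ".join(tokens)

class Chat:
    """Context management: replay the whole conversation on every turn."""

    def __init__(self):
        self.history = []

    def send(self, message: str) -> str:
        self.history.append(f"User: {message}")
        prompt = "\n".join(self.history) + "\nAssistant:"
        reply = generate(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply

with_history = Chat()
with_history.send("I am Ada")
print(with_history.send("What is my name?"))   # history is replayed in the prompt

fresh = Chat()
print(fresh.send("What is my name?"))          # no history, so nothing to draw on
```

The apparent memory lives entirely in `Chat.history`. Delete that list and the "model" is back to answering each prompt cold.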

Why the control logic is the key safety lever

  • Layers 0–2 are hand-coded by humans. Developers know exactly what they do.
  • The language model cannot "break through" and commandeer the control logic — it can only output tokens.
  • Cicero example: developers simply excluded lying moves from the simulation. Easy, because the planner is their own code.
  • The runaway self-improvement scenario (layer-3 control logic that rewrites itself) requires deliberate, complex engineering that no one is currently building or knows how to build.
  • Practical risks are real but mundane: missing guardrails (e.g. no spending cap → $20k Emirates booking; no resource limit → runaway compute loop). These are engineering oversights, not emergent intelligence.
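
The mundane guardrails mentioned above can be a few lines of ordinary code. The sketch below assumes a hypothetical `BookingAgent` wrapper with invented limits; it is not taken from any real agent framework. It shows a spending cap and a cap on model calls being enforced by the control logic, whatever the model proposes.

```python
from dataclasses import dataclass

class GuardrailError(Exception):
    """Raised when a proposed action violates a hard-coded limit."""

@dataclass
class BookingAgent:
    spending_cap_usd: float   # hypothetical per-session budget
    max_model_calls: int      # hypothetical limit on loop iterations
    spent_usd: float = 0.0
    calls_made: int = 0

    def record_model_call(self) -> None:
        """Call before every model invocation to stop runaway loops."""
        if self.calls_made >= self.max_model_calls:
            raise GuardrailError("model call budget exhausted")
        self.calls_made += 1

    def book(self, description: str, price_usd: float) -> str:
        """Actuation step: refuse anything that would exceed the cap."""
        if self.spent_usd + price_usd > self.spending_cap_usd:
            raise GuardrailError(f"{description} would exceed the spending cap")
        self.spent_usd += price_usd
        return f"booked: {description}"

agent = BookingAgent(spending_cap_usd=1500.0, max_model_calls=3)
print(agent.book("economy seat", 900.0))
try:
    agent.book("first-class suite", 20_000.0)
except GuardrailError as err:
    print("blocked:", err)
```

The check sits in the actuation path, so it applies regardless of what text the model produced. Leaving it out is the kind of engineering oversight the bullet describes.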

Intentional AI (IAI)

  • Newport's framing is intentional AI (IAI): the control logic can encode a great deal of intention and explicit values even when the underlying generative model is uninterpretable.
  • Policy implication: developers should be held liable for their system's actuations. Liability shifts attention to the control layer — where it belongs.
  • Key constraints to enforce in control logic: spending limits, no autonomous self-modification, restricted access to computational resources, no missile or weapons actuation without human oversight.
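
One way such constraints might be written down is as a deny-by-default policy table that the control logic consults before any actuation. The action names and the three-way decision below are illustrative assumptions, not a standard or an existing library.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"
    DENY = "deny"

# Hypothetical action names mapped to the constraints listed above.
POLICY = {
    "search_web": Decision.ALLOW,
    "provision_compute": Decision.NEEDS_HUMAN,   # restricted resource access
    "weapons_actuation": Decision.NEEDS_HUMAN,   # never without human oversight
    "modify_own_code": Decision.DENY,            # no autonomous self-modification
}

def authorize(action: str, human_approved: bool = False) -> bool:
    """Return True only if the policy permits this action right now."""
    decision = POLICY.get(action, Decision.DENY)  # unknown actions are denied
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.NEEDS_HUMAN:
        return human_approved
    return False

print(authorize("search_web"))                               # True
print(authorize("provision_compute"))                        # False
print(authorize("provision_compute", human_approved=True))   # True
print(authorize("modify_own_code", human_approved=True))     # False
print(authorize("something_new"))                            # False
```

Because the table is plain data in the developers' own code, it is also something a liability regime could point to and audit.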

AI disinformation: the real and limited risk

  • Disinformation requires two things: a large pool of potentially viral content and a recommendation algorithm to surface the stickiest items.
  • LLMs expand the pool of bad information, but for high-profile topics the pool is already full. Adding mediocre AI-generated content does not displace human-crafted sticky content.
  • The real risk is hyper-targeted disinformation on niche topics where the pool was previously empty (e.g. a specific county election). A low-skill actor can now produce plausible content where none existed before.
  • Mitigation: the same internet literacy that has been needed for 15 years; no fundamentally new solution exists.
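
The pool-plus-recommender argument can be illustrated with a toy ranking. All labels and "stickiness" scores below are invented, and `top_k` is only a stand-in for a real recommendation algorithm; the sketch shows the shape of the argument, not measured behaviour.

```python
def top_k(pool, k):
    """Stand-in for a recommender: surface the k stickiest items."""
    return sorted(pool, key=lambda item: item[1], reverse=True)[:k]

# (label, stickiness) pairs. Every value here is made up for illustration.
crowded_topic = [
    ("human-1", 0.95),
    ("human-2", 0.93),
    ("human-3", 0.90),
    ("human-4", 0.60),
]
niche_topic = []  # e.g. a specific county election with no existing content
ai_items = [("ai-1", 0.55), ("ai-2", 0.50)]

# High-profile topic: mediocre AI content never reaches the top slots.
print(top_k(crowded_topic, 3) == top_k(crowded_topic + ai_items, 3))

# Niche topic: the same AI content is now everything the recommender has.
print(top_k(niche_topic + ai_items, 3))
```

Under these assumptions the first comparison holds and the second list consists entirely of AI-generated items, which mirrors the claim that the new risk sits in previously empty pools.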

AI capabilities and the current form factor

  • Measurement of model-to-model improvements is vague today because the "mega-oracle" chatbot is not the end-state product.
  • The trajectory is toward smaller, specialised, actuated models integrated into existing tools (GitHub Copilot, Apple Intelligence, voice interfaces).
  • As models become task-specific, capabilities will be more clearly enumerable.
  • After 18 months in public use, the chat-interface form factor has produced limited real-world disruption. Newport compares it to the Mosaic browser era: impressive, but not yet the form that drives mass adoption.

Pseudo-productivity and the mouse jiggler problem

  • Knowledge work since the 1950s has used pseudo-productivity: visible activity as a proxy for useful effort, because there were no better metrics.
  • Mobile connectivity made pseudo-productivity pathological: workers are now expected to demonstrate effort at all hours via email and Slack replies.
  • The mouse jiggler (a device or program that simulates mouse movement to keep a Slack status "active") is the absurdist endpoint of this dynamic.
  • Fix: replace pseudo-productivity with results-oriented management (fewer concurrent projects, sequential focus, quality over busyness), as outlined in Newport's Slow Productivity.

Distributed webs of trust vs. recommendation algorithms

  • Recommendation algorithms + user-generated content + popularity feedback = a loop that optimises for hyper-palatable, amygdala-targeting content.
  • Podcasts and email newsletters already operate on distributed webs of trust: discovery happens through human referrals, not algorithmic push. This model works.
  • Breaking discovery and consumption apart: consumption is solved (RSS, podcast apps); discovery should move back toward human-to-human curation.
  • Recommendation algorithms are fine in closed contexts (Netflix, Amazon) where they are not coupled with user-generated content and popularity feedback.

Workload management for overloaded teams

  • Pull-based, Kanban-style project boards prevent log-jams caused by too many concurrent projects.
  • Each person works on one or two things at a time; new work is pulled when a slot opens.
  • Projects that sit unpulled for a month are removed, which separates momentary enthusiasm from genuine priorities.
  • Transparent workload visibility replaces the hive-mind of constant reactive messaging.
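
The pull rules above are simple enough to sketch as a small data structure. The `PullBoard` class, its two-slot limit, and its 30-day expiry are assumptions chosen to match the bullets, not a description of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class PullBoard:
    wip_limit: int = 2                            # slots per person
    max_wait: timedelta = timedelta(days=30)      # "unpulled for a month"
    backlog: list = field(default_factory=list)   # (title, date_added) pairs
    active: dict = field(default_factory=dict)    # person -> list of titles

    def add(self, title: str, today: date) -> None:
        self.backlog.append((title, today))

    def prune(self, today: date) -> None:
        """Drop projects nobody pulled within max_wait."""
        self.backlog = [(t, d) for t, d in self.backlog
                        if today - d <= self.max_wait]

    def pull(self, person: str, today: date):
        """Hand over the oldest waiting project if the person has a free slot."""
        self.prune(today)
        mine = self.active.setdefault(person, [])
        if len(mine) >= self.wip_limit or not self.backlog:
            return None
        title, _ = self.backlog.pop(0)
        mine.append(title)
        return title

    def finish(self, person: str, title: str) -> None:
        self.active[person].remove(title)

board = PullBoard()
start = date(2024, 1, 1)
for name in ("report", "migration", "redesign"):
    board.add(name, start)

print(board.pull("sam", start))   # report
print(board.pull("sam", start))   # migration
print(board.pull("sam", start))   # None: both slots are full
```

Work enters a person's list only through `pull`, so nobody can be handed a third concurrent project, and anything left waiting past `max_wait` quietly drops off the board.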
