Why AI will not automate most office jobs within 12 months


Executive overview

Microsoft AI CEO Mustafa Suleyman claimed in February that AI would fully automate most white-collar knowledge work within 12 to 18 months. This prediction is an outlier: it is contradicted by other AI leaders, unsupported by the actual pace of LLM progress, and constrained by fundamental technical limits.

LLMs are story completers with harnesses on top — not general reasoning engines — and that architecture cannot automate most knowledge work at the pace Suleyman claims.

Why other AI leaders disagree

  • Dario Amodei's most pessimistic estimate: 50% of entry-level white-collar jobs affected within five years, far less drastic than Suleyman's claim.
  • Jensen Huang (Nvidia) argues the narrative of AI destroying jobs is "false" and counterproductive.
  • Huang's analogy: AI tools will change jobs the way computers did in the 1990s — not wholesale replace them.
  • Nvidia's own engineering teams use extensive AI tooling and are hiring more engineers than ever.

Why progress is too slow

  • Since late 2024, LLM improvements have been incremental — benchmark gains, not functional leaps.
  • The GPT-2 to GPT-4 era of obvious capability jumps is over; scaling alone stopped yielding new abilities.
  • Post-training and fine-tuning — the current dominant approach — only works well where highly structured datasets exist (math, code).
  • Recent model releases show one step forward, one step back: Opus 4.7 widely reported as a regression; GPT-5 improvements described as polish, not breakthroughs.

Why coding agents don't generalise

  • The rise of coding agents was driven by years of work on the coding harness — the non-AI software wrapping the LLM — not by smarter models alone.
  • Harnesses for coding work because the task space is narrow, verifiable, and the companies building AI tools are experts in software.
  • Replicating this for other job types requires dedicated multi-year teams per domain — those teams don't exist.
  • For Suleyman's prediction to hold, thousands of such teams would need to be working in parallel right now on every major knowledge work category.
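
The harness idea above can be sketched as a generate, verify, retry loop. This is a minimal illustration under stated assumptions, not any vendor's actual harness: run_harness, fake_model, and run_tests are hypothetical names, and fake_model is a hard-coded stand-in for an LLM call. The loop only works because run_tests gives a cheap, unambiguous pass or fail signal, which is what coding offers and most other knowledge work does not.

```python
from typing import Callable, Optional


def run_harness(
    propose: Callable[[str, Optional[str]], str],
    verify: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for a candidate, check it, and feed any failure message back."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose(task, feedback)  # stand-in for an LLM call
        feedback = verify(candidate)         # None means the check passed
        if feedback is None:
            return candidate
    return None  # out of attempts: hand the task back to a human


def fake_model(task: str, feedback: Optional[str]) -> str:
    # Hypothetical model: the first answer is buggy, the second is corrected
    # once a failure message has been supplied.
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


def run_tests(source: str) -> Optional[str]:
    # The verifier: execute the candidate and test it against a known answer.
    # (exec on untrusted text is unsafe; acceptable only in a toy sketch.)
    namespace: dict = {}
    exec(source, namespace)
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should be 5"
    return None


result = run_harness(fake_model, run_tests, "write add(a, b)")
print(result)
```

In this sketch the second attempt passes because the verifier caught the first one. For a task such as a strategy memo there is no obvious equivalent of run_tests, which is the gap each new domain-specific harness would have to fill.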

Technical limits of LLMs

  • An LLM predicts the next token autoregressively — it is a story completer, not a planner with a world model.
  • Scaling hit a wall by mid-2024; further gains now require task-specific fine-tuning on structured data.
  • LLM-generated plans sound reasonable but are not reliably correct: the model has no internal simulation, no hard-coded consistency checks, and no ability to evaluate outcomes.
  • Non-coding agents fail because ambiguous knowledge work produces "reasonable sounding" plans that agents execute blindly and incorrectly.
  • Even coding agents require constant human tweaking; most knowledge workers lack the technical skills to manage this.
  • OpenAI quietly scaled back non-coding agent projects in late 2024 because they weren't working.
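
The "story completer" point above can be illustrated with a toy autoregressive loop. This is a deliberately tiny sketch, assuming a word-level bigram table in place of a real transformer; train_bigrams and complete are hypothetical names. It shows only the shape of the process: each step appends the most likely next token given what came before, and nothing in the loop checks whether the resulting "plan" would work.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count which word follows which in the training text."""
    counts: dict = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def complete(counts: dict, prompt: str, max_new_tokens: int = 5) -> str:
    """Greedy autoregressive decoding: always append the likeliest next word."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing ever followed this word in training
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


corpus = "the plan sounds reasonable . the plan sounds reasonable . the plan fails"
model = train_bigrams(corpus)
story = complete(model, "the", max_new_tokens=3)
print(story)  # the plan sounds reasonable
```

The toy corpus contains "the plan fails", yet the completion never surfaces it, because the loop follows frequency rather than evaluating outcomes. A real LLM is vastly more capable than a bigram table, but the generation loop has the same one-token-at-a-time form.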

Where LLMs are genuinely useful today

  • Sifting moderate amounts of text for summaries or examples — attention layers make this a real strength.
  • Reformatting and cleaning data: consumer comments into bullets, messy spreadsheet data into structured output.
  • Coding agents generating small scripts to process large datasets precisely — useful for technical users.
  • Better-than-Google search: query plus LLM summary is genuinely valuable for information retrieval.
  • Calendar and appointment management — narrow, well-structured, good fit for LLM reasoning.
  • Email filtering using natural language rules — works well despite token cost.
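
As an illustration of the "small scripts" point above, here is the kind of short, checkable script a coding agent might produce for cleaning messy spreadsheet data. It is a sketch under assumptions: the column names name and amount, the sample rows, and the function clean_rows are all hypothetical. The arithmetic is done by ordinary code rather than by the model, which is where the precision comes from.

```python
import csv
import io


def clean_rows(raw_csv: str) -> list:
    """Normalise names and parse currency amounts, skipping rows with no amount."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = []
    for row in reader:
        # Collapse stray whitespace and fix capitalisation.
        name = " ".join(row["name"].split()).title()
        # Strip currency symbols and thousands separators before parsing.
        amount_text = row["amount"].replace("$", "").replace(",", "").strip()
        if not amount_text:
            continue  # no amount recorded for this row
        cleaned.append({"name": name, "amount": float(amount_text)})
    return cleaned


raw = (
    "name,amount\n"
    '  alice  SMITH ,"$1,200.50"\n'
    "bob jones, 300\n"
    "carol white,\n"
)
rows = clean_rows(raw)
total = sum(r["amount"] for r in rows)
print(rows, total)
```

A technical user can read a script like this and confirm what it does before trusting the totals, which is harder to do when the model is asked to compute the answer directly.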

What LLMs shouldn't be used for

  • Writing slide decks or emails on behalf of the user — if an LLM can write it, the content probably didn't need to exist.
  • Refining thinking: LLMs are sycophantic, prone to hallucination, and emotionally manipulative — poor substitutes for reading, writing, or talking to real people.

The conspiracy

  • The Suleyman Financial Times interview clip making the 12–18 month claim has been quietly edited out of the official FT video.
  • The edit is visible as an awkward jump cut mid-interview.
  • The clip had already been widely copied, clipped, and reported on — so the edit was too late.
  • Most likely explanation: the claim was recognised as too extreme after the fact and removed to limit reputational damage.
  • Broader pattern: AI CEOs benefit from making dramatic job-displacement claims — it attracts investors and elevates the perceived stakes of their work, with little accountability when predictions don't materialise.
