Why AI fails and how to diagnose it in four layers

Executive overview

Most AI failures trace back to one of four causes: bad input, bad context, the wrong model, or a genuine capability wall. The failure stack is a four-layer diagnostic that you work through in order, from Layer 1 (input) up to Layer 4 (the wall).

Fixing the first two layers resolves most problems, roughly 80–90% by this guide's estimate, before you need to touch model selection.

The failure stack works because it forces you to separate what you gave the AI (the input) from what you asked it to do (the context), two sources of error that most people conflate.
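
The ordering can be sketched in a few lines of Python. This is only an illustration: the layer names and the `first_layer_to_fix` helper are hypothetical, and "ruled out" is assumed to mean you have optimised that layer and the output is still bad.

```python
from typing import Dict

# Layers in the order the diagnostic works through them.
LAYERS = ("input", "context", "model", "wall")


def first_layer_to_fix(ruled_out: Dict[str, bool]) -> str:
    """Return the first layer that has not been ruled out yet.

    `ruled_out` maps a layer name to True once that layer has been
    optimised and the output is still bad. Missing keys count as
    not yet checked.
    """
    for layer in LAYERS[:-1]:
        if not ruled_out.get(layer, False):
            return layer
    # Input, context, and model are all ruled out: likely a capability limit.
    return "wall"


print(first_layer_to_fix({"input": True}))
```

Because the loop always starts from the input layer, a skipped layer is reported before any later one, which mirrors the rule of working through the stack in order.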

Layer 1: Input — file type, complexity, and size

  • AI does not always read every part of a file; it may silently skip sections.
  • Spot-check by asking the AI to list all headings and describe images — gaps reveal what it missed.
  • If only a subset of a document matters, give it only that subset, or name the specific pages.
  • If relevant content is scattered throughout, focus attention by topic, not page range.
  • Reduce hallucination with two grounding sentences appended to any prompt:
    1. "Base your answers only on what's in the file I gave you."
    2. "If you're unsure or the file doesn't contain enough information, say so instead of guessing."
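
As a minimal sketch, the two grounding sentences and the heading spot-check could be wrapped in small helpers. The function names `add_grounding` and `missing_headings` are made up for illustration, and the sketch assumes you already know which headings the file contains.

```python
from typing import List

# The two grounding sentences quoted in the list above.
GROUNDING_SENTENCES = (
    "Base your answers only on what's in the file I gave you.",
    "If you're unsure or the file doesn't contain enough information, "
    "say so instead of guessing.",
)


def add_grounding(prompt: str) -> str:
    """Append any grounding sentence the prompt does not already contain."""
    base = prompt.rstrip()
    missing = [s for s in GROUNDING_SENTENCES if s not in base]
    if not missing:
        return base
    return base + "\n\n" + " ".join(missing)


def missing_headings(expected: List[str], reported: List[str]) -> List[str]:
    """Return the headings you know exist that the AI did not list."""
    seen = {h.strip().lower() for h in reported}
    return [h for h in expected if h.strip().lower() not in seen]


print(add_grounding("Summarise the attached report."))
print(missing_headings(["Intro", "Methods", "Results"], ["intro", "Results"]))
```

Any heading returned by `missing_headings` points to a section the AI may have skipped, which is a cue to resend only that section or name its pages.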

Layer 2: Context — prompt quality and task complexity

  • Treat the AI like a smart new employee: a vague sticky note produces a vague answer.
  • Use the WWH framework to structure any prompt: What (exact task), Why (your intent), How (constraints — format, tone, length).
  • Providing intent (the "Why") lets the AI fill gaps you forgot to specify.
  • If a task has 10 or more steps and the AI fails, break it into chunks of three steps and work through them sequentially.
  • If three steps are still too much, drop to one step at a time.
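
The WWH structure and the chunking rule above can be sketched as follows. The `WWHPrompt` class and `chunk_steps` function are hypothetical names, and the "What / Why / How" labels in the rendered prompt are one possible layout, not a required format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WWHPrompt:
    what: str  # the exact task
    why: str   # your intent
    how: str   # constraints: format, tone, length

    def render(self) -> str:
        """Lay the three parts out as one prompt."""
        return f"What: {self.what}\nWhy: {self.why}\nHow: {self.how}"


def chunk_steps(steps: List[str], size: int = 3) -> List[List[str]]:
    """Split a long task into consecutive chunks of `size` steps."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [steps[i:i + size] for i in range(0, len(steps), size)]


prompt = WWHPrompt(
    what="Summarise the attached meeting notes.",
    why="I need to brief a colleague who missed the meeting.",
    how="Five bullet points, neutral tone, under 120 words.",
)
print(prompt.render())

steps = [f"step {n}" for n in range(1, 11)]
for chunk in chunk_steps(steps):
    print(chunk)
```

If chunks of three still fail, calling `chunk_steps(steps, size=1)` gives one step per chunk, matching the fallback described above.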

Layer 3: Model — using the right tool

  • Every model has distinct strengths; using the wrong one is like using a screwdriver to hammer a nail.
  • For complex PDFs with OCR, images, and annotations, Gemini 2.0 Pro tends to outperform others.
  • For writing that matches a specific tone, Claude Opus tends to outperform GPT models.
  • If a prompt fails consistently on one model, paste the identical prompt and input into another before concluding AI can't do it.
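
One way to run the same prompt and input against several models is sketched below. No real vendor API is used: each model is represented by a hypothetical callable that takes the full text and returns a reply, which you would replace with your own client code.

```python
from typing import Callable, Dict


def compare_models(
    prompt: str,
    file_text: str,
    models: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Send the identical prompt and input to every model and collect replies."""
    full_input = f"{prompt}\n\n---\n{file_text}"
    results: Dict[str, str] = {}
    for name, ask in models.items():
        try:
            results[name] = ask(full_input)
        except Exception as exc:
            # Record the failure and keep going so one model cannot block the rest.
            results[name] = f"ERROR: {exc}"
    return results


# Stand-in callables for illustration only; swap in real client calls.
stub_models = {
    "model_a": lambda text: f"model_a saw {len(text)} characters",
    "model_b": lambda text: f"model_b saw {len(text)} characters",
}
for name, reply in compare_models("Summarise.", "Document body", stub_models).items():
    print(name, "->", reply)
```

Keeping the prompt and input fixed means any difference in the replies comes from the model, not from a change in what you asked.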

Layer 4: The wall — actual capability limits

  • If you've optimised input and context, tried multiple models, and still get consistently bad output, you've hit a genuine wall.
  • Break the task into subtasks and categorise each: AI can do this / AI cannot do this.
  • Outsource the subtasks AI handles well; add the rest to an AI wishlist.
  • Revisit the wishlist monthly — new models and features ship constantly, and a task that was impossible six months ago may be solved today.
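
The categorise-and-revisit routine above could be tracked with a small script like this one. The `Subtask` fields, the helper names, and the 30-day interval (standing in for "monthly") are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Tuple


@dataclass
class Subtask:
    name: str
    ai_can_do: bool
    last_tested: date  # when you last tried this subtask with an AI


def split_subtasks(subtasks: List[Subtask]) -> Tuple[List[Subtask], List[Subtask]]:
    """Separate subtasks the AI handles from those that go on the wishlist."""
    delegate = [s for s in subtasks if s.ai_can_do]
    wishlist = [s for s in subtasks if not s.ai_can_do]
    return delegate, wishlist


def due_for_retest(wishlist: List[Subtask], today: date, every_days: int = 30) -> List[Subtask]:
    """Return wishlist items not tested within the last `every_days` days."""
    cutoff = today - timedelta(days=every_days)
    return [s for s in wishlist if s.last_tested <= cutoff]


tasks = [
    Subtask("draft summary", True, date(2025, 1, 1)),
    Subtask("verify figures", False, date(2025, 1, 1)),
    Subtask("redraw chart", False, date(2025, 2, 20)),
]
delegate, wishlist = split_subtasks(tasks)
for item in due_for_retest(wishlist, today=date(2025, 3, 1)):
    print("Retest:", item.name)
```

When a retested item starts working, set `ai_can_do` to True so it moves off the wishlist the next time the list is split.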
