Why AI fails and how to diagnose it in four layers
Executive overview
Most AI failures trace back to one of four causes: bad input, bad context, the wrong model, or a genuine capability wall. The failure stack is a four-layer diagnostic you work through in order, from the bottom (Layer 1) to the top (Layer 4).
Fix the first two layers and you resolve 80–90% of problems before touching model selection.
The failure stack works because it forces you to separate what you gave the AI from what you asked it to do, two sources of error that most people conflate.
Layer 1: Input — file type, complexity, and size
- AI does not always read every part of a file; it may silently skip sections.
- Spot-check by asking the AI to list all headings and describe images — gaps reveal what it missed.
- If only a subset of a document matters, give it only that subset, or name the specific pages.
- If relevant content is scattered throughout, focus attention by topic, not page range.
- Reduce hallucination with two grounding sentences appended to any prompt:
  - "Base your answers only on what's in the file I gave you."
  - "If you're unsure or the file doesn't contain enough information, say so instead of guessing."
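The grounding step is mechanical enough to automate. The following is a minimal sketch, assuming you build prompts as plain strings before sending them to whichever tool you use. The names `with_grounding` and `spot_check_prompt` are hypothetical helpers invented for illustration. Only the two quoted sentences come from the list above.

```python
# The two grounding sentences, quoted from the Layer 1 list.
GROUNDING_SENTENCES = (
    "Base your answers only on what's in the file I gave you.",
    "If you're unsure or the file doesn't contain enough information, "
    "say so instead of guessing.",
)


def with_grounding(prompt: str) -> str:
    """Return the prompt with both grounding sentences appended."""
    return prompt.rstrip() + "\n\n" + " ".join(GROUNDING_SENTENCES)


def spot_check_prompt() -> str:
    """Build a prompt for the heading-and-image spot check (hypothetical wording)."""
    return with_grounding(
        "List every heading in the attached file and describe each image."
    )


if __name__ == "__main__":
    print(with_grounding("Summarise the findings in section 2."))
```

Comparing the reply to `spot_check_prompt()` against the real document is one way to see which sections were skipped.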
Layer 2: Context — prompt quality and task complexity
- Treat the AI like a smart new employee: a vague sticky note produces a vague answer.
- Use the WWH framework to structure any prompt: What (exact task), Why (your intent), How (constraints — format, tone, length).
- Stating your intent (the "Why") lets the AI fill in details you forgot to specify.
- If a task has 10+ steps and the AI fails, break it into chunks of 3 steps and work through them sequentially.
- If 3 steps is still too much, drop to 1 step at a time.
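The WWH structure and the chunking rule can both be sketched in a few lines. This is an illustrative sketch, not a prescribed format: the `WWHPrompt` class, its field labels, and `chunk_steps` are assumptions introduced here. Only the What / Why / How split and the chunk sizes of 3 and 1 come from the text.

```python
from dataclasses import dataclass


@dataclass
class WWHPrompt:
    what: str  # the exact task
    why: str   # your intent
    how: str   # constraints: format, tone, length

    def render(self) -> str:
        """Lay the three parts out as one labelled prompt."""
        return f"What: {self.what}\nWhy: {self.why}\nHow: {self.how}"


def chunk_steps(steps: list[str], size: int = 3) -> list[list[str]]:
    """Split a long task into consecutive chunks of `size` steps."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [steps[i:i + size] for i in range(0, len(steps), size)]


if __name__ == "__main__":
    prompt = WWHPrompt(
        what="Draft a reply to the attached customer complaint.",
        why="We want to keep this customer and acknowledge the delay.",
        how="Under 150 words, warm but professional, plain text.",
    )
    print(prompt.render())

    steps = [f"step {n}" for n in range(1, 11)]
    for chunk in chunk_steps(steps):       # chunks of 3, 3, 3, and 1
        print(chunk)
```

If chunks of 3 still fail, calling `chunk_steps(steps, size=1)` gives the one-step-at-a-time fallback.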
Layer 3: Model — using the right tool
- Every model has distinct strengths; using the wrong one is like using a screwdriver to hammer a nail.
- For complex PDFs with OCR, images, and annotations, Gemini 2.0 Pro tends to outperform others.
- For writing that must match a specific tone, Claude Opus tends to outperform GPT models.
- If a prompt fails consistently on one model, paste the identical prompt and input into another before concluding AI can't do it.
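The cross-model check in the last bullet amounts to holding the prompt constant and varying only the model. Here is a hedged sketch of that idea. It deliberately avoids any real vendor SDK: each "model" is assumed to be a function you write yourself that takes a prompt string and returns text. `ModelFn` and `run_on_all` are hypothetical names.

```python
from typing import Callable, Dict

# A "model" is any function that takes a prompt and returns text.
# Real client calls are left out on purpose; wrap whichever SDK you use.
ModelFn = Callable[[str], str]


def run_on_all(prompt: str, models: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send the identical prompt to every model and collect the outputs."""
    results: Dict[str, str] = {}
    for name, ask in models.items():
        try:
            results[name] = ask(prompt)
        except Exception as exc:  # keep going if one provider fails
            results[name] = f"<error: {exc}>"
    return results


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without network access.
    stubs: Dict[str, ModelFn] = {
        "model_a": lambda p: f"A saw {len(p)} characters",
        "model_b": lambda p: f"B saw {len(p)} characters",
    }
    for name, output in run_on_all("Rewrite this in a friendly tone.", stubs).items():
        print(f"{name}: {output}")
```

Because the prompt and input are identical across calls, any difference in quality points at Layer 3 rather than at Layers 1 or 2.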
Layer 4: The wall — actual capability limits
- If you've optimised input and context, tried multiple models, and still get consistently bad output, you've hit a genuine wall.
- Break the task into subtasks and categorise each: AI can do this / AI cannot do this.
- Outsource the subtasks AI handles well; add the rest to an AI wishlist.
- Revisit the wishlist monthly — new models and features ship constantly, and a task that was impossible six months ago may be solved today.
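The triage-and-wishlist routine can be tracked with a very small data structure. The sketch below is one possible shape, under the assumption that you judge each subtask yourself and record the verdict. `Subtask`, `Wishlist`, `triage`, and the 30-day default are illustrative choices, with the interval standing in for "monthly".

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Subtask:
    name: str
    ai_can_do: bool  # your own verdict after testing


@dataclass
class Wishlist:
    items: list[str] = field(default_factory=list)
    last_reviewed: date = field(default_factory=date.today)

    def due_for_review(self, today: date, interval_days: int = 30) -> bool:
        """True once at least `interval_days` have passed since the last review."""
        return today - self.last_reviewed >= timedelta(days=interval_days)


def triage(subtasks: list[Subtask]) -> tuple[list[str], list[str]]:
    """Return (names to hand to the AI, names for the wishlist)."""
    delegate = [s.name for s in subtasks if s.ai_can_do]
    wishlist = [s.name for s in subtasks if not s.ai_can_do]
    return delegate, wishlist


if __name__ == "__main__":
    delegate, pending = triage([
        Subtask("extract table values", ai_can_do=True),
        Subtask("interpret handwritten margin notes", ai_can_do=False),
    ])
    wishlist = Wishlist(items=pending)
    print("Outsource to AI:", delegate)
    print("Wishlist:", wishlist.items)
    print("Review due:", wishlist.due_for_review(date.today()))
```

When a review comes due, rerun the wishlist items against current models and move any that now succeed into the delegated set.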