Three context engineering mistakes that degrade AI performance

Executive overview

Bad context degrades even the best models: in the research cited below, accuracy fell from 98% to 64%, a loss of roughly a third. The three failure modes are context clashing (contradictory inputs), context confusion (too many tools), and context distraction (bloated windows that cause repetitive behaviour).

Each mistake has a direct fix: front-load coherent context, limit and dynamically load tools, and offload or summarise history rather than letting it accumulate.

Garbage in, garbage out still holds — and a bloated context window is a form of garbage.

Context clashing: contradictory inputs hurt accuracy

  • Incrementally adding context across a long conversation introduces contradictions that degrade responses.
  • Research showed accuracy dropping from 98% to 64% on high-quality models when they were given contradictory, sharded input.
  • In vibe coding, repeated failed attempts compound: the longer the conversation runs, the worse the fixes get.
  • Fix: front-load all relevant context in a single message rather than drip-feeding it.
  • For complex inputs, run a brain dump through a summarising LLM first to detect and flag contradictions before passing to the working model.
  • Always audit system prompts for conflicting instructions — modern models follow instructions precisely, so any contradiction will be acted on.
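
As a rough illustration of the front-loading fix, the sketch below merges notes that would otherwise be drip-fed over several turns into one message. The function name `front_load` and the header text are hypothetical, and the deduplication is deliberately naive: it drops exact repeats only and does not detect contradictions, which would still need the summarising pass described above.

```python
def front_load(fragments: list[str]) -> str:
    """Merge drip-fed notes into one message, dropping exact repeats."""
    seen: set[str] = set()
    unique: list[str] = []
    for fragment in fragments:
        text = " ".join(fragment.split())  # normalise whitespace
        key = text.lower()                 # compare case-insensitively
        if text and key not in seen:
            seen.add(key)
            unique.append(text)
    # Hypothetical header; any wording that introduces the full context works.
    lines = ["Everything relevant to this task:"]
    lines += [f"{i}. {text}" for i, text in enumerate(unique, start=1)]
    return "\n".join(lines)
```

Sending the returned string as the first message gives the working model the whole picture at once instead of a sequence of partial, possibly conflicting updates.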

Context confusion: too many tools cause wrong calls

  • As the number of available tools grows, the rate of incorrect tool selection increases measurably.
  • The practical ceiling for today's models is around 20 tools; beyond that, accuracy degrades noticeably.
  • Similarity between tool names compounds the problem — the model struggles to distinguish near-identical options.
  • Fix 1 — tool loadout: only inject the tools relevant to the current task, not the full set.
  • RAG-MCP implements this: a smart retriever uses semantic search to pull 5–15 matching tools from an external database, then passes only those to the working model.
  • Fix 2 — context offloading: keep long-term or infrequently needed data outside the context window and retrieve it on demand.
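
The tool loadout idea can be sketched in a few lines. This is not the RAG-MCP implementation: it swaps semantic search for a plain word-overlap (Jaccard) score so the example stays self-contained, and `select_tools` and the tool registry format (name mapped to description) are assumptions made for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tools(task: str, tools: dict[str, str], k: int = 5) -> list[str]:
    """Return up to k tool names whose descriptions best overlap the task."""
    task_words = tokenize(task)
    scored = []
    for name, description in tools.items():
        words = tokenize(description)
        union = task_words | words
        # Jaccard similarity: a stand-in for a real embedding-based search.
        score = len(task_words & words) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]
```

Only the names returned here would have their definitions injected into the working model's context; the rest of the registry stays outside the window.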

Context distraction: long windows cause repetitive behaviour

  • Large context windows are not inherently better — Gemini 2.5 Pro begins repeating past actions once the context exceeds 100k tokens.
  • The model fixates on previous steps rather than generating novel solutions, producing a loop of ineffective repetition.
  • Fix: summarise previous context through a separate LLM before passing it to the working model, keeping only what is relevant.
  • In vibe coding: start a fresh conversation for each new feature or unresolved bug; summarise what was achieved and carry only that forward.
  • Offloading also applies here — externalise history and retrieve selectively rather than accumulating it inline.
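
A minimal sketch of the summarise-then-continue fix, under stated assumptions: `compact_history` is a hypothetical helper, `summarise` stands for whatever separate LLM call produces the summary, and tokens are estimated by counting whitespace-separated words, which is only a crude proxy for a real tokeniser.

```python
from typing import Callable


def compact_history(
    turns: list[str],
    summarise: Callable[[str], str],
    max_tokens: int = 1000,
    keep_recent: int = 2,
) -> list[str]:
    """Replace older turns with one summary once the history is too long."""

    def count(text: str) -> int:
        return len(text.split())  # crude token estimate (assumption)

    split = max(len(turns) - keep_recent, 0)
    older, recent = turns[:split], turns[split:]
    # Leave the history alone if it fits the budget or nothing is old enough.
    if not older or sum(count(t) for t in turns) <= max_tokens:
        return list(turns)
    summary = summarise("\n".join(older))
    return [f"Summary of earlier conversation: {summary}"] + recent
```

The working model then sees one summary line plus the most recent turns, rather than every earlier step it might fixate on.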

Context offloading: shared fix across confusion and distraction

  • ChatGPT memory: saves user-specific context externally and retrieves it when relevant, avoiding persistent bloat.
  • Claude's think tool: routes extended reasoning to an external scratch pad; only the distilled output re-enters the main context window.
  • To-do / planning documents: a macro plan lives outside the conversation; the model checks it off task by task, starting a fresh context window for each, with only a compressed summary of prior progress carried forward.
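
The planning-document pattern might look like the sketch below. The JSON layout (`summary` plus a `tasks` list with `title` and `done` fields) and the `run_task` callback are assumptions for illustration; `run_task` represents starting a fresh model conversation for one task and returning a one-line summary of what was done.

```python
import json
from pathlib import Path
from typing import Callable


def run_plan(plan_path: Path, run_task: Callable[[str, str], str]) -> None:
    """Work through a plan file, using one fresh context per task."""
    plan = json.loads(plan_path.read_text())
    for task in plan["tasks"]:
        if task["done"]:
            continue
        # The only context carried over is the running progress summary.
        result = run_task(task["title"], plan["summary"])
        task["done"] = True
        # Naive accumulation; a real version might re-summarise to stay short.
        plan["summary"] = (plan["summary"] + " " + result).strip()
        # Persist after every task so the plan survives outside the conversation.
        plan_path.write_text(json.dumps(plan, indent=2))
```

Because the plan and summary live on disk, each task starts from a small, known context instead of the full transcript of everything that came before.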
