Three context engineering mistakes that degrade AI performance
Executive overview
Bad context degrades even the best models: accuracy can drop by 30 percentage points or more (one cited result fell from 98% to 64%). The three failure modes are context clashing (contradictory inputs), context confusion (too many tools), and context distraction (bloated windows that cause repetitive behaviour).
Each mistake has a direct fix: front-load coherent context, limit and dynamically load tools, and offload or summarise history rather than letting it accumulate.
Garbage in, garbage out still holds — and a bloated context window is a form of garbage.
Context clashing: contradictory inputs hurt accuracy
- Incrementally adding context across a long conversation introduces contradictions that degrade responses.
- Research showed accuracy dropping from 98% to 64% on high-quality models when the input was sharded across turns and contained contradictions.
- In vibe coding, repeated failed attempts compound: the longer the conversation runs, the worse the fixes get.
- Fix: front-load all relevant context in a single message rather than drip-feeding it.
- For complex inputs, run the brain dump through a summarising LLM first so that contradictions are detected and flagged before the result is passed to the working model.
- Always audit system prompts for conflicting instructions — modern models follow instructions precisely, so any contradiction will be acted on.
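The front-loading and pre-summarising fixes above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `front_load`, `echo_reviewer`, and the wording of the instructions are all hypothetical, and `review` stands in for whatever LLM call you would actually use.

```python
from typing import Callable, List


def front_load(fragments: List[str], review: Callable[[str], str]) -> str:
    """Merge drip-fed fragments into one brief and review it in a single pass.

    `review` is a hypothetical stand-in for a summarising LLM call.
    """
    # Collapse the fragments into one bulleted brain dump, skipping blanks.
    brain_dump = "\n".join(f"- {f.strip()}" for f in fragments if f.strip())
    instructions = (
        "Rewrite the notes below as one coherent brief. "
        "List any statements that contradict each other under 'CONFLICTS'.\n\n"
    )
    # One message carries everything, instead of one message per fragment.
    return review(instructions + brain_dump)


def echo_reviewer(prompt: str) -> str:
    """Placeholder reviewer that returns the prompt unchanged (no model involved)."""
    return prompt


if __name__ == "__main__":
    notes = ["Use Postgres for storage", "   ", "Use SQLite for storage"]
    print(front_load(notes, echo_reviewer))
```

Swapping `echo_reviewer` for a real model call is the only change needed; the working model then receives the reviewed brief rather than the raw fragments.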
Context confusion: too many tools cause wrong calls
- As the number of available tools grows, the rate of incorrect tool selection increases measurably.
- The practical ceiling for today's models is around 20 tools; beyond that, accuracy degrades noticeably.
- Similarity between tool names compounds the problem — the model struggles to distinguish near-identical options.
- Fix 1 — tool loadout: only inject the tools relevant to the current task, not the full set.
- RAG-MCP implements this: a retriever uses semantic search to pull the 5–15 best-matching tools from an external database, then passes only those to the working model.
- Fix 2 — context offloading: keep long-term or infrequently needed data outside the context window and retrieve it on demand.
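A tool loadout can be sketched as a small retrieval step in front of the model. The example below is an assumption-heavy illustration: it scores tools by keyword overlap rather than the semantic search the text describes, and `select_tools`, `CATALOG`, and `STOPWORDS` are hypothetical names, not part of RAG-MCP.

```python
import re
from typing import Dict, List, Set

# Small illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "to", "for", "from", "of"}


def _tokens(text: str) -> Set[str]:
    """Lowercase word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def select_tools(task: str, catalog: Dict[str, str], k: int = 5) -> List[str]:
    """Return up to k tool names whose descriptions overlap with the task.

    Keyword overlap is a stand-in for semantic search (an assumption of this sketch).
    """
    task_tokens = _tokens(task)
    scored = []
    for name, description in catalog.items():
        overlap = len(task_tokens & _tokens(description))
        if overlap > 0:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical tool catalog kept outside the context window.
CATALOG = {
    "send_email": "Send an email message to a recipient",
    "create_invoice": "Create an invoice for a customer order",
    "search_flights": "Search available flights between two airports",
    "book_hotel": "Book a hotel room for given dates",
}

if __name__ == "__main__":
    print(select_tools("Search flights from Oslo to Rome", CATALOG, k=2))
```

Only the names returned by `select_tools` would have their definitions injected into the prompt; the rest of the catalog stays external.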
Context distraction: long windows cause repetitive behaviour
- Large context windows are not inherently better: Gemini 2.5 Pro was observed to start repeating past actions once the context grew beyond roughly 100k tokens.
- The model fixates on previous steps rather than generating novel solutions, producing a loop of ineffective repetition.
- Fix: summarise previous context through a separate LLM before passing it to the working model, keeping only what is relevant.
- In vibe coding: start a fresh conversation for each new feature or unresolved bug; summarise what was achieved and carry only that forward.
- Offloading also applies here — externalise history and retrieve selectively rather than accumulating it inline.
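The summarise-and-carry-forward fix can be sketched as a compaction step that runs before each call to the working model. This is a sketch under stated assumptions: token counts are approximated by word counts, and `compact_history`, `estimate_tokens`, and the `summarise` callable (a stand-in for a separate LLM) are hypothetical.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def compact_history(
    messages: List[str],
    budget: int,
    summarise: Callable[[str], str],
    keep_last: int = 2,
) -> List[str]:
    """Replace older messages with one summary once the budget is exceeded.

    The most recent `keep_last` messages are kept verbatim.
    """
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_last:
        return list(messages)  # Nothing to compact.
    split = len(messages) - keep_last
    older, recent = messages[:split], messages[split:]
    summary = summarise("\n".join(older))
    return [f"Summary of earlier work: {summary}"] + recent


if __name__ == "__main__":
    history = [
        "tried fix A and it failed",
        "tried fix B and it failed",
        "user wants dark mode",
        "start on dark mode toggle",
    ]
    # Placeholder summariser; a real one would call a separate model.
    fake = lambda text: f"{len(text.splitlines())} earlier messages condensed"
    print(compact_history(history, budget=10, summarise=fake))
```

Starting a fresh conversation for a new feature corresponds to passing only the returned list forward, not the full history.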
Context offloading: a shared fix for confusion and distraction
- ChatGPT memory: saves user-specific context externally and retrieves it when relevant, avoiding persistent bloat.
- Claude's think tool: routes extended reasoning to an external scratch pad; only the distilled output re-enters the main context window.
- To-do / planning documents: a macro plan lives outside the conversation; the model checks it off task by task, starting a fresh context window for each, with only a compressed summary of prior progress carried forward.
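The planning-document pattern can be sketched as a small data structure that lives outside any conversation. All names here (`new_plan`, `next_task`, `fresh_context`, `complete_task`, `dump_plan`) are hypothetical, and the plan layout is an assumption made for illustration only.

```python
import json
from typing import List, Optional


def new_plan(goal: str, tasks: List[str]) -> dict:
    """Create a macro plan that is stored outside the conversation."""
    return {
        "goal": goal,
        "summary": "",  # Compressed record of prior progress.
        "tasks": [{"title": t, "done": False} for t in tasks],
    }


def next_task(plan: dict) -> Optional[dict]:
    """Return the first unfinished task, or None when the plan is complete."""
    for task in plan["tasks"]:
        if not task["done"]:
            return task
    return None


def fresh_context(plan: dict, task: dict) -> str:
    """Build the prompt for a new context window: goal, summary, current task."""
    summary = plan["summary"] or "Nothing completed yet."
    return (
        f"Goal: {plan['goal']}\n"
        f"Progress so far: {summary}\n"
        f"Current task: {task['title']}"
    )


def complete_task(plan: dict, task: dict, note: str) -> None:
    """Check the task off and fold a short note into the running summary."""
    task["done"] = True
    plan["summary"] = (plan["summary"] + " " + note).strip()


def dump_plan(plan: dict) -> str:
    """Serialise the plan so it can be saved between sessions."""
    return json.dumps(plan, indent=2)


if __name__ == "__main__":
    plan = new_plan("Ship login page", ["Build form", "Add validation"])
    task = next_task(plan)
    print(fresh_context(plan, task))
    complete_task(plan, task, "Form built.")
    print(fresh_context(plan, next_task(plan)))
```

Each call to `fresh_context` produces the whole input for a new window, so earlier task transcripts never re-enter the context; only the summary does.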