Building AI apps with three documents: spec, blueprint, and to-dos

Executive overview

Every AI-assisted app build starts with three documents — not code — that give the AI precise context at every stage. The spec defines what to build, the blueprint defines how to build it with copy-paste prompts, and the to-do list compensates for the AI's fading memory across conversations. Non-technical builders can create all three using AI itself, making the system accessible without coding knowledge. Choosing the right model at each stage and persisting through errors — then embedding lessons into a rules file — is what separates working apps from abandoned prototypes.

The specification: interview the AI to define what you're building

  • Use a structured interview prompt that forces the AI to ask one question at a time, back and forth 15–20 times.
  • Speak your answers using dictation to give as much context as possible.
  • If the AI asks something you can't answer, tell it to answer its own question given the constraints you've provided.
  • Resist the AI's suggestions to add features — keep the scope to the simplest version that solves your problem.
  • Use ChatGPT in auto mode for speed during the interview; switch to thinking mode at the very end to produce a thorough spec.
  • The finished spec is a step-by-step document you could hand to a developer.
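
The interview rules above can be captured in a reusable prompt. The following is a minimal sketch, not the author's actual prompt: the function name build_interview_prompt, the example idea, and the exact wording of each rule are assumptions made for illustration.

```python
def build_interview_prompt(idea: str, max_questions: int = 20) -> str:
    """Return a prompt asking the AI to interview the builder one question at a time.

    The wording is hypothetical; adapt it to your own project.
    """
    return (
        f"I want to build: {idea}\n"
        "Interview me so you can write a specification.\n"
        "Rules:\n"
        "1. Ask exactly one question per message, then wait for my answer.\n"
        f"2. Stop after at most {max_questions} questions.\n"
        "3. If I say I can't answer, answer the question yourself "
        "from the constraints I have already given.\n"
        "4. Do not propose extra features; keep to the simplest version "
        "that solves my problem.\n"
        "5. When the interview ends, write a step-by-step spec "
        "that I could hand to a developer.\n"
    )


# Example usage with a made-up app idea.
prompt = build_interview_prompt("a habit tracker for my team")
print(prompt)
```

Paste the printed text into the chat as the first message, then dictate your answers as the questions arrive.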

The blueprint: a phased build plan with embedded copy-paste prompts

  • Paste a detailed prompt plus your spec into a high-end model (Claude Opus 4.5 is recommended because of its long output window).
  • The blueprint breaks the build into phases, each phase into small steps, each step into a prompt you copy-paste into your coding AI.
  • Each step should be small enough for an AI to implement and safely test in isolation.
  • Specify real API calls and real data in tests: with mock data, tests can pass while the app still fails.
  • The coding AI should receive all three documents (spec, blueprint, to-dos) every time it starts a new task.
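
One way to picture the blueprint's shape (phases, steps, and one prompt per step) and the habit of sending all three documents with every task is sketched below. The class names Phase and Step, the function build_task_context, and the section labels are assumptions for illustration, not a format any tool requires.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    title: str
    prompt: str  # text you copy-paste into the coding AI


@dataclass
class Phase:
    name: str
    steps: list[Step] = field(default_factory=list)


def build_task_context(spec: str, blueprint: str, todos: str, task_prompt: str) -> str:
    """Join the three documents and the current step's prompt into one message."""
    return "\n\n".join(
        [
            "# Spec\n" + spec,
            "# Blueprint\n" + blueprint,
            "# To-dos\n" + todos,
            "# Current task\n" + task_prompt,
        ]
    )


# Example usage with placeholder document contents.
phases = [
    Phase("Setup", [Step("Create project", "Create a new project skeleton.")]),
]
context = build_task_context(
    spec="SPEC TEXT",
    blueprint="BLUEPRINT TEXT",
    todos="TODO TEXT",
    task_prompt=phases[0].steps[0].prompt,
)
print(context)
```

Keeping each step's prompt as a separate field makes it easy to send one step at a time or to join a whole phase's prompts when the model can handle larger chunks.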

The to-do list: a persistent roadmap that extends AI memory

  • AI context fades over long sessions and across new conversations — the to-do list counteracts this.
  • Generate the to-do list by appending a one-line instruction to the bottom of the blueprint prompt.
  • The output is a markdown checkbox list organised by phase and step.
  • Start a new conversation after each completed step to refresh the AI's memory; it reads the to-do list to see what's done and what's next.
  • Checked boxes let the AI orient itself without needing to remember the full build history.
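
To show how a checkbox list lets a fresh session orient itself, here is a small sketch that finds the first unchecked item and marks it done. The sample list, the step numbering, and the helper names next_open_item and mark_done are assumptions; the real list is whatever your blueprint prompt produces.

```python
import re
from typing import Optional

# Matches markdown task items such as "- [ ] 1.2 Add the schema" or "- [x] ...".
CHECKBOX = re.compile(r"^\s*[-*] \[([ xX])\] (.+)$")


def next_open_item(todo_markdown: str) -> Optional[str]:
    """Return the text of the first unchecked item, or None if all are checked."""
    for line in todo_markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            return match.group(2).strip()
    return None


def mark_done(todo_markdown: str, item: str) -> str:
    """Tick the first unchecked box whose text equals `item`."""
    return todo_markdown.replace(f"- [ ] {item}", f"- [x] {item}", 1)


# Hypothetical to-do list organised by phase and step.
todos = """## Phase 1: Setup
- [x] 1.1 Create the project skeleton
- [ ] 1.2 Add the database schema
## Phase 2: Features
- [ ] 2.1 Build the sign-up form
"""

current = next_open_item(todos)
updated = mark_done(todos, current)
print(current)
print(next_open_item(updated))
```

In practice the coding AI reads and ticks the list itself; the sketch only illustrates why checked boxes are enough to resume without the full build history.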

Choosing the right tool and model for code generation

  • Cursor is the recommended coding environment: its UI is better than the alternatives' and its output is production-ready.
  • Replit Agent and Google AI Studio are useful for prototypes but not production apps; AI Studio works only with Gemini models.
  • Current daily driver: Codex (GPT-5.2) via the Cursor plugin — frequently one-shots entire phases with no errors.
  • Claude Opus 4.5 excels at front-end work and visual taste.
  • Gemini 3 Pro is best for hard bugs and UI testing — it can click through the app, collect errors, and self-correct.
  • Model recommendations change monthly; A/B test regularly as new releases drop.

Handling errors and embedding lessons

  • Errors are inevitable; the skill is persisting through them, not avoiding them entirely.
  • Most errors stem from model knowledge cutoffs: the AI doesn't know about APIs or models released after its training date.
  • Fix: feed the AI official documentation for any new API or model version you need it to use.
  • Once an error is resolved, ask the AI to add a brief, information-dense lesson to your rules file so future sessions never repeat it.
  • Rules file location depends on the tool: AGENTS.md for Codex in Cursor, the .cursor/rules directory for native Cursor, CLAUDE.md for Claude Code.
  • Keep the rules file concise — it loads into context every session, so bloat wastes memory.
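
The "one brief lesson per resolved error, but keep the file small" habit can be sketched as a helper that appends a single-line lesson, skips duplicates, and refuses to grow past a size budget. The function name add_lesson, the 4000-character budget, and the file name rules.md are assumptions; substitute the rules file your tool actually reads.

```python
from pathlib import Path

MAX_RULES_CHARS = 4000  # hypothetical budget; choose what suits your tool


def add_lesson(rules_path: Path, lesson: str, max_chars: int = MAX_RULES_CHARS) -> bool:
    """Append a one-line lesson; return False if it is already recorded."""
    lesson = " ".join(lesson.split())  # collapse whitespace into a single line
    entry = f"- {lesson}\n"
    existing = rules_path.read_text(encoding="utf-8") if rules_path.exists() else ""
    if entry in existing:
        return False
    if existing and not existing.endswith("\n"):
        existing += "\n"  # keep the new entry on its own line
    if len(existing) + len(entry) > max_chars:
        raise ValueError("rules file would exceed its budget; trim old lessons first")
    rules_path.write_text(existing + entry, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Example usage with a made-up lesson.
    added = add_lesson(Path("rules.md"), "Use the v2 endpoint; v1 was removed.")
    print("added" if added else "already recorded")
```

The size check is a blunt stand-in for the point above: every character in the rules file is loaded into context on every session, so pruning matters as much as adding.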

Scaling up: bigger chunks as models improve

  • Four months ago you had to feed the blueprint one prompt at a time; current models can handle an entire phase in a single prompt.
  • A full phase can take 25–45 minutes to complete, but Codex typically gets it right first time, making the wait worthwhile.
  • As models improve, the right chunk size will keep growing — test regularly to find the new ceiling.
