How to prompt reasoning models for better outputs
Executive overview
Habits formed with standard chat models (ChatGPT, Claude) produce poor results when applied to reasoning models (O1, O3, DeepSeek). Reasoning models need dense, structured prompts — not quick back-and-forth exchanges.
Think email, not text. Think onboarding a new hire, not firing off a Slack message.
Reasoning models reward context and structure; the same prompt that works for GPT-4 will underperform in O1.
The six-section prompt template
- Goals — 1–2 sentences on what you want. Focus on the what, not the how. Avoid personas and style instructions; reasoning models have their own approach.
- Response format — Specify the output structure explicitly (list, pros/cons, diff). Without this, models default to headers and sub-headers every time. For coding: request a diff, not a full file rewrite.
- Warnings — Include recurring errors from past runs and any output constraints. Functions like a "remember to avoid X" block baked into the prompt rather than added as an afterthought.
- Context — The most important section. Provide more than feels necessary. Use voice memos or dictation for a brain dump; imperfect transcription is fine, models handle it. For OpenAI models, place context at the bottom of the prompt (cache-friendliness); for Claude, reportedly at the top.
- Separators — Use Markdown delimiters for O1/O3; XML tags for DeepSeek. These help the model distinguish sections.
- No chain-of-thought prompting — Drop "think step by step." It is already baked into reasoning models and adds no value.
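The template above can be sketched as a small prompt builder. This is a minimal illustration, not a reference implementation: `PromptSections`, `build_prompt`, the section names, the `##` heading level, and the XML tag names are all assumptions made for the example. Separators are chosen with the `style` argument, context placement with `context_last`, and the "no chain-of-thought" rule is honored simply by never adding such an instruction.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSections:
    """Hypothetical container for the template sections described above."""
    goals: str
    response_format: str
    context: str
    warnings: list[str] = field(default_factory=list)


def build_prompt(sections: PromptSections, style: str = "markdown",
                 context_last: bool = True) -> str:
    """Assemble the sections into one prompt string.

    style: "markdown" (suggested above for O1/O3) or "xml" (for DeepSeek).
    context_last: True puts context at the bottom (OpenAI), False at the top.
    """
    if style not in ("markdown", "xml"):
        raise ValueError(f"unknown separator style: {style}")

    warnings_text = "\n".join(f"- {w}" for w in sections.warnings)
    blocks = [
        ("goals", sections.goals),
        ("response_format", sections.response_format),
        ("warnings", warnings_text),
    ]
    context_block = ("context", sections.context)
    if context_last:
        blocks.append(context_block)
    else:
        blocks.insert(0, context_block)

    rendered = []
    for name, body in blocks:
        body = body.strip()
        if not body:
            continue  # skip empty sections, e.g. an empty warnings list
        if style == "xml":
            rendered.append(f"<{name}>\n{body}\n</{name}>")
        else:
            title = name.replace("_", " ").capitalize()
            rendered.append(f"## {title}\n{body}")
    return "\n\n".join(rendered)
```

Calling `build_prompt(sections)` yields Markdown headings with context last; `build_prompt(sections, style="xml", context_last=False)` yields XML-tagged sections with context first.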
Using AI to build the prompt
Constructing a full structured prompt manually is tedious. A faster workflow:
- Feed an image of the template into Claude and ask it to convert the image into a structured Markdown template
- Then ask it to fill in the template for your specific task
- For coding: include relevant code snippets, file names, console errors, and a voice memo context dump
- Feed the finished Markdown file into O1/O3 for the actual task
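For the coding case, the context dump can also be assembled with a few lines of code instead of by hand. The sketch below is one possible way to do it; `bundle_coding_context` and the heading names it emits are hypothetical, and the snippets, errors, and memo transcript are assumed to be already available as strings.

```python
def bundle_coding_context(files: dict[str, str], console_errors: str,
                          memo_transcript: str) -> str:
    """Render snippets, errors, and a dictated brain dump as one Markdown block.

    files maps a file name to the relevant snippet from that file.
    """
    fence = "`" * 3  # Markdown code fence, built here to keep this example readable
    parts = ["## Context", memo_transcript.strip()]
    for name, snippet in files.items():
        parts.append(f"### File: {name}\n{fence}\n{snippet.rstrip()}\n{fence}")
    if console_errors.strip():
        parts.append(
            f"### Console errors\n{fence}\n{console_errors.strip()}\n{fence}"
        )
    return "\n\n".join(parts) + "\n"
```

The returned string can be saved as a Markdown file or appended to the rest of the structured prompt before it goes to the reasoning model.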
Writing style transfer with self-critique
A technique for getting reasoning models to match a human writing style:
- Have O1 write on a topic; write on the same topic yourself separately
- Feed both into O1 and ask it to compare them, treating your version as the target
- O1 produces a diff: what differs and what needs to improve
- Embed that diff into the system prompt
- O1 then critiques its own output against the diff on each new run, iterating until output matches the target style
This applies the "LLM as a judge" pattern — normally used externally in eval pipelines — directly inside the system prompt.
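The steps above can be sketched as a short loop. This is an illustration under stated assumptions: `ModelFn` stands in for whatever client call you use (no real API is implied), the prompt wording and the `PASS` convention are invented for the example, and the loop is driven from outside the model with a round cap, whereas the source describes the model critiquing itself from the system prompt.

```python
from typing import Callable

# Hypothetical signature: takes (system_prompt, user_prompt), returns model text.
ModelFn = Callable[[str, str], str]


def make_style_diff(model: ModelFn, model_draft: str, human_draft: str) -> str:
    """Ask the model to compare two drafts, treating the human one as the target."""
    user = (
        "Compare the two drafts below. Treat DRAFT B as the target style.\n"
        "List what differs and what DRAFT A must change to match it.\n\n"
        f"DRAFT A (model):\n{model_draft}\n\nDRAFT B (human):\n{human_draft}"
    )
    return model("", user)


def write_in_style(model: ModelFn, style_diff: str, topic: str,
                   max_rounds: int = 3) -> str:
    """Draft, critique against the stored diff, and revise until it passes."""
    # The diff lives in the system prompt, so every call sees the same rubric.
    system = (
        "Match the target writing style. Known gaps between your default "
        f"style and the target:\n{style_diff}"
    )
    draft = model(system, f"Write about: {topic}")
    for _ in range(max_rounds):
        critique = model(
            system,
            "Critique this draft against the gaps listed in the system prompt. "
            f"Reply with only PASS if none remain.\n\n{draft}",
        )
        if critique.strip().upper() == "PASS":
            break
        draft = model(
            system,
            f"Revise the draft using this critique:\n{critique}\n\nDraft:\n{draft}",
        )
    return draft
```

Passing the model as a plain callable keeps the sketch independent of any vendor SDK and makes the loop easy to exercise with a stub.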
Three use cases where reasoning models excel
- Hard bugs — Build a structured prompt in Cursor/Windsurf, feed it to O1 with diff output requested, paste the result back. Effective for bugs that standard completions can't resolve.
- Implementation planning — Draft a plan in GPT-4 or Claude, refine it in O1/O3/Gemini Thinking, then pipe back to GPT-4 for clean Markdown output. Produces a structured plan suitable for feeding into Cursor for large-scale builds.
- Pros and cons analysis — Provide a 3–5 minute voice memo as context, ask for structured pros/cons. More thorough than standard models for consequential decisions.
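The implementation-planning flow (draft, refine, format) can be written as a three-stage chain. The sketch below assumes each stage is any function that maps a prompt string to model text; `Stage`, `plan_pipeline`, and the prompt wording are hypothetical, and which model sits behind each stage is left to the caller.

```python
from typing import Callable

# Hypothetical: each stage is any function that maps a prompt to model text.
Stage = Callable[[str], str]


def plan_pipeline(task: str, draft: Stage, refine: Stage, fmt: Stage) -> str:
    """Draft with a chat model, refine with a reasoning model, then format."""
    rough_plan = draft(f"Draft an implementation plan for:\n{task}")
    # The refine stage gets a structured prompt: goal, format, then context.
    refined_plan = refine(
        "## Goals\nImprove this implementation plan; find gaps and risky steps.\n\n"
        "## Response format\nA numbered list of steps.\n\n"
        f"## Context\nTask: {task}\n\nDraft plan:\n{rough_plan}"
    )
    return fmt(
        "Rewrite this plan as clean Markdown without changing its content:\n"
        f"{refined_plan}"
    )
```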