Anthropic's CPO Mike Krieger on AI timelines, product strategy, and the future of building

Executive overview

AI is writing 90% of code at Anthropic today — and the bottlenecks have shifted entirely. Engineering speed is no longer the constraint; decision-making, alignment, and merge infrastructure are.

Mike Krieger, CPO at Anthropic and co-founder of Instagram, reflects on a year inside the world's most AI-native product team. The core challenge is no longer building — it's knowing what to build, getting everyone aligned fast enough, and shipping coherently when output volume has exploded.

The biggest unlock isn't faster coding — it's embedding product people directly into model post-training, not just the product layer.

What's changed about AI capabilities

  • Claude Opus 4 was the first model to return genuinely novel angles on product strategy — not just affirmations
  • Dario Amodei's timeline predictions keep coming true; the SWE-bench coding benchmark score went from 50% to ~72% as predicted
  • Agentic behavior, persistent memory, and long-horizon tasks are converging faster than expected
  • The AI 2027 paper felt less like speculation and more like a product roadmap when read alongside internal strategy docs

How product development changes at 90% AI-written code

  • The functional unit of work has shifted: PMs and designers now prototype functional demos directly, before engineering is involved
  • New bottlenecks: upstream alignment on what to build, and downstream merge queue infrastructure
  • Anthropic had to fully re-architect its merge queue — volume of pull requests blew past all expectations
  • Review processes have changed: the Claude Code team uses a separate Claude instance to review PRs, then humans do acceptance testing rather than line-by-line review
  • The skills that remain hard: knowing how to structure a problem, composing the right question, thinking through backend/frontend architecture
  • Claude Code (written ~95% by Claude) accepts contributions from engineers with no TypeScript knowledge — they just talk to Claude and submit PRs

Where product teams still create irreplaceable value

  • Comprehensibility: the gap between what models can do and how most people actually use them is enormous
  • Strategy: deciding where to play, what to focus on, and how to position — can't be automated yet
  • Opening people's eyes: live demos still trigger "aha" moments that unlock adoption far beyond current usage
  • Human empathy and psychology — understanding real user needs — remains a deep, durable skill

Embedding product in model training — the core strategic insight

  • Product teams working alongside model researchers in post-training generate far more leverage than product teams working only on UX
  • Artifacts on Claude 4 was built this way: a Claude Skills team member (handling post-training) paired with product, not just prompting the model
  • The functional unit at Anthropic is now: be in the post-training conversation, then build, then feed results back
  • PMs who get it are already embedded in researcher conversations weeks before product reviews

MCP and the future of context

  • Mike's mental model for AI product utility: model intelligence × context/memory × applications and UI — all three must converge
  • MCP targets the middle layer: getting the right context into models reliably and repeatably
  • Before MCP, every integration was rebuilt from scratch; the protocol makes integrations composable and reusable across Claude, ChatGPT, and Gemini
  • Key insight: commoditising integrations benefits foundation model companies, since more integrations make the models more useful
  • Goal: expose every Claude.ai primitive (projects, artifacts, styles, conversations) through MCP so Claude itself can write back to them
  • Computer use is one approach; MCP-first is the preferred direction — everything becomes scriptable and composable
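To make the "scriptable and composable" idea concrete, here is a minimal sketch of what exposing one primitive as an MCP-style tool could look like. It uses only the Python standard library and mimics the JSON-RPC 2.0 message shapes that MCP defines for `tools/list` and `tools/call`. The tool name `add_project_note`, the in-memory `PROJECT_NOTES` store, and the `handle` function are hypothetical illustrations, not Anthropic's implementation or the official SDK.

```python
import json

# Hypothetical in-memory stand-in for a "project" primitive.
PROJECT_NOTES: dict[str, list[str]] = {}


def add_project_note(project_id: str, text: str) -> str:
    """Append a note to a project and return a human-readable confirmation."""
    notes = PROJECT_NOTES.setdefault(project_id, [])
    notes.append(text)
    return f"Saved note {len(notes)} to project {project_id}"


# Tool registry: each tool is described once and is reusable by any client
# that speaks the same protocol. The tool name and schema are assumptions.
TOOLS = {
    "add_project_note": {
        "description": "Write a note back to a project.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "project_id": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["project_id", "text"],
        },
        "handler": add_project_note,
    }
}


def handle(message: dict) -> dict:
    """Answer a single JSON-RPC 2.0 request shaped like MCP's tool methods."""
    msg_id = message.get("id")
    method = message.get("method")

    if method == "tools/list":
        # Advertise every registered tool without leaking the handler itself.
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": msg_id, "result": {"tools": tools}}

    if method == "tools/call":
        params = message.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            # -32602 is the JSON-RPC "invalid params" error code.
            return {"jsonrpc": "2.0", "id": msg_id,
                    "error": {"code": -32602, "message": "Unknown tool"}}
        text = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
        return {"jsonrpc": "2.0", "id": msg_id, "result": result}

    # -32601 is the JSON-RPC "method not found" error code.
    return {"jsonrpc": "2.0", "id": msg_id,
            "error": {"code": -32601, "message": "Method not found"}}


if __name__ == "__main__":
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "add_project_note",
            "arguments": {"project_id": "roadmap", "text": "Ship MCP-first"},
        },
    }
    print(json.dumps(handle(request), indent=2))
```

The point of the sketch is the shape, not the details: once a primitive is described by a name and an input schema, any model client can discover it through `tools/list` and invoke it through `tools/call`, instead of each integration being hand-built per application.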

Anthropic's competitive positioning vs. OpenAI

  • ChatGPT owns consumer mindshare; Anthropic owns developer and builder mindshare
  • Consumer hit products are "lightning in a bottle" — building strategy around chasing one is probably wrong
  • The stronger play: lean into the builder identity — engineers, founders, creators, tinkerers who want to work at the frontier
  • The Rick Rubin vibe-coding collaboration is a signal of that brand direction
  • Don't try to beat competitors at their own game; figure out what you can uniquely be

Where AI founders should build

  • Deep domain knowledge: understanding of a specific vertical (legal, biotech, healthcare) that can't be replicated quickly — Harvey as an example
  • Differentiated go-to-market: knowing not just which company to sell to, but exactly which person inside it
  • Novel form factors: new interfaces for AI that incumbents can't easily copy because users already have fixed expectations of existing products
  • Startup energy — existential urgency — remains a real, uncopiable advantage

Why Artifact, Krieger's news app, was shut down

  • Mobile web deterioration made the reading experience jarring — interstitial ads, signup walls — and ad-blocking felt ethically wrong
  • News is personal and didn't spread naturally; sharing felt contrived and attempts to fix it crossed ethical lines the team didn't want to cross
  • Fully remote founding team made it hard to navigate major strategic pivots — no equivalent to whiteboard moments over burritos at 11pm
  • Growth was 10 units of input for 1 unit of output; the energy wasn't there
  • Positive reception to the shutdown: other founders said it freed them to make similar calls earlier

Prompting advice from the CPO

  • Use "think hard" in Claude Code to trigger deeper reasoning
  • Ask Claude to roast or be brutal rather than asking what could be better — it forces more critical output
  • The Prompt Improver in Anthropic's console uses Claude agentically to generate and iterate on prompts — output often surprises even experienced users
  • Claude itself is a very good prompter of Claude

On metrics and what Claude asked Mike

Claude's questions to Mike via the podcast: how do you preserve user agency rather than creating dependency, and how do you measure a good two-message conversation vs. a 200-message one?

  • Sycophancy and conversation-prolonging are real risks if engagement metrics are overweighted
  • The right North Star: did Claude help you get work done, unlock creativity, and give you more space in your life for other things?
  • "Not everything meaningful shows up in metrics" — the quiet 3am conversation matters even if it doesn't move a dashboard
