Moving from opinion-based to evidence-guided product development

Executive overview

Most product teams believe they are data-driven, but they are not. The real pattern is opinion-based development: a confident idea gets built at scale, fails late, and wastes enormous resources. Google+ consumed the work of roughly a thousand people, and its shutdown was announced in 2018 (the consumer service closed in April 2019); Gmail's tabbed inbox, built with relentless evidence-gathering, reached 1.8 billion users.

The GIST framework (Goals, Ideas, Steps, Tasks) gives teams a structured way to balance human judgment with evidence at every stage of product development — from setting goals to managing individual work.

Signs your team is not evidence-guided

  • Goals are vague, numerous, or output-focused rather than outcome-focused
  • No user-facing metrics — only revenue or business KPIs
  • Heavy roadmapping effort, little experimentation, and even less learning
  • Engineers are delivery-focused and disengaged from users and outcomes

Goals: metrics trees and the North Star

  • North Star metric measures value created for users (e.g. WhatsApp: messages sent; Airbnb: nights booked)
  • Top KPI measures value captured by the business (revenue, profit)
  • Metrics trees break both metrics into sub-metrics, showing which levers move the needle
  • Teams can own sub-metrics, creating alignment and a sense of mission
  • Metrics trees reveal which team topologies make sense — structure follows goals, not org charts
  • OKRs become more powerful when populated from metrics trees, not from roadmap items
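
A metrics tree can be sketched as a small data structure. The Python below is a minimal illustration, not part of the original framework material: the Metric class, the walk and owned_by helpers, the team names and every sub-metric are hypothetical. Only the North Star example (WhatsApp: messages sent) comes from the text above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Metric:
    name: str
    owner: Optional[str] = None  # team that owns this metric, if any
    children: List["Metric"] = field(default_factory=list)


def walk(metric: Metric, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Metric]]:
    """Yield (path from root, metric) for every node, depth first."""
    here = path + (metric.name,)
    yield here, metric
    for child in metric.children:
        yield from walk(child, here)


def owned_by(root: Metric, team: str) -> List[str]:
    """Return each metric the team owns as a 'root > ... > metric' string."""
    return [" > ".join(path) for path, m in walk(root) if m.owner == team]


# Hypothetical tree: sub-metrics and team names are invented for illustration.
north_star = Metric("Messages sent", children=[
    Metric("Active senders", owner="Growth", children=[
        Metric("New user activation", owner="Onboarding"),
    ]),
    Metric("Messages per sender", owner="Messaging"),
])

print(owned_by(north_star, "Onboarding"))
# ['Messages sent > Active senders > New user activation']
```

Printing the full path from the North Star down to a team's sub-metric is one way to make the alignment visible: each team can see which lever it owns and what that lever feeds into.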

Ideas: ICE scoring and the confidence meter

  • ICE (Impact, Confidence, Ease) provides a consistent, transparent way to evaluate competing ideas
  • Impact is assessed against a clearly defined goal — without clear goals, impact scores are meaningless
  • Ease is the inverse of effort; folding Reach into Impact (vs. RICE) keeps the model simpler
  • The critical variable is Confidence: how well-evidenced is the impact estimate?
  • The confidence meter maps evidence quality onto a 0–10 scale:
    • 0–0.1: opinions only, such as self-conviction, pitch decks, thematic alignment (e.g. AI, blockchain), strategy fit
    • Slightly higher: peer review, back-of-envelope estimates, anecdotal data, a single competitor having the feature
    • Medium: surveys, deep competitive analysis, structured user research
    • High (red zone): actual tests — fake doors, smoke tests, usability studies, A/B experiments
  • Teams tend to assign high confidence to low-evidence ideas; the meter makes this visible
  • Use the confidence meter as a tool to say no — or to decide how much to invest before building
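
To make the effect of Confidence concrete, here is a minimal Python sketch of ICE ranking. It rests on assumptions the text above does not state: all three components are scored 0–10, and the score is their product. The Idea class, the rank helper, and the two example ideas with their numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Idea:
    name: str
    impact: float      # 0-10, estimated against a clearly defined goal
    confidence: float  # 0-10, read off the confidence meter
    ease: float        # 0-10, the inverse of effort

    def ice(self) -> float:
        # Assumption: multiply the three scores, so weak evidence
        # drags the whole score down however large the claimed impact.
        return self.impact * self.confidence * self.ease


def rank(ideas: List[Idea]) -> List[Idea]:
    """Return ideas sorted from highest to lowest ICE score."""
    return sorted(ideas, key=lambda idea: idea.ice(), reverse=True)


# Hypothetical ideas and scores, for illustration only.
ideas = [
    Idea("Big bet backed only by a pitch deck", impact=9, confidence=0.1, ease=3),
    Idea("Small fix backed by an A/B test", impact=4, confidence=8, ease=7),
]

for idea in rank(ideas):
    print(f"{idea.ice():.1f}  {idea.name}")
```

Under these assumptions the opinion-only big bet ranks far below the tested small fix, which is the visibility the confidence meter is meant to provide.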

Steps: validating ideas before committing to build

  • The AFTER model: Assessment → Fact-finding → Tests → Experiments → Release
  • Assessment (no build required): goal alignment check, ICE scoring, assumption mapping, stakeholder review
  • Fact-finding: data analysis, surveys, competitive research, user interviews, field observation
  • Fake tests (minimal build): fake door tests, smoke tests, Wizard of Oz, concierge tests, usability tests on mockups
    • Gmail's tabbed inbox was first validated by manually re-sorting 50 emails in a facade; no production code was written
  • Mid-level tests: early adopter programs, alphas, fish food (team dog-fooding), longitudinal user studies
  • Full tests: dog-fooding, betas, previews, labs
  • Experiments (controlled): A/B tests, multivariate tests — these require a control group
  • Release: staged rollout, percent launches, holdbacks — still an opportunity to learn
  • Key principle: start cheap, park bad ideas early, invest more only when evidence accumulates
  • Time-to-outcome, not time-to-production, is the right metric for speed
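
The "start cheap, park early" principle can be sketched as a loop over the AFTER stages. This Python is an illustration under stated assumptions, not the framework's own procedure: the Stage enum, the validate function, the step helper and the pass/fail results are all hypothetical, and each stage is reduced to one yes/no check.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Optional, Tuple


class Stage(IntEnum):
    # Ordered from cheapest to most expensive, following AFTER.
    ASSESSMENT = 1
    FACT_FINDING = 2
    TESTS = 3
    EXPERIMENTS = 4
    RELEASE = 5


def validate(checks: Dict[Stage, Callable[[], bool]]) -> Tuple[str, Optional[Stage]]:
    """Run checks in stage order and park the idea at the first failure."""
    for stage in Stage:  # enum members iterate in definition order
        check = checks.get(stage)
        if check is None:
            continue  # this stage is skipped for this idea
        if not check():
            return "parked", stage
    return "validated", None


calls: List[str] = []


def step(name: str, passed: bool) -> Callable[[], bool]:
    """Build a fake check that records when it runs (illustration only)."""
    def run() -> bool:
        calls.append(name)
        return passed
    return run


checks = {
    Stage.ASSESSMENT: step("ICE scoring", True),
    Stage.FACT_FINDING: step("user interviews", True),
    Stage.TESTS: step("fake door test", False),
    Stage.EXPERIMENTS: step("A/B test", True),
}

outcome, stage = validate(checks)
print(outcome, stage.name if stage else "-", calls)
# parked TESTS ['ICE scoring', 'user interviews', 'fake door test']
```

Because the fake door test fails, the more expensive A/B test never runs. That is the sense in which the idea fails earlier and more cheaply.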

The GIST board: connecting goals to daily work

  • Most teams operate in two disconnected worlds: planning (managers, PMs) and delivery (engineers in Jira)
  • PMs act as a fragile bridge — overloaded, unable to do research or discovery
  • Each team's GIST board shows up to four key results (goals), the current ideas with their ICE scores, and the next validation steps
  • Teams review it at least every two weeks: are we working on the right ideas? How are results tracking?
  • Steps on the board are learning milestones, not engineering milestones
  • Outcome roadmaps replace feature roadmaps: "by Q4 we want to reduce churn" not "by Q4 we launch X"
  • Once an idea reaches high confidence, switch to delivery and add it to the timeline
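
As a rough sketch, a GIST board can be modelled as a few records per team. Everything in this Python is an assumption made for illustration: the class and field names, the example ideas and scores, and in particular the confidence threshold of 7, since the text above does not say where "high confidence" begins. Only the limit of four key results and the churn example come from the text.

```python
from dataclasses import dataclass, field
from typing import List

MAX_KEY_RESULTS = 4  # "up to four key results" per the text above


@dataclass
class BoardIdea:
    name: str
    ice: float
    confidence: float  # 0-10, from the confidence meter
    next_step: str     # next validation step, i.e. a learning milestone


@dataclass
class GistBoard:
    team: str
    key_results: List[str] = field(default_factory=list)
    ideas: List[BoardIdea] = field(default_factory=list)

    def add_key_result(self, key_result: str) -> None:
        if len(self.key_results) >= MAX_KEY_RESULTS:
            raise ValueError(f"a board holds at most {MAX_KEY_RESULTS} key results")
        self.key_results.append(key_result)

    def ready_for_delivery(self, threshold: float = 7.0) -> List[str]:
        # Hypothetical cut-off: ideas at or above it move to the delivery timeline.
        return [idea.name for idea in self.ideas if idea.confidence >= threshold]


# Hypothetical board contents.
board = GistBoard(team="Retention")
board.add_key_result("Reduce churn by Q4")
board.ideas.append(BoardIdea("Cancel-flow survey", ice=120, confidence=3, next_step="fake door test"))
board.ideas.append(BoardIdea("Annual plan discount", ice=200, confidence=8, next_step="staged rollout"))

print(board.ready_for_delivery())
# ['Annual plan discount']
```

The two-week review described above would then amount to re-reading this structure: are the key results moving, are these still the right ideas, and which next step produces the most learning.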

Where to start

  • Fix the biggest problem first — don't try to transform everything at once
  • Unclear goals → start with metrics trees and North Star
  • Constant debates and shifting priorities → start with ICE and the confidence meter
  • Building too much, learning too little → start with the steps layer
  • Disengaged engineers → start with the GIST board and task layer
  • Evidence-guided methods are faster and more resource-efficient than opinion-based development, because bad ideas fail earlier and more cheaply
