A four-step framework for selecting and scaling AI use cases

Executive overview

Most AI initiatives fail not because the technology is wrong, but because teams skip problem definition and jump straight to building. The result: expensive systems nobody adopts.

Mike Schubert's four-step framework — identify the problem, ratify metrics, test and learn, scale — forces discipline before any dollar is spent on development. Metrics must be agreed before testing begins, not after. Test-and-learn is deliberately small and cheap, designed to surface weak use cases early rather than at the $10M mark.

The core insight: lead with the business problem, not the technology — and set your success criteria before you start, not after.

The four steps

  1. Identify the problem — Define what it is, who experiences it, and where in the workflow it manifests. Replay it back to stakeholders to confirm alignment. Iterate until agreement is solid.
  2. Ratify metrics — Agree on the KPIs that define success before any build begins. This educates stakeholders on realistic AI accuracy expectations and creates accountability.
  3. Test and learn — Spend as little time and money as possible proving feasibility. Use a small, representative data set or user group. A/B testing against a control group is the gold standard.
  4. Scale — Operationalise the model, drive adoption through change leadership, and instrument a real-time dashboard visible to all stakeholders.

Why metrics must come before testing

  • Without pre-agreed criteria, teams struggle to call a result good or bad — 65% accuracy feels fine until you realise you needed 90%.
  • Pre-set metrics start the negotiation with stakeholders on what success looks like before any money is spent.
  • They remove the ego problem: if a use case doesn't hit the threshold, the team can move on without it feeling like personal failure.
  • Finance will eventually ask what they got for the investment — metrics make that story tellable.
  • Baseline human error rates matter: if humans doing the task have a 7–10% error rate, that shapes how accurate the AI actually needs to be.
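
One way to make these points concrete is to write the agreed criteria down before testing and check results against them mechanically. The sketch below is illustrative only: the `SuccessCriteria` class and its field names are hypothetical, the 90% target echoes the example above, and the 7% human error rate is simply the lower end of the 7–10% range mentioned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriteria:
    """Thresholds agreed with stakeholders before testing begins (hypothetical)."""
    target_accuracy: float    # accuracy the business says it needs
    human_error_rate: float   # measured error rate of people doing the task today

    @property
    def human_baseline_accuracy(self) -> float:
        # Accuracy of the current manual process, derived from its error rate.
        return 1.0 - self.human_error_rate

    def evaluate(self, measured_accuracy: float) -> dict:
        """Compare a test result with the target and with the human baseline."""
        return {
            "meets_target": measured_accuracy >= self.target_accuracy,
            "at_or_above_human": measured_accuracy >= self.human_baseline_accuracy,
        }


# Assumed values for illustration, fixed before any test is run.
criteria = SuccessCriteria(target_accuracy=0.90, human_error_rate=0.07)
print(criteria.evaluate(0.65))
```

Because the thresholds are fixed up front, a 65% result is reported as a miss on both counts rather than debated after the fact.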

Test and learn in practice

  • Run in parallel where possible — a control group following the normal process and a test group using the model gives a genuine comparison.
  • The same A/B approach can continue into scale as metrics are collected.
  • Testing surfaces hidden process problems: many workflows exist only in people's heads, with no documented standard operating procedure. AI cannot reliably assist an undocumented process.
  • Sometimes the result of test and learn is the decision not to use AI — that is a success, not a failure.
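
As a rough sketch of the control-versus-test comparison described above, a two-proportion z-test can indicate whether a difference in error rates between the two groups is larger than chance would explain. This is one common statistical choice, not a method the framework prescribes; the function name and the counts in the usage lines are made up for illustration.

```python
import math


def two_proportion_z_test(control_errors, control_n, test_errors, test_n):
    """Two-sided z-test for a difference in error rates between two groups.

    Returns (difference in error rate, z statistic, p-value).
    """
    p_control = control_errors / control_n
    p_test = test_errors / test_n
    # Pooled error rate under the assumption that both groups perform the same.
    pooled = (control_errors + test_errors) / (control_n + test_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / test_n))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_control - p_test) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_control - p_test, z, p_value


# Hypothetical counts: 500 cases handled by each group.
diff, z, p_value = two_proportion_z_test(
    control_errors=40, control_n=500, test_errors=20, test_n=500
)
print(f"error-rate reduction={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

The significance level used to read the p-value is itself a metric, so it belongs with the criteria agreed in step 2 rather than being chosen once the results are in.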

A failure that was actually a success

  • Goal: straight-through processing for a specific workflow, targeting high automation accuracy.
  • Spent about six weeks (three two-week sprints) iterating on a small data set; accuracy reached 65%, then the low 70s, then stalled.
  • Could not move the needle further without breaking other constraints — decided to stop.
  • Outcome: the team was upskilled, the learning took less than one quarter, and the failure shaped a better approach to future problems.
  • Contrast: companies that spend two years and $10M to reach the same conclusion have a much harder story to tell.
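
The decision to stop in this example can be expressed as a simple rule agreed in advance: stop when the target is still unmet and the latest sprint added too little. The sketch below is an assumption-laden illustration. The `should_stop` function, the 2-point minimum gain, and the 90% target are hypothetical, and the per-sprint figures are stand-ins for "65%, then the low 70s, then stalled" rather than reported data.

```python
def should_stop(accuracy_by_sprint, target, min_gain=0.02):
    """Return True when the target is unmet and the last sprint gained less than min_gain."""
    if len(accuracy_by_sprint) < 2:
        return False  # not enough history to judge progress
    latest, previous = accuracy_by_sprint[-1], accuracy_by_sprint[-2]
    if latest >= target:
        return False  # target met, so there is nothing to abandon
    return (latest - previous) < min_gain


# Illustrative stand-in values, one per two-week sprint.
history = [0.65, 0.71, 0.72]
for sprint in range(1, len(history) + 1):
    decision = "stop" if should_stop(history[:sprint], target=0.90) else "continue"
    print(f"after sprint {sprint}: {decision}")
```

Setting the rule before the first sprint is what keeps the stop decision from feeling like a personal failure: the team is applying a threshold, not admitting defeat.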

The three elements of scale

  • Operationalise the technology — provision compute, put the machinery in place; this is the straightforward part.
  • Change leadership — the hardest and most variable element across organisations. AI amplifies normal change resistance because people fear job loss, so leaders need to assess where individuals sit on the change curve and meet them there.
  • Impact dashboard — near-real-time visibility into model performance for all stakeholders: business teams, technology teams, finance, legal, and risk/control functions monitoring bias and model drift.
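
A dashboard of this kind needs some underlying signal to display. As a minimal sketch, assuming each model decision can later be marked correct or incorrect, a rolling accuracy figure with an alert threshold is one simple option. The `RollingAccuracyMonitor` class, the window size, and the threshold are hypothetical, and this is a basic performance check rather than a full drift or bias detection method.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent `window` reviewed decisions."""

    def __init__(self, window, alert_below):
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off automatically
        self.alert_below = alert_below

    def record(self, correct):
        self.outcomes.append(bool(correct))

    @property
    def accuracy(self):
        if not self.outcomes:
            return None  # nothing recorded yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """True only once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy < self.alert_below


# Hypothetical settings: review the last 4 decisions, alert below 75% accuracy.
monitor = RollingAccuracyMonitor(window=4, alert_below=0.75)
for was_correct in [True, True, True, False, False]:
    monitor.record(was_correct)
    print(monitor.accuracy, monitor.needs_attention())
```

The alert threshold would normally come from the metrics ratified in step 2, so the dashboard reports against the same bar the stakeholders agreed to.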

Building a culture of AI exploration

  • Not all use cases should reach scale — the framework exists partly to weed out weaker ones efficiently.
  • High-value use cases (clear OpEx savings, revenue generation) are usually obvious; pursue them directly.
  • For the broader opportunity space, create a culture of exploration where teams closest to problems can run quick experiments.
  • Consider a dedicated innovation group not accountable for delivery, whose job is exploration and handoff when a winner is found.
  • Democratising experimentation across teams positions an organisation to capitalise on AI opportunities not yet visible from the top.

Advice for corporate AI executives

  • Ground every initiative in the basics: problem first, metrics second, test and learn, then scale.
  • Educating stakeholders is a core part of the job — large consulting firms are promising gains that do not match reality; closing that expectation gap is as important as the technology work itself.
  • The hype cycle means funding is easy now, but finance will come back in 12–18 months asking what value was delivered. The framework creates the evidence trail to answer that question.
