How to build and sell an AI startup that replaces professional work
Executive overview
Most founders waste time debating what people want. The answer is already visible: look at who people pay to do work today, and replace or assist that job with AI.
Jake Heller built CoCounsel, the first AI assistant for lawyers, sold it to Thomson Reuters for $650M, and distilled a three-part framework: pick the right job to target, build it with rigorous evaluations, and sell it as a service — not a SaaS seat.
The unfair advantage is evaluations: most teams stop at 60% accuracy and call it impossible; the ones who spend two weeks on a single prompt reach 97%.
Choosing what to build
- The question "what do people want?" just got easy: it's whatever people currently pay other humans to do.
- Three categories of AI opportunity:
  - Assistance: help professionals accomplish tasks faster (e.g. legal research, contract review)
  - Replacement: take over the job entirely (e.g. an AI-powered law firm)
  - Previously unthinkable: tasks no one would staff because the cost was absurd (e.g. reading every document in a corpus)
- Total addressable market has shifted from "number of software seats × $20/month" to "combined salaries of everyone currently doing this work" — a 100–1,000× step up.
- Good markets to target: roles already being outsourced, widespread pain across many companies, and areas where you can acquire real domain knowledge.
- Ignore competitors. Markets are large enough for multiple winners, and once you build, you'll find most competitors are far weaker than they look.
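The market-size shift described in the list above can be checked with a little arithmetic. This is only a sketch: the headcount and average salary are hypothetical placeholders, and the only figure taken from the text is the $20/month seat price.

```python
# Hypothetical inputs: only the $20/month seat price comes from the text above.
PROFESSIONALS = 100_000        # assumed number of people doing the job today
AVG_SALARY_USD = 80_000        # assumed average annual salary per person
SEAT_PRICE_PER_MONTH_USD = 20  # classic SaaS seat price


def seat_based_tam(seats: int, price_per_month: float) -> float:
    """Annual market if every professional buys one software seat."""
    return seats * price_per_month * 12


def salary_based_tam(workers: int, avg_salary: float) -> float:
    """Annual market if the product is priced against the work itself."""
    return workers * avg_salary


seat_tam = seat_based_tam(PROFESSIONALS, SEAT_PRICE_PER_MONTH_USD)
salary_tam = salary_based_tam(PROFESSIONALS, AVG_SALARY_USD)
multiple = salary_tam / seat_tam

print(f"Seat-based TAM:   ${seat_tam:,.0f}")
print(f"Salary-based TAM: ${salary_tam:,.0f}")
print(f"Step up:          {multiple:.0f}x")
```

With these assumed inputs the multiple lands inside the 100–1,000× range quoted above. Different salaries or seat prices will move it, so treat the output as an order-of-magnitude check rather than a forecast.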
How to build reliable AI
- Start with domain expertise: understand in granular detail what an expert actually does, step by step. Don't guess.
- Ask: how would the best person in this field approach this task with unlimited time and resources?
- Work backwards to discrete steps. Each step becomes either a prompt or a deterministic piece of code.
- If a task follows the same steps every time, build a simple workflow — no complex frameworks needed, just chained functions.
- If the approach must adapt to circumstances, move toward an agentic design — harder to make reliable, but sometimes necessary.
- Use deterministic code over tokens wherever possible; tokens are slow and expensive.
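A minimal sketch of the "chained functions" workflow described above, assuming a contract-review task. The step names, the `ClauseReview` type, and the sample contract are hypothetical, and `call_model` is a stand-in for whatever LLM client you use; it returns a canned answer here so the sketch runs offline.

```python
import re
from dataclasses import dataclass


@dataclass
class ClauseReview:
    clause: str
    has_dollar_amount: bool  # decided by deterministic code
    risk_label: str          # decided by a prompt


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; swap in a real client here."""
    return "high" if "indemnif" in prompt.lower() else "low"


def split_clauses(contract: str) -> list[str]:
    """Deterministic step: split on blank lines, no tokens spent."""
    return [c.strip() for c in re.split(r"\n\s*\n", contract) if c.strip()]


def has_dollar_amount(clause: str) -> bool:
    """Deterministic step: a regex is cheaper and faster than a prompt."""
    return re.search(r"\$\s?\d", clause) is not None


def classify_risk(clause: str) -> str:
    """Prompt step: a judgment call that is hard to express as code."""
    prompt = f"Label the risk of this clause as 'high' or 'low':\n{clause}"
    return call_model(prompt).strip().lower()


def review_contract(contract: str) -> list[ClauseReview]:
    """The workflow: the same steps, in the same order, every time."""
    return [
        ClauseReview(c, has_dollar_amount(c), classify_risk(c))
        for c in split_clauses(contract)
    ]


sample = (
    "Supplier shall indemnify Buyer for all losses.\n\n"
    "Buyer pays $5,000 within 30 days."
)
reviews = review_contract(sample)
for r in reviews:
    print(r.risk_label, r.has_dollar_amount, "|", r.clause)
```

Each function maps to one discrete step, so a failing step can be tested and fixed in isolation. An agentic design would replace the fixed order in `review_contract` with a loop in which the model chooses the next step.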
Getting to production quality with evaluations
- Most teams ship at 60–70% accuracy and fail in production. The fix is evaluations.
- Define what "good" looks like for each micro-task and for the overall output. This requires domain knowledge.
- Prefer objectively gradable answers (true/false, 0–7 scale) so evals can be automated.
- Build a test suite: start with a dozen cases, grow to 50, then 100. Use frameworks like promptfoo.
- Hold back a test set and don't tune against it — avoid overfitting prompts to your evals.
- Expect slow progress: 60% → 61% → ... → 97%. Teams that give up at 61% are leaving a working product on the table.
- Once prompts are stable, AI failures become predictable — you can add instructions and examples to close specific error patterns.
- After beta launch, every customer complaint is a new eval. Real users do unexpected things; capture and add those cases.
- New models ship regularly, so re-run your evals on each one. Results are sensitive to small changes: even a single-word prompt edit can move accuracy by a meaningful percentage.
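The evaluation loop above can be sketched in plain Python. This is a hand-rolled illustration, not the API of any eval framework: the `EvalCase` type, the `run_task` stub (standing in for the prompt under test), and the sample questions are all hypothetical. The task assumed here is a true/false judgment: "is this question about termination rights?"

```python
import random
from dataclasses import dataclass
from typing import Callable

Task = Callable[[str], bool]


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected: bool  # objectively gradable: true/false


def run_task(question: str) -> bool:
    """Hypothetical stand-in for the prompt under test."""
    return "terminate" in question.lower()


def split_cases(cases, holdout_fraction=0.25, seed=0):
    """Shuffle once with a fixed seed, then hold back a test set."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(task: Task, cases) -> float:
    """Fraction of cases where the output matches the expected answer."""
    return sum(task(c.question) == c.expected for c in cases) / len(cases)


def failures(task: Task, cases):
    """Cases the task gets wrong; these drive the next prompt revision."""
    return [c for c in cases if task(c.question) != c.expected]


cases = [
    EvalCase("Can either party terminate for convenience?", True),
    EvalCase("May the buyer terminate on 30 days' notice?", True),
    EvalCase("Does the lease terminate on default?", True),
    EvalCase("Can the supplier end the agreement early?", True),  # stub misses this
    EvalCase("Is the governing law New York?", False),
    EvalCase("Is there a cap on liability?", False),
    EvalCase("Are late fees charged on overdue invoices?", False),
    EvalCase("Is assignment allowed without consent?", False),
]

dev_cases, holdout_cases = split_cases(cases)
print(f"dev accuracy: {accuracy(run_task, dev_cases):.0%}")
for miss in failures(run_task, dev_cases):
    print("needs work:", miss.question)
# Score holdout_cases only once the prompt looks finished, so it stays untuned.
```

Tune prompts against `dev_cases` only. A customer complaint after launch becomes one more `EvalCase` appended to `cases`, and the same script can be re-run unchanged whenever a new model ships.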
Marketing and selling AI services
- Great product is the foundation. Word of mouth and inbound press are free marketing; a mediocre product makes all sales efforts expensive.
- Stop thinking in terms of SaaS seats. You may be selling a service — charge for the outcome, not access to the tool.
- Example: if a law firm charges $1,000 for a contract review, charge $500 for the same outcome. The price is set against the value delivered, not a monthly seat.
- Ask customers how they want to pay. Casetext, the company behind CoCounsel, found that customers preferred predictable annual per-seat pricing ($6,000/seat/year) over usage-based billing, even when usage-based billing might have cost them less.
- Trust gap: buyers are used to managing humans. Bridge it with head-to-head pilots — let them run your AI alongside their existing service and compare.
- The sale does not end at contract signature. Many startups are currently sitting on pilot revenue that will not convert.
- Invest in onboarding and adoption: forward-deployed engineers, in-app walkthroughs, hands-on training. The product is not just the interface — it includes every human touchpoint.
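The pricing trade-off in the list above comes down to simple arithmetic. The sketch below reuses the two figures from the text ($6,000/seat/year and $500 per contract review) and, as an assumption made purely for illustration, applies both to the same hypothetical customer. The team size and review volumes are invented.

```python
SEAT_PRICE_PER_YEAR_USD = 6_000  # per-seat figure from the text
PRICE_PER_REVIEW_USD = 500       # outcome price from the contract-review example


def annual_cost_per_seat(seats: int) -> int:
    """Predictable bill: fixed regardless of how much work gets done."""
    return seats * SEAT_PRICE_PER_YEAR_USD


def annual_cost_per_outcome(reviews_per_year: int) -> int:
    """Usage-based bill: scales with the number of reviews delivered."""
    return reviews_per_year * PRICE_PER_REVIEW_USD


def break_even_reviews(seats: int) -> float:
    """Annual review volume at which both models cost the same."""
    return annual_cost_per_seat(seats) / PRICE_PER_REVIEW_USD


seats = 5  # hypothetical team size
print(f"break-even: {break_even_reviews(seats):.0f} reviews/year")
for reviews in (20, 60, 120):  # hypothetical annual volumes
    per_seat = annual_cost_per_seat(seats)
    per_outcome = annual_cost_per_outcome(reviews)
    print(f"{reviews:>3} reviews: per-seat ${per_seat:,} vs per-outcome ${per_outcome:,}")
```

Below the break-even volume the usage-based bill is lower, yet a customer may still choose the per-seat plan because the annual figure is known in advance. That is the preference the text describes.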
What to focus on at each stage
- At every stage — seed through Series C — the answer is the same: build a product that achieves product-market fit.
- Most other priorities (culture, hiring, fundraising) are means to that end, not ends in themselves. Founders who lose sight of this stall.
- Once you have a genuinely great product, recruiting, sales, and marketing become much easier problems.
On defensibility and pricing
- Defensibility in AI comes from execution depth, not proprietary models: thousands of fine-tuned prompt decisions, data integrations, edge-case handling, and model selection choices that took years to accumulate.
- For jobs AI is replacing: price initially at what the human service costs. Competition will drive prices down over time — which is good for society.
- For entirely new AI-enabled capabilities: start from value delivered, take a percentage of that, and negotiate with customers.