How Casetext built a $650M AI legal product over ten years

Executive overview

The founding insight was a gap: legal research at 2 a.m. was as hard as ordering takeout was easy. Casetext spent a decade trying to close that gap, surviving false peaks and slow growth before GPT-4 unlocked a product that compressed days of legal work into minutes.

Domain expertise plus the right model at the right moment is the repeatable formula for AI product-market fit.

The ten-year overnight success

  • Founded 2013 as a crowdsourced case law library — Wikipedia meets Reddit for legal annotation
  • Early AI tools used rudimentary NLP; useful, but not transformative
  • First enterprise clients paid $50K–$150K each, creating false confidence that the business would scale
  • Ran out of early-adopter law firms, pivoted to smaller firms, and saw thousands of sign-ups before that channel also plateaued
  • Maintained product velocity and customer feedback loops throughout every slow period
  • Accessed GPT-4 roughly six months before public release — the inflection point

What product-market fit actually feels like

  • Revenue added in millions per month, not thousands
  • Enterprise clients who previously needed 9–18 months to decide signed in under a month
  • Customers visibly lit up in ways no previous product had produced
  • On track to triple annual revenue within a year of launching CoCounsel

The killer demo pattern

  • Users upload thousands or millions of documents to CoCounsel, which answers complex legal questions about them in minutes
  • Enron demo: the AI flagged sarcastic emails ("Kenneth Lay, an honest man") as potential fraud evidence
  • End of demo: 4–5 days of legal work done in 10–15 minutes
  • This "golden demo" pattern — immediate visible value, immediate large-dollar commitment — is repeating across top LLM startups

Hard engineering problems between model and product

  • Serving thousands of simultaneous users reviewing millions of pages requires significant infrastructure work
  • Hallucination prevention: ensuring AI does not misstate document contents
  • Regression testing: verifying that prompt or code changes don't introduce wrong outputs
  • Most of this tooling had to be built internally; none of it came free with the model
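
To make the hallucination and regression bullets concrete, here is a minimal sketch of one possible check. It is not Casetext's tooling, which the source does not describe. It assumes that a model answer puts direct quotations in double quotes, and it flags any quotation that does not appear in the source document. The function names, the GOLDEN_CASES data, and the stored answers are all hypothetical; in a real pipeline the answers would be regenerated by the model after each prompt or code change.

```python
import re

def extract_quotes(answer: str) -> list[str]:
    """Return every double-quoted span found in a model answer."""
    return re.findall(r'"([^"]+)"', answer)

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wrapping does not break matching."""
    return " ".join(text.lower().split())

def unsupported_quotes(answer: str, source: str) -> list[str]:
    """List quotes in the answer that do not appear in the source text."""
    haystack = normalize(source)
    return [q for q in extract_quotes(answer) if normalize(q) not in haystack]

# Hypothetical golden cases: each records a source, an answer, and the
# quotes the check is expected to flag for that pair.
GOLDEN_CASES = [
    {
        "source": "The lease term is five years.\nRent is due on the first of each month.",
        "answer": 'The contract states that "rent is due on the first of each month".',
        "expect_unsupported": [],
    },
    {
        "source": "The lease term is five years.",
        "answer": 'The contract says "the lease term is ten years".',
        "expect_unsupported": ["the lease term is ten years"],
    },
]

def run_regression(cases: list[dict]) -> list[int]:
    """Return the indexes of cases whose result differs from the recorded expectation."""
    failures = []
    for i, case in enumerate(cases):
        found = unsupported_quotes(case["answer"], case["source"])
        if found != case["expect_unsupported"]:
            failures.append(i)
    return failures

if __name__ == "__main__":
    print("failing cases:", run_regression(GOLDEN_CASES))
```

Exact substring matching only catches misquoted text. It says nothing about paraphrased claims, which would need a stronger check than this sketch provides.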

The LLM opportunity stack

  • Base layer: foundation models (GPT-4 and successors) — analogous to cloud computing
  • Middleware layer: tools like LangChain that help developers use models effectively
  • Application layer: domain-specific products built on top
  • Each layer offers a genuine, large business opportunity
  • Accurate, high-scale deployment is still the differentiating hard problem at the application layer

Real-world impact

  • California Innocence Project faces a four-year backlog reading case files for wrongly imprisoned people
  • LLM-based review could cut that backlog from four years to one month
  • Legal work sits at the intersection of scale, complexity, and life-or-death stakes — exactly where AI leverage is highest
