How Mercor became the fastest-growing company in history by matching experts with AI labs

Executive overview

AI labs can improve models only as fast as they can measure what "good" looks like — and that measurement requires domain experts, not crowdsourced generalists. Mercor identified this bottleneck early and built a marketplace that sources, vets, and deploys high-caliber professionals to write evals and post-training data for the world's top AI labs.

The result: revenue run rate grew from $1M to $400M in 16 months, with zero customer churn and net retention above 1,600%.

The core insight: if the model is the product, the eval is the PRD — and every capability gap needs a human expert to define what success looks like.

What evals are and why they matter

  • An eval is a systematic way to measure whether a model achieves a desired capability — equivalent to a product requirement document for AI.
  • Researchers run dozens of experiments daily against eval sets; once a good eval exists, reinforcement learning can climb it rapidly.
  • Evals serve double duty: internal benchmark for researchers and external sales collateral demonstrating model capabilities.
  • The shift from academic evals (Olympiad math, GPQA) to practical evals (legal redlining, investment analysis) is now underway.
  • For enterprises: the prerequisite to deploying AI across your value chain is defining how you measure success in that value chain.
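
To make the idea of an eval concrete, here is a minimal sketch of an eval set scored as a pass rate. It is an illustration only, not how any lab or Mercor implements evals: `EvalCase`, `run_eval`, the toy prompts, and the two stand-in "models" are all hypothetical names invented for this example, and the keyword checks stand in for expert-written success criteria.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One eval item: a prompt plus a check that defines success (hypothetical)."""
    prompt: str
    passes: Callable[[str], bool]


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose model output passes its check."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for case in cases if case.passes(model(case.prompt)))
    return passed / len(cases)


# Toy eval set; real checks would encode expert judgment, not keyword matches.
CASES = [
    EvalCase("What is 2 + 2?", lambda out: "4" in out),
    EvalCase("Name the capital of France.", lambda out: "paris" in out.lower()),
]


def baseline_model(prompt: str) -> str:
    """Stand-in model that never answers."""
    return "I don't know."


def candidate_model(prompt: str) -> str:
    """Stand-in model that gets one of the two toy cases right."""
    answers = {
        "What is 2 + 2?": "2 + 2 is 4.",
        "Name the capital of France.": "Berlin",
    }
    return answers.get(prompt, "I don't know.")


baseline_score = run_eval(baseline_model, CASES)
candidate_score = run_eval(candidate_model, CASES)
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f}")
```

Because the eval set is fixed, two experiments can be compared on a single number, which is what lets researchers run many experiments a day against it.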

What Mercor's experts actually do

  • Experts create rubrics — structured criteria for what correct model output looks like — used both to score outputs and to reward model trajectories in RL.
  • Example: a lawyer defines what an accurate contract redline looks like, point by point; the model then optimizes against that rubric.
  • This is post-training work, not pre-training data collection: it shapes reasoning and prioritization, not raw knowledge ingestion.
  • Data types range from supervised fine-tuning (input/output pairs) to RLHF preference labels to verifier rubrics for reinforcement learning from AI feedback (RLAIF).
  • The market is bounded by the set of things humans can do that models cannot — which remains large across legal, medical, creative, and engineering domains.
  • Tens of thousands of experts are active at any time; hundreds of thousands more broadly.
  • Median pay is $95/hour; specialized experts earn up to $500/hour.
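
The rubric idea described above can be sketched as a weighted checklist that turns a model output into a score between 0 and 1; the same number could serve as a reward signal in RL. This is a simplified illustration under stated assumptions: `Criterion`, `rubric_score`, the weights, and the contract-redline criteria are hypothetical, and the keyword checks are crude placeholders for what would in practice be an expert's or a grader model's judgment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One rubric line: what the expert expects, and how much it matters."""
    description: str
    weight: float
    is_met: Callable[[str], bool]  # placeholder for an expert or grader judgment


def rubric_score(output: str, rubric: list[Criterion]) -> float:
    """Return the weighted share of rubric criteria the output satisfies."""
    total = sum(c.weight for c in rubric)
    if total <= 0:
        raise ValueError("rubric must have positive total weight")
    earned = sum(c.weight for c in rubric if c.is_met(output))
    return earned / total


# Hypothetical rubric for a contract-redline task (illustrative only).
REDLINE_RUBRIC = [
    Criterion("Flags the liability clause", 3.0,
              lambda o: "liability" in o.lower()),
    Criterion("Proposes a concrete cap", 2.0,
              lambda o: "cap of" in o.lower()),
    Criterion("Explains the reason for the change", 1.0,
              lambda o: "because" in o.lower()),
]

weak = "The contract looks fine."
partial = "Section 7 leaves liability unlimited."
strong = ("Section 7 leaves liability unlimited; propose a cap of 12 months "
          "of fees because the exposure is otherwise unbounded.")

for name, text in [("weak", weak), ("partial", partial), ("strong", strong)]:
    print(name, round(rubric_score(text, REDLINE_RUBRIC), 2))
```

Weighting the criteria lets the expert express prioritization (catching the liability issue matters more than phrasing the rationale), which is the kind of judgment the section says post-training is meant to shape.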

The competitive landscape

  • Early data labeling (Scale, Surge) focused on volume of low-to-medium-skilled workers writing grammatically correct sentences for early LLMs.
  • The market shifted: labs now need sourcing and vetting of professionals — software engineers, bankers, doctors, lawyers, screenwriters.
  • Mercor's differentiation: a labor marketplace that treats experts with dignity, pays well, and cuts out intermediaries, with economics similar to Uber or DoorDash but for high-skilled talent.
  • The top 10% of experts on a project typically drive the majority of model improvement — identifying that cohort reliably is Mercor's core proprietary advantage.

How Mercor found product-market fit

  • Founded January 2023 at age 19; bootstrapped to $1M revenue run rate before dropping out of college.
  • Early signal: a meeting with the xAI co-founding team while still in college revealed that labs urgently needed quality-focused expert talent.
  • Inflection point: an incumbent crowdsourcing company used Mercor's platform to hire 1,000+ people and then failed to pay them — exposing incumbents' poor talent experience.
  • Response: started working directly with labs in May 2024; grew to $400M run rate in 16 months.
  • Key lesson: product-market fit looks like surprising ease of selling to the marginal customer, not grinding through objections.
  • Remain stubborn on thesis (how the world will change), but stay open on form (exactly how your company fits into it).
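
For a sense of scale, the run-rate figures in this section ($1M growing to $400M over 16 months) can be converted into an implied month-over-month growth rate. The sketch below assumes smooth compounding between those two points, which is a simplification; the helper name `implied_monthly_growth` is made up for this example.

```python
def implied_monthly_growth(start: float, end: float, months: int) -> float:
    """Constant monthly growth rate that takes `start` to `end` in `months`."""
    if start <= 0 or end <= 0 or months <= 0:
        raise ValueError("start, end, and months must all be positive")
    return (end / start) ** (1 / months) - 1


# Figures from the text: $1M run rate to $400M run rate in 16 months.
rate = implied_monthly_growth(1_000_000, 400_000_000, 16)
print(f"Implied compounded growth: {rate:.1%} per month")
```

Under that assumption the run rate grew by roughly 45% every month, a 400x increase overall.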

Building Mercor's culture

Three core values:

  1. Can-do attitude: set audacious revenue targets and build the trajectory around them (e.g., committed to a $50M run rate while at $1.5M and hit it within two weeks of the deadline).
  2. High standards: extreme patience in hiring the first 10 people; half were former founders or executives; initial talent density shapes the entire org as it scales.
  3. Intensity: output-oriented, not hours-oriented; people who care deeply tend to work hard without mandated schedules.
  • In retrospect, scaling from 10 to 100 people could have moved faster once demand clearly exceeded capacity.
  • Sales and marketing headcount was near zero for the first 18 months; growth came entirely from word of mouth and customer obsession.

The future of evals and labor markets

  • Evals are evergreen: as long as there are capabilities to build, experts will be needed to define success criteria for them.
  • The entire economy may become an "RL environment machine" — every workflow converted into a measurable context for model improvement.
  • Predictions of superintelligence within three years are likely overstated; a 10-year road of post-training improvements is more realistic.
  • Post-training is far more data-efficient than scaling pre-training; the frontier will be pushed by the quality of evals, not the volume of compute.

Skills and jobs that will last

  • Roles in industries with elastic demand — where 10x productivity creates 10x more work, not 10x fewer jobs — are best positioned.
  • Software development is the most elastic: demand for features is effectively unlimited.
  • Product management, operations, consulting, and creative work also fall in this category.
  • Accounting and similar fixed-volume domains are more exposed.
  • The differentiating skill across all fields: the ability to leverage AI to amplify existing domain expertise.
  • Assessment is shifting from "can you do the task?" to "what can you build in an hour using all available AI tools?"

Hiring advice for founders

  • Wait for genuine pull before stepping on the gas: if selling to the marginal customer is extremely hard, you haven't found the market yet.
  • First 10 hires: be patient and disciplined; this talent density compounds across the org.
  • Once demand clearly exceeds capacity: prioritize speed over selectivity.
  • Focus on people's strengths rather than improving weaknesses, a lesson Mercor CEO Brendan Foody attributes partly to being dyslexic and navigating around his own limitations.
