How to measure and improve developer productivity with Nicole Forsgren

Executive overview

Most teams want to move faster but can't agree on what that means. The root problem — identified in 80% of engagements — is failing to define the goal before starting any measurement.

Two research-backed frameworks cut through this: DORA measures software delivery performance with four metrics; SPACE provides a structured way to pick balanced metrics for any complex creative work. Together they form a complete approach to measuring, diagnosing, and improving developer productivity.

Speed and stability are not a trade-off — they move in the same direction. Shipping smaller changes more often produces more stable systems.

The DORA four metrics

Lead time for changes: time from code commit to production
Deployment frequency: how often code is deployed
Mean time to restore (MTTR): how long recovery takes after an incident
Change fail rate: percentage of changes that cause incidents requiring intervention
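
The four definitions above can be sketched as a small calculation. This is a minimal illustration, not an official DORA implementation: the `Deployment` record shape and the `dora_metrics` function are hypothetical, and time to restore is approximated as the gap between a failed deployment and recovery, on the assumption that no separate incident log is available.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    caused_incident: bool = False           # did it require intervention?
    restored_at: Optional[datetime] = None  # when service recovered, if it failed


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict[str, Optional[float]]:
    """Summarize the four metrics over a window of `window_days` days."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.caused_incident]
    # Assumption: recovery time is measured from the failed deployment itself.
    restores = [_hours(d.deployed_at, d.restored_at)
                for d in failures if d.restored_at is not None]
    return {
        "lead_time_hours": mean(_hours(d.committed_at, d.deployed_at) for d in deployments),
        "deployments_per_day": len(deployments) / window_days,
        "mean_time_to_restore_hours": mean(restores) if restores else None,
        "change_fail_rate_pct": 100 * len(failures) / len(deployments),
    }
```

In practice the commit and deploy timestamps would come from version control and the deployment pipeline, and incident times from an incident tracker where one exists.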

Elite performer benchmarks (2019, remain largely consistent):

Deployment frequency: on-demand
Lead time: less than a day
MTTR: less than an hour
Change fail rate: 0–15%
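
As a rough sketch, the elite thresholds above can be encoded as a gap check. The function name `elite_gaps` and the boolean `deploys_on_demand` flag are assumptions made for illustration; "on-demand" has no single numeric definition here. The function reports gaps rather than assigning a tier, since progress matters more than the label.

```python
# Thresholds taken from the elite benchmarks listed above.
ELITE_MAX_LEAD_TIME_HOURS = 24.0   # "less than a day"
ELITE_MAX_RESTORE_HOURS = 1.0      # "less than an hour"
ELITE_MAX_CHANGE_FAIL_PCT = 15.0   # "0-15%"


def elite_gaps(lead_time_hours: float,
               restore_hours: float,
               change_fail_pct: float,
               deploys_on_demand: bool) -> list[str]:
    """Return the benchmarks a team does not yet meet (empty list = all met)."""
    gaps = []
    if not deploys_on_demand:
        gaps.append("deployment frequency: not yet on-demand")
    if lead_time_hours >= ELITE_MAX_LEAD_TIME_HOURS:
        gaps.append("lead time: a day or more")
    if restore_hours >= ELITE_MAX_RESTORE_HOURS:
        gaps.append("time to restore: an hour or more")
    if change_fail_pct > ELITE_MAX_CHANGE_FAIL_PCT:
        gaps.append("change fail rate: above 15%")
    return gaps


# Hypothetical team: fast recovery, but slow lead time and a high failure rate.
for gap in elite_gaps(48.0, 0.5, 20.0, deploys_on_demand=True):
    print(gap)
```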

Why speed and stability move together

Frequent deployments force smaller batch sizes, which shrinks blast radius
Smaller changes are easier to debug and faster to roll back
Infrequent deployments cause large batch merges — more merge conflicts, harder root-cause analysis
The old ITIL assumption that two-week change freezes improve stability is wrong; the freezes cause instability
Company size has no statistically significant effect on these metrics — the benchmarks apply to startups and enterprises equally

The SPACE framework

SPACE provides five dimensions for picking balanced productivity metrics. Use at least three dimensions at once to avoid gaming any single signal.

S — Satisfaction and well-being: survey-based; correlates strongly with all other dimensions; early warning signal when things degrade
P — Performance: outcome of a process (e.g. reliability, change fail rate)
A — Activity: count-based metrics instrumented from systems (PRs, commits, deployments)
C — Communication and collaboration: how people and systems interact; includes code searchability, meeting load
E — Efficiency and flow: time through the system; number of hops a ticket takes; uninterrupted coding time

DORA is an implementation of SPACE focused on the outer loop (code commit to production). SPACE is the meta-framework for choosing metrics when you want to measure any specific capability.
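
One way to apply the "at least three dimensions" rule is to tag each candidate metric with its SPACE dimension and count the distinct dimensions covered. The `Space` enum and the helper names below are hypothetical, chosen only for this sketch.

```python
from enum import Enum


class Space(Enum):
    SATISFACTION = "S"
    PERFORMANCE = "P"
    ACTIVITY = "A"
    COMMUNICATION = "C"
    EFFICIENCY = "E"


def dimensions_covered(metrics: dict[str, Space]) -> set[Space]:
    """Distinct SPACE dimensions represented in a metric selection."""
    return set(metrics.values())


def is_balanced(metrics: dict[str, Space], minimum: int = 3) -> bool:
    """True when the selection spans at least `minimum` dimensions."""
    return len(dimensions_covered(metrics)) >= minimum


# Example selection using metrics named in the dimension list above.
selection = {
    "change fail rate": Space.PERFORMANCE,
    "deployments": Space.ACTIVITY,
    "uninterrupted coding time": Space.EFFICIENCY,
}
print(is_balanced(selection))
```

Which dimension a metric belongs to is a judgment call; the mapping is supplied by whoever builds the selection and is not inferred by the code.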

Picking the right metrics

Never pick only activity metrics (commits, PRs, lines of code) — they create perverse incentives
Always pair metrics in tension: e.g. alert frequency (activity) vs. uninterrupted coding time (efficiency)
Add a satisfaction metric to surface what instrumentation cannot show
Run surveys periodically (every few months), not continuously
System data and people data are complements — each reveals blind spots the other misses
Version control data will never reveal code that isn't being committed; only surveys will
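
Two of the rules above are mechanical enough to check automatically: a selection made only of activity metrics, and a selection with no satisfaction metric. The sketch below assumes each metric is labeled with a SPACE letter by hand; `lint_metric_selection` is a hypothetical helper. It does not try to judge whether metrics are paired in tension, which still needs human review.

```python
VALID_DIMENSIONS = {"S", "P", "A", "C", "E"}


def lint_metric_selection(metrics: dict[str, str]) -> list[str]:
    """Return warnings for a mapping of metric name -> SPACE letter."""
    unknown = {name: dim for name, dim in metrics.items() if dim not in VALID_DIMENSIONS}
    if unknown:
        raise ValueError(f"unknown SPACE dimension for: {sorted(unknown)}")

    dims = set(metrics.values())
    warnings = []
    if dims == {"A"}:
        warnings.append("only activity metrics selected: risk of perverse incentives")
    if "S" not in dims:
        warnings.append("no satisfaction metric: instrumentation blind spots go unseen")
    return warnings


# Example: the activity-vs-efficiency pairing from the list above, plus a survey.
print(lint_metric_selection({
    "alert frequency": "A",
    "uninterrupted coding time": "E",
    "developer satisfaction survey": "S",
}))
```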

Common pitfalls

Starting without a clearly written problem statement — teams run for months then discover misalignment
Pursuing measurement only top-down or only bottom-up; both directions are needed
Picking all metrics from one SPACE dimension (usually activity)
Treating DORA benchmarks as the goal rather than a starting point — progress matters more than tier
Assuming AI tools mean you need fewer engineers: AI shifts time from writing to reviewing code, freeing cognitive capacity for harder problems, not halving headcount

How to start from nothing

Write down the specific problem or goal — be precise about whether it's friction, culture, tooling, or something else
Check whether any existing data or signal is already available
If no data exists, interview a handful of developers: what are the biggest barriers to their productivity?
Use the DORA quick check at dora.dev to benchmark current state and identify likely constraints for your industry and performance profile
Use SPACE to select balanced metrics once you know what you want to improve
Start heavy on survey data; shift toward instrumented system data as measurement matures

The four-box framework for measurement hypotheses

A structured approach to connecting a hypothesis to the data used to test it.

Draw two rows and two columns. Label the top row "words" and the bottom row "data". In each row, an arrow points from the left box to the right box.

Top-left: the concept you believe is the cause (e.g. "customer satisfaction")
Top-right: the outcome you expect (e.g. "return customers")
Bottom-left: the data proxy for the cause (e.g. CSAT score, NPS)
Bottom-right: the data proxy for the outcome (e.g. return visits, referral links)
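
The four boxes can also be written down as a small data structure, which makes the hypothesis and its proxies explicit before any data is pulled. This is only a sketch: the `FourBox` class and its method names are hypothetical, and the example values are the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourBox:
    cause_concept: str                # top-left: the "words" cause
    outcome_concept: str              # top-right: the "words" outcome
    cause_proxies: tuple[str, ...]    # bottom-left: data standing in for the cause
    outcome_proxies: tuple[str, ...]  # bottom-right: data standing in for the outcome

    def hypothesis(self) -> str:
        """Top row: the relationship stated in words."""
        return f"{self.cause_concept} -> {self.outcome_concept}"

    def comparisons(self) -> list[str]:
        """Bottom row: every cause proxy paired with every outcome proxy."""
        return [f"{c} vs. {o}" for c in self.cause_proxies for o in self.outcome_proxies]


box = FourBox(
    cause_concept="customer satisfaction",
    outcome_concept="return customers",
    cause_proxies=("CSAT score", "NPS"),
    outcome_proxies=("return visits", "referral links"),
)
print(box.hypothesis())
for pair in box.comparisons():
    print(pair)
```

Keeping the rows as separate fields mirrors the framework's main use: a disagreement about `hypothesis()` is a different conversation from a disagreement about which proxies belong in `comparisons()`.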

Why it works:

Forces clarity on what is actually being measured before touching data
Separates disagreements about the hypothesis (top row) from disagreements about the data (bottom row)
Prevents spurious correlations from being mistaken for causal relationships
Advanced mode: start from available data and work upward to articulate what relationship the data actually represents, then validate with interviews

AI and developer productivity

AI coding tools shift how developers spend their time: roughly 50% of developer time now goes to reviewing AI-generated code rather than writing from scratch
Specific tasks (e.g. building an HTTP server) can be completed ~50% faster, but this is not the right productivity frame
The real benefit is cognitive offload that frees capacity for harder, more novel work
AI is changing the friction model, cognitive load expectations, and reliance patterns — existing metrics need revisiting
SPACE will likely need a new dimension around trust and over-reliance on AI-generated output
Impact on novices versus experts is an open and important research question
