Engineering Metrics Mastery: DORA Benchmarks, Cycle Time, and What Elite Teams Actually Measure

Elite teams deploy on demand with <1 hour recovery; low performers take 1-6 months per release

Overview

Most engineering teams measure something. Few measure the right things. The 2024 DORA report surveyed 39,000+ professionals and found that elite performers recover from failures 2,293x faster than low performers. LinearB analyzed 8.1 million PRs across 4,800 teams and found cycle times ranging from under 25 hours (elite) to over 161 hours (bottom quartile). The gap is not talent. It is what you measure and how you act on it.

Key Findings

The Four DORA Metrics: 2024 Benchmarks

The DORA 2024 Accelerate State of DevOps Report (Google Cloud) clusters teams into four performance tiers. About 19% qualify as elite, 22% as high, 35% as medium, and 25% as low.

| Metric | Elite | High | Medium | Low |
|---|---|---|---|---|
| Deployment Frequency | On demand (multiple per day) | Daily to weekly | Weekly to monthly | Monthly to every 6 months |
| Lead Time for Changes | <1 day | 1 day to 1 week | 1 week to 1 month | 1-6 months |
| Change Failure Rate | ~5% | ~20% | ~10% | ~40% |
| Time to Restore (MTTR) | <1 hour | <1 day | <1 day | 1 week to 1 month |

(The medium tier's lower change failure rate relative to the high tier is as reported in DORA 2024, not a transcription error.)

The multiplier gaps are staggering: elite teams are ~127x faster in lead time, deploy ~8x more frequently, and restore service ~2,293x faster than low performers (DORA 2024).
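In practice, all four metrics fall out of two event streams: deployments and incidents. Below is a minimal Python sketch of the arithmetic, assuming your CI/CD system and incident tracker export timestamped records; the record fields and the 30-day window are illustrative, not any report's official methodology.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median

@dataclass
class Deployment:
    first_commit: datetime  # earliest commit in the change
    deployed: datetime      # when the change reached production
    failed: bool            # True if the deployment degraded service

@dataclass
class Incident:
    opened: datetime
    restored: datetime

def hours(a: datetime, b: datetime) -> float:
    return (b - a).total_seconds() / 3600

def dora_metrics(deploys: list[Deployment], incidents: list[Incident],
                 window_days: int = 30) -> dict[str, float]:
    """Compute the four DORA metrics over one reporting window.

    Assumes both lists are non-empty and already filtered to the window.
    """
    return {
        # Deployment frequency: production deploys per day
        "deploys_per_day": len(deploys) / window_days,
        # Lead time for changes: median hours from first commit to production
        "lead_time_hours": median(hours(d.first_commit, d.deployed) for d in deploys),
        # Change failure rate: share of deploys that degraded service
        "change_failure_rate": sum(d.failed for d in deploys) / len(deploys),
        # Time to restore: median hours from incident open to recovery
        "mttr_hours": median(hours(i.opened, i.restored) for i in incidents),
    }
```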

Cycle Time Breakdown: Where Hours Actually Go

LinearB’s 2026 Engineering Benchmarks Report dissects cycle time (first commit to production) into four phases. The data comes from 8.1 million PRs across 4,800 teams in 42 countries.

| Phase | Elite | Good | Fair | Needs Focus |
|---|---|---|---|---|
| Total Cycle Time | <25 hours | 25-72 hours | 73-161 hours | >161 hours |
| Coding Time | <54 minutes | 54 minutes to 4 hours | 5-23 hours | >23 hours |
| PR Pickup Time | <1 hour | 1-4 hours | 5-16 hours | >16 hours |
| Review Time | <3 hours | 3-14 hours | 15-24 hours | >24 hours |
| Deploy Time | <16 hours | 16-106 hours | 107-277 hours | >277 hours |

The single biggest lever: PR size. Elite teams average <100 lines changed per PR. Bottom-quartile teams average >228 lines. Small PRs drive faster reviews, lower rework, and lower change failure rates (LinearB 2026).
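The phase boundaries translate directly into timestamp arithmetic. Here is a sketch, assuming per-PR timestamps are available from your Git host; the PullRequest fields are illustrative stand-ins, and the thresholds mirror the elite column of the table above plus the <100-line PR size finding.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PullRequest:
    first_commit: datetime   # start of coding time
    opened: datetime         # PR opened: pickup clock starts
    first_review: datetime   # first review posted: review clock starts
    merged: datetime         # merge: deploy clock starts
    deployed: datetime       # change reached production
    lines_changed: int

# Elite-tier ceilings in hours, per the table above (54 min = 0.9 h).
ELITE = {"coding": 0.9, "pickup": 1.0, "review": 3.0, "deploy": 16.0}

def hours(a: datetime, b: datetime) -> float:
    return (b - a).total_seconds() / 3600

def phases(pr: PullRequest) -> dict[str, float]:
    """Split one PR's cycle time into the four phases (in hours)."""
    return {
        "coding": hours(pr.first_commit, pr.opened),
        "pickup": hours(pr.opened, pr.first_review),
        "review": hours(pr.first_review, pr.merged),
        "deploy": hours(pr.merged, pr.deployed),
    }

def misses(pr: PullRequest) -> list[str]:
    """Name the phases that miss elite ceilings, plus oversized PRs."""
    out = [p for p, v in phases(pr).items() if v > ELITE[p]]
    if pr.lines_changed > 100:  # elite teams average <100 lines per PR
        out.append("pr_size")
    return out
```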

What Engineering Leaders Actually Measure

The LeadDev 2024 Engineering Team Performance Report surveyed 978 engineering leaders on which metrics they find most useful:

  • #1 Cycle time — top-ranked productivity metric by usefulness
  • #2 Lead time for changes — directly maps to DORA
  • #3 Deployment frequency — speed of delivery to production
  • Avoided metrics: Lines of code (70% avoid), story points (47% avoid), PRs closed (42% avoid) — all considered gameable

42% of leaders rate DORA metrics as “very effective” or “effective,” up from 34% the prior year. But one-third of leaders never report performance metrics at all (LeadDev 2024).

Quality and Efficiency Indicators Beyond DORA

| Metric | Elite | Good | Fair | Needs Focus |
|---|---|---|---|---|
| PR Size (lines changed) | <100 | 100-155 | 156-228 | >228 |
| Rework Rate | <1% | 1-4% | 5-17% | >17% |
| Merge Frequency (merges per dev per week) | >2 | 1.5-2 | 1-1.5 | <1 |
| Change Failure Rate | <1% | 1-4% | 5-17% | >17% |

Teams with PRs under 200 lines achieve <2% rework rates. Teams with PRs over 793 lines see rework above 7% and change failure rates above 17% (LinearB 2026).
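That relationship is easy to check against your own data. A small sketch follows, assuming each PR record carries its size and a precomputed rework fraction (the share of the diff that modified recently written code; LinearB's exact definition may differ). Field names are illustrative.

```python
from statistics import mean

# Size buckets matching the benchmark table above.
BUCKETS = [(100, "<100"), (156, "100-155"), (229, "156-228"), (float("inf"), ">228")]

def rework_by_pr_size(prs: list[dict]) -> dict[str, float | None]:
    """Average rework rate per PR-size bucket."""
    grouped: dict[str, list[float]] = {label: [] for _, label in BUCKETS}
    for pr in prs:
        for upper, label in BUCKETS:
            if pr["lines_changed"] < upper:
                grouped[label].append(pr["rework_rate"])
                break
    return {label: mean(v) if v else None for label, v in grouped.items()}
```

If the averages climb sharply past the 228-line boundary, the benchmark pattern holds for your team too.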

The Metrics That Matter for Business

LeadDev found that engineering leaders rank business impact metrics alongside operational ones:

  • Code quality tracked by 54% of leaders
  • Customer complaints tracked by 45%
  • Team autonomy tracked by 34%
  • Top strategic metrics: user satisfaction, user growth, ROI, meeting SLOs (LeadDev 2024)

Why teams measure at all: 45% to increase velocity (up from 37% the prior year), 18% to identify bottlenecks, and 16% for accountability (LeadDev 2024).

The AI Impact on Metrics (2025 Data)

The DORA 2025 report found that 95% of developers now use AI coding tools, but organizational results are mixed:

  • Individual output: 21% more tasks completed, 98% more PRs merged
  • At the organizational level: review time up 91%, PR sizes up 154%, bug rates up 9% (a cohort-comparison sketch follows this list)
  • 75% of organizations see no delivery improvement at the team level (DORA 2025)
  • AI adopters saw median cycle time drop 24%, from 16.7 to 12.7 hours, in controlled environments (Jellyfish 2025)
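A minimal sketch of that cohort comparison, assuming per-PR records from before and after an AI tooling rollout; the field names are illustrative, and this is a naive before/after diff, not a controlled experiment.

```python
from statistics import median

def adoption_deltas(before: list[dict], after: list[dict]) -> dict[str, float]:
    """Percent change in team-level medians after AI tooling rollout.

    Records carry illustrative fields: 'cycle_hours', 'review_hours',
    'lines_changed'. Rising review_hours or lines_changed alongside a
    flat cycle_hours is the org-level trap described above.
    """
    def med(rows: list[dict], key: str) -> float:
        return median(r[key] for r in rows)

    return {
        key: 100.0 * (med(after, key) - med(before, key)) / med(before, key)
        for key in ("cycle_hours", "review_hours", "lines_changed")
    }
```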

What This Means for Your Team

  • Start with cycle time, not all 21 metrics. Cycle time is the #1 metric by usefulness among 978 engineering leaders (LeadDev 2024). It captures coding, review, and deploy in one number. Get this under 72 hours before adding complexity.
  • Benchmark against real data, not gut feel. Elite cycle time is <25 hours. If yours is >161 hours, you are in the bottom quartile of 4,800 teams (LinearB 2026). Know your tier before setting targets.
  • Fix PR size first — it unlocks everything else. Teams under 100 lines per PR achieve elite-tier performance across cycle time, review time, and rework rate. No tooling investment compensates for 793-line PRs.
  • Stop measuring lines of code and story points. 70% and 47% of engineering leaders respectively avoid these metrics because they incentivize output over outcomes (LeadDev 2024). Measure flow, not volume.
  • Watch for the AI metrics trap. AI tools boost individual output by 21-98%, but inflate review time by 91% and PR size by 154% at the org level (DORA 2025). Pair AI adoption with strict process discipline or your metrics will mislead you.

Sources

  • DORA 2024 Accelerate State of DevOps Report (Google Cloud, 39,000+ cumulative respondents)
  • DORA 2025 State of AI-Assisted Software Development Report (~5,000 developers)
  • LinearB 2026 Engineering Benchmarks Report (8.1M PRs, 4,800 teams, 42 countries)
  • LinearB Community Benchmarks (3.7M PRs, 2,022 organizations)
  • LeadDev 2024 Engineering Team Performance Report (978 engineering leaders)
  • Jellyfish 2025 AI Metrics in Review
  • DX.ai State of Developer Experience 2024 (2,100+ developers and leaders)