Engineering Velocity: A Complete Guide to DORA Metrics and Measuring Team Performance

Elena Rodriguez
12 min read

How do you measure the productivity of a software team? If you answer "lines of code" or "hours worked," you're optimizing for the wrong things. These metrics encourage gaming, don't correlate with value delivery, and destroy morale.

High-performing technology organizations focus on outcomes, not output. The industry standard for measuring software delivery performance is the DORA (DevOps Research and Assessment) framework, backed by years of research across thousands of organizations.

Why Traditional Metrics Fail

Let's examine why common productivity metrics are counterproductive:

Metric                    Why It's Used       Why It Fails
──────                    ─────────────       ────────────
Lines of Code             Easy to measure     Incentivizes verbose code; 100 lines could be 10
Hours Worked              Visible effort      Burnout; presence ≠ productivity
Story Points Completed    Tracks velocity     Point inflation; estimates ≠ value
Bugs Fixed                Shows activity      Incentivizes creating bugs to fix
PRs Merged                Shows output        Encourages small, trivial PRs

The fundamental problem: these metrics measure activity, not impact.

Introduction to DORA Metrics

DORA metrics emerged from six years of research by DevOps Research and Assessment (now part of Google Cloud). The research surveyed over 32,000 professionals worldwide and identified four key metrics that predict software delivery performance AND organizational performance.

┌─────────────────────────────────────────────────────────────┐
│                   DORA Framework                             │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Throughput Metrics          Stability Metrics             │
│   (Speed of Delivery)         (Quality of Delivery)         │
│                                                              │
│   ┌─────────────────┐         ┌─────────────────┐           │
│   │   Deployment    │         │  Change Failure │           │
│   │   Frequency     │         │     Rate        │           │
│   │                 │         │                 │           │
│   │ "How often do   │         │ "What % of      │           │
│   │  we deploy?"    │         │  changes fail?" │           │
│   └─────────────────┘         └─────────────────┘           │
│                                                              │
│   ┌─────────────────┐         ┌─────────────────┐           │
│   │   Lead Time     │         │   Mean Time to  │           │
│   │   for Changes   │         │   Restore (MTTR)│           │
│   │                 │         │                 │           │
│   │ "How long from  │         │ "How fast do    │           │
│   │  code to prod?" │         │  we recover?"   │           │
│   └─────────────────┘         └─────────────────┘           │
│                                                              │
└─────────────────────────────────────────────────────────────┘

The Four DORA Metrics Explained

1. Deployment Frequency

Question: How often does your organization successfully release to production?

What it measures: The speed at which your team can deliver value to customers.

Performance Level    Frequency
─────────────────    ─────────
🏆 Elite             On-demand (multiple times per day)
🥇 High              Between once per day and once per week
🥈 Medium            Between once per week and once per month
🥉 Low               Between once per month and once every six months

Why it matters:

  • Smaller, more frequent batches reduce risk
  • Faster feedback loops
  • Earlier value delivery to customers
  • Easier troubleshooting when issues arise

How to improve:

  • Automate testing and deployment
  • Break features into smaller increments
  • Implement feature flags for safe releases
  • Reduce batch sizes
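
To make the measurement concrete, here is a minimal sketch (Python, with a hypothetical deploy log) that counts successful production deploys over a trailing 30-day window and maps the result onto the performance bands from the table above. The thresholds are simplified; DORA's Elite band technically means on-demand deploys.

from datetime import datetime, timedelta

# Hypothetical deploy log: one timestamp per successful production deploy.
deploys = [
    datetime(2024, 5, 1, 10, 30),
    datetime(2024, 5, 1, 16, 5),
    datetime(2024, 5, 3, 9, 12),
    # ...
]

def deployment_frequency(deploys, window_days=30):
    """Average successful deploys per day over the trailing window."""
    cutoff = max(deploys) - timedelta(days=window_days)
    recent = [d for d in deploys if d >= cutoff]
    return len(recent) / window_days

def performance_band(per_day):
    """Rough mapping onto the DORA bands in the table above."""
    if per_day >= 1:        # at least daily (Elite is really "on demand")
        return "Elite"
    if per_day >= 1 / 7:    # at least weekly
        return "High"
    if per_day >= 1 / 30:   # at least monthly
        return "Medium"
    return "Low"

freq = deployment_frequency(deploys)
print(f"{freq:.2f} deploys/day -> {performance_band(freq)}")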

2. Lead Time for Changes

Question: How long does it take for a commit to go from code to production?

What it measures: The efficiency of your entire software delivery pipeline.

Performance Level    Lead Time
─────────────────    ─────────
🏆 Elite             Less than one hour
🥇 High              Between one day and one week
🥈 Medium            Between one week and one month
🥉 Low               Between one month and six months

Why it matters:

  • Low lead time indicates a healthy, automated pipeline
  • Faster response to customer needs
  • Quick bug fixes and security patches
  • Higher developer satisfaction

Components of Lead Time:

Lead Time for Changes

Code ──► Review ──► Build ──► Test ──► Deploy ──► Production
  │        │         │         │         │
  └────────┴─────────┴─────────┴─────────┘
           Total Lead Time

Typical breakdown:
┌──────────────────────────────────────────────────────────┐
│ Code Review       │████████████│ 40%  ← Often bottleneck │
│ Build/Compile     │███│ 10%                              │
│ Automated Testing │██████│ 20%                           │
│ Manual QA         │██████│ 20%  ← Eliminate if possible  │
│ Deployment        │███│ 10%                              │
└──────────────────────────────────────────────────────────┘

How to improve:

  • Automate everything possible
  • Parallelize test suites
  • Implement trunk-based development
  • Reduce PR review wait times
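
A minimal sketch of the calculation itself, assuming you can pair each commit timestamp with the production deploy that shipped it (the sample data is made up). The median, or a high percentile, is usually more informative than the mean, which a few stalled PRs can skew.

from datetime import datetime
from statistics import median

# Hypothetical (commit_time, deployed_to_prod_time) pairs for recent changes.
changes = [
    (datetime(2024, 5, 1, 9, 0),  datetime(2024, 5, 1, 13, 30)),
    (datetime(2024, 5, 2, 11, 0), datetime(2024, 5, 2, 12, 10)),
    (datetime(2024, 5, 3, 15, 0), datetime(2024, 5, 6, 10, 0)),  # stalled in review
]

lead_times_hours = [
    (deployed - committed).total_seconds() / 3600
    for committed, deployed in changes
]

print(f"Median lead time: {median(lead_times_hours):.1f} h")
print(f"Slowest change:   {max(lead_times_hours):.1f} h")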

3. Change Failure Rate (CFR)

Question: What percentage of deployments cause a failure in production?

What it measures: The quality of your changes and the robustness of your testing.

Performance Level    Failure Rate
─────────────────    ────────────
🏆 Elite             0-5%
🥇 High              5-10%
🥈 Medium            10-15%
🥉 Low               16-30%+

What counts as a failure:

  • Production incidents requiring remediation
  • Rollbacks
  • Hotfixes
  • Failed deployments
  • Degraded service requiring intervention

Why it matters:

  • Speed means nothing if you're breaking things constantly
  • Customer trust and satisfaction
  • Team morale and on-call burden
  • Regulatory compliance

How to improve:

  • Comprehensive automated testing
  • Feature flags for canary releases
  • Better pre-production environments
  • Post-incident reviews and learning
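
Once your team has agreed on what counts as a failure (the list above), the calculation is a simple ratio: failed deployments divided by total deployments. A sketch with made-up deploy records:

# Hypothetical deploy records; "failed" means the change later needed a
# rollback, hotfix, or incident response (per your own failure definition).
deploys = [
    {"id": "d-101", "failed": False},
    {"id": "d-102", "failed": True},   # rolled back after an alert
    {"id": "d-103", "failed": False},
    {"id": "d-104", "failed": False},
]

failures = sum(1 for d in deploys if d["failed"])
cfr = failures / len(deploys) * 100
print(f"Change Failure Rate: {cfr:.1f}%")   # 25.0% in this toy example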

4. Mean Time to Restore (MTTR)

Question: How long does it take to restore service when a failure occurs?

What it measures: Your team's ability to detect, diagnose, and recover from incidents.

Performance Level    Time to Restore
─────────────────    ───────────────
🏆 Elite             Less than one hour
🥇 High              Less than one day
🥈 Medium            Between one day and one week
🥉 Low               More than one week

Why it matters:

  • Failure is inevitable; recovery speed is what matters
  • Minimizes customer impact
  • Reduces stress on teams
  • Shows operational maturity

MTTR Breakdown:

Incident Timeline

Incident ──► Detection ──► Triage ──► Fix ──► Deploy ──► Verify
Starts         │            │         │        │          │
               └────────────┴─────────┴────────┴──────────┘
                            Total MTTR

Key Components:
• Mean Time to Detect (MTTD)  ← Monitoring quality
• Mean Time to Acknowledge    ← On-call processes
• Mean Time to Diagnose       ← Observability, documentation
• Mean Time to Fix            ← System complexity
• Mean Time to Deploy         ← Pipeline speed
• Mean Time to Verify         ← Testing confidence

How to improve:

  • Invest in observability (monitoring, logging, tracing)
  • Create runbooks and incident playbooks
  • Practice incident response (game days)
  • Implement easy rollback mechanisms
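
If your incident tracker captures the timestamps in the breakdown above, MTTR and its components reduce to simple arithmetic. A sketch with hypothetical incident records:

from datetime import datetime
from statistics import mean

# Hypothetical incidents with the timestamps an incident tracker records.
incidents = [
    {"started":  datetime(2024, 5, 2, 14, 0),
     "detected": datetime(2024, 5, 2, 14, 8),
     "restored": datetime(2024, 5, 2, 14, 50)},
    {"started":  datetime(2024, 5, 9, 3, 15),
     "detected": datetime(2024, 5, 9, 3, 40),
     "restored": datetime(2024, 5, 9, 5, 5)},
]

def minutes(start, end):
    return (end - start).total_seconds() / 60

mttd = mean(minutes(i["started"], i["detected"]) for i in incidents)  # detection
mttr = mean(minutes(i["started"], i["restored"]) for i in incidents)  # full restore
print(f"MTTD: {mttd:.0f} min   MTTR: {mttr:.0f} min")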

DORA Performance Benchmarks

Based on the 2023 State of DevOps Report:

                        Elite      High       Medium     Low
                        ─────      ────       ──────     ───
Deployment Frequency    Multiple   Daily to   Weekly to  Monthly to
                        per day    weekly     monthly    semi-annual

Lead Time for Changes   < 1 hour   1 day to   1 week to  1-6 months
                                   1 week     1 month

Change Failure Rate     0-5%       5-10%      10-15%     16-30%

Mean Time to Restore    < 1 hour   < 1 day    1 day to   > 1 week
                                              1 week

Key insight: Elite performers are NOT trading speed for stability. They deploy more frequently AND have lower failure rates.

Implementing Metrics Without Toxicity

Metrics can be weaponized. If you punish teams for high Change Failure Rates, they will stop deploying. If you reward Deployment Frequency, they will deploy empty commits. Here's how to implement metrics constructively:

Principle 1: Measure Teams, Not Individuals

Software is a team sport. Individual metrics create:

  • Competition instead of collaboration
  • Incentives to game the system
  • Blame culture

Instead: Measure at the team or product level. Celebrate team improvements.

Principle 2: Trends Over Absolutes

A single snapshot tells you nothing. Focus on:

  • Are we better than we were last month?
  • Is the trend improving?
  • What changed to cause improvement or decline?

Good Dashboard Example:

Lead Time Trend (Last 6 Months)
───────────────────────────────
12h │
10h │ ■
 8h │ ■ ■
 6h │     ■ ■
 4h │         ■ ■ ■            ← Improving!
 2h │
    └───────────────────────
      J   F   M   A   M   J

Principle 3: Context Matters

Not all teams are the same:

  • A platform team rewriting a core database engine should have a lower deployment frequency than a frontend team tweaking UI text
  • A greenfield project will have different metrics than legacy maintenance
  • Different domains have different risk profiles

Instead: Compare teams to their own history, not to each other.

Principle 4: Focus on Removing Friction

Use metrics as a compass, not a report card:

If This Metric is Poor       Investigate These Areas
──────────────────────       ───────────────────────
Low Deployment Frequency     Manual processes, fear of change, large batch sizes
High Lead Time               Slow builds, review bottlenecks, manual testing
High Change Failure Rate     Inadequate testing, poor environments, rushed changes
High MTTR                    Poor observability, missing runbooks, complex systems

Anti-Patterns to Avoid

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

Anti-Pattern                      What Happens                            Better Approach
────────────                      ────────────                            ───────────────
Targeting Deployment Frequency    Empty deploys, PRs split artificially   Focus on reducing batch size
Punishing CFR                     Teams stop deploying, hide incidents    Blameless post-mortems
Gamifying metrics                 Competition, gaming, burnout            Team-level improvement focus
Public shaming                    Fear, hiding problems                   Private team discussions

Building a DORA Dashboard

Here's what a healthy DORA dashboard looks like:

┌─────────────────────────────────────────────────────────────┐
│                  Engineering Metrics Dashboard               │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Team: Platform Engineering      Period: Last 30 Days       │
│                                                              │
│  ┌─────────────────┐  ┌─────────────────┐                   │
│  │ Deployment Freq │  │ Lead Time       │                   │
│  │     23/month    │  │    4.2 hours    │                   │
│  │     ↑ 15%       │  │     ↓ 22%       │                   │
│  │   [■■■■□] High  │  │   [■■■■■] Elite │                   │
│  └─────────────────┘  └─────────────────┘                   │
│                                                              │
│  ┌─────────────────┐  ┌─────────────────┐                   │
│  │ Change Failure  │  │ MTTR            │                   │
│  │      8.7%       │  │    45 minutes   │                   │
│  │     ↓ 3%        │  │     ↓ 15%       │                   │
│  │   [■■■■□] High  │  │   [■■■■■] Elite │                   │
│  └─────────────────┘  └─────────────────┘                   │
│                                                              │
│  Trend: Overall improvement. CFR spike in Week 2            │
│  due to database migration - addressed in post-mortem.      │
│                                                              │
└─────────────────────────────────────────────────────────────┘
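
Under the hood, a dashboard like this is mostly a period-over-period comparison. A sketch, assuming you already compute each metric per 30-day window (the field names and values are made up):

# Hypothetical metric values for the current and previous 30-day windows.
current  = {"deploy_freq": 23, "lead_time_h": 4.2, "cfr_pct": 8.7, "mttr_min": 45}
previous = {"deploy_freq": 20, "lead_time_h": 5.4, "cfr_pct": 9.0, "mttr_min": 53}

# For lead time, CFR, and MTTR lower is better; for deploy frequency higher is better.
lower_is_better = {"lead_time_h", "cfr_pct", "mttr_min"}

for name, now in current.items():
    before = previous[name]
    change_pct = (now - before) / before * 100
    improving = (change_pct < 0) == (name in lower_is_better)
    arrow = "↓" if change_pct < 0 else "↑"
    print(f"{name:<12} {now:>6} {arrow} {abs(change_pct):4.0f}%  "
          f"{'improving' if improving else 'getting worse'}")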

Tools for Measuring DORA

Tool               Type           Best For
────               ────           ────────
Sleuth             Commercial     Complete DORA tracking
LinearB            Commercial     Engineering metrics
Jellyfish          Commercial     Engineering intelligence
Four Keys          Open Source    Google's reference implementation
Haystack           Open Source    GitHub-focused
Custom + DataDog   DIY            Flexible, integrated

Beyond DORA: Additional Metrics

While DORA covers software delivery, consider these complementary metrics:

Developer Experience Metrics

Metric                    What It Measures
──────                    ────────────────
Developer Satisfaction    Survey-based team health
Onboarding Time           Time for new developers to ship
Context Switching         Interruption frequency
Technical Debt Ratio      Time spent on maintenance

Quality Metrics

Metric                      What It Measures
──────                      ────────────────
Test Coverage               Breadth of automated testing
Escaped Bugs                Bugs found in production
Code Review Turnaround      Time to review PRs
Security Vulnerabilities    Open security issues

Implementation Roadmap

Month 1: Foundation

  • Set up deployment tracking
  • Instrument CI/CD for lead time
  • Define what constitutes a "failure"
  • Establish incident tracking
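
Instrumentation can start very small: have the deploy job append one record per production deploy and derive everything else from that log later. A minimal sketch (Python run as the final CI step; the file-based log and the record_deploy helper are illustrative, in practice you would post the event to whatever tool or store you adopt):

import json
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path

def record_deploy(status: str, log_path: str = "deploys.jsonl") -> None:
    """Append one deploy event; run as the last step of the deploy job."""
    sha = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    commit_time = subprocess.run(
        ["git", "show", "-s", "--format=%cI", sha],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    event = {
        "sha": sha,
        "commit_time": commit_time,                         # input for lead time
        "deployed_at": datetime.now(timezone.utc).isoformat(),
        "status": status,                                   # "success" or "failed"
    }
    with Path(log_path).open("a") as f:
        f.write(json.dumps(event) + "\n")

if __name__ == "__main__":
    record_deploy(sys.argv[1] if len(sys.argv) > 1 else "success")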

Month 2: Visibility

  • Create team dashboard
  • Establish baseline measurements
  • Share with teams first (not with management)
  • Identify obvious improvement areas

Month 3-6: Improvement

  • Focus on one metric at a time
  • Run experiments to improve
  • Document what works
  • Celebrate improvements

Ongoing

  • Regular retrospectives on metrics
  • Adjust targets as team improves
  • Resist pressure to use for performance reviews
  • Keep focus on team improvement

Key Takeaways

  1. Measure outcomes, not output: Lines of code and hours worked don't correlate with value
  2. Use DORA metrics: Deployment Frequency, Lead Time, CFR, and MTTR are research-backed
  3. Elite teams are fast AND stable: Speed and quality are not trade-offs
  4. Measure teams, not individuals: Software is a team sport
  5. Trends over absolutes: Compare teams to their own history
  6. Context matters: Different teams have different constraints
  7. Remove friction: Use metrics as a compass to find bottlenecks
  8. Avoid weaponization: Metrics as punishment destroy trust and performance

Focus on removing friction. If Lead Time is high, invest in faster builds. If MTTR is high, invest in better monitoring. The metrics are a compass, not a report card.


Want to improve your engineering team's performance with data-driven insights? Contact EGI Consulting for an engineering metrics assessment and improvement roadmap tailored to your organization.
