Nobody Budgets for Technical Debt. Everyone Pays for It.
A fintech company with 180 engineers spent 14 months and $4.2 million rebuilding their payment processing system. Not because the old system did not work. It worked fine. It processed millions of transactions a day without incident.
They rebuilt it because adding a new payment method to the old system took six weeks of engineering time. Their competitor added the same payment method in four days. In the time it took to release one feature, they lost three enterprise contracts to a competitor that could move faster.
The old system was not broken. It was expensive. And the cost was not showing up anywhere on their financial statements.
Technical debt is the most expensive line item in most engineering organizations that does not appear on any balance sheet. It shows up instead as velocity that mysteriously slows down, engineers who leave because they are tired of working in a codebase that feels like wading through concrete, features that take three times longer than estimated, and incidents that cascade in ways that nobody fully understands.
This analysis puts numbers on what most organizations treat as a feeling. The data is uncomfortable. The conclusions are actionable.
💡 The core argument: Technical debt is not a technical problem. It is a financial problem with a technical cause. Organizations that treat it as the former manage it poorly. Organizations that treat it as the latter manage it with the same rigor they apply to any other significant business liability.
What Technical Debt Actually Is (And What It Is Not)
Ward Cunningham coined the term technical debt in 1992 as a metaphor for the cost of making deliberate short-term engineering trade-offs that would need to be addressed later. The metaphor was precise: like financial debt, technical debt accrues interest over time, and that interest compounds.
The metaphor has since been stretched to cover almost any engineering problem anyone wants to prioritize, which has made it nearly useless as an organizational concept. Before examining the cost, the definition needs to be precise.
Technical debt is:
Deliberate shortcuts taken to ship faster that create future rework.
Architectural decisions that were correct at one scale but are expensive at another.
Code that works but is poorly understood, poorly documented, or structured in ways that make change expensive.
Tests that were skipped in the name of speed.
Dependencies that have not been updated in years.
Interfaces between systems that were designed for yesterday's requirements and now require workarounds for every new use case.
Technical debt is not:
Bugs. Bugs are defects, not debt. A poorly designed system that also has bugs has both debt and defects; they compound each other, but they are distinct problems.
Legacy code that still meets requirements is not automatically debt. Old is not the same as expensive to change.
Technical debt is specifically about the cost of future change, not the age or aesthetic quality of code.
The four categories that account for most organizational technical debt:
Architectural debt is the most expensive category. It lives in the decisions about how systems relate to each other — coupling between services, data model design, API contracts that have proliferated beyond their original scope. Architectural debt is expensive because it cannot be paid down incrementally. Changing the fundamental structure of how systems interact typically requires coordinated work across multiple teams over extended periods.
Code quality debt is the most visible category. Duplicated code, deeply nested logic, functions that do too many things, classes that have grown to thousands of lines, variable names that provide no semantic information. Code quality debt is expensive because it makes every change slower and every bug harder to find, but it can be paid down incrementally as teams touch affected areas.
Test coverage debt is the most dangerous category for velocity. When critical code paths lack automated tests, every change to that code requires manual verification. The cost is not the missing test — it is the compounding cost of every engineer who works in that area having to manually verify behavior that could be verified automatically in seconds.
Dependency debt is the most neglected category. Outdated dependencies introduce security vulnerabilities, incompatibilities with newer tooling, and eventually the cliff edge of a dependency that is end-of-lifed and can no longer be updated incrementally.
🔑 The categorization insight: Organizations that track technical debt as a single undifferentiated mass cannot prioritize it effectively. The four categories have different costs, different remediation strategies, and different risk profiles. Treating them as one problem is like treating all financial liabilities as equivalent regardless of interest rate or maturity date.
The Data: What Technical Debt Actually Costs
The challenge with quantifying technical debt is that its costs are mostly indirect and mostly invisible in standard reporting. The research that has been done produces figures that are consistently larger than most engineering leadership teams expect.
The CAST Research findings:
CAST Research analyzed the technical debt of 1,316 applications across multiple industries and found an average technical debt density of $3.61 per line of code. For a medium-sized enterprise application of 1 million lines of code, that represents $3.61 million in accumulated technical debt. For organizations with 10 to 50 million lines of code across their portfolio, the numbers become material at the balance sheet level.
The McKinsey analysis:
A McKinsey study found that on average, technical debt accounts for 20 to 40 percent of the value of a typical company's entire technology estate before addressing the debt. More directly applicable to engineering organizations: the study found that companies spend 10 to 20 percent of their technology budget on issues related to technical debt. At a $50 million annual technology budget, that is $5 to $10 million per year spent on debt service — work that delivers no new value.
The Stripe developer survey:
Stripe's 2018 developer survey of 850 developers estimated that developers spend 33 percent of their time dealing with technical debt — debugging legacy code, working around poor architectural decisions, and reworking code that should have been written differently the first time. At an average fully loaded cost of $200,000 per engineer per year in a major technology market, a 100-engineer organization is spending $6.6 million per year on technical debt service.
The compound interest effect — the data that most organizations miss:
The relationship between technical debt and development velocity is not linear. Research by the Software Engineering Institute found that velocity degradation accelerates as debt accumulates. A codebase with moderate debt slows velocity by roughly 15 to 25 percent compared to a clean codebase. A codebase with heavy debt — the kind that accumulates after years of deadline-driven shortcuts — slows velocity by 40 to 70 percent.
| Debt Level | Velocity Impact | Incident Rate Increase | Engineer Turnover Risk |
|---|---|---|---|
| Low (under 10% of codebase affected) | 5 to 15% slower | Minimal | Low |
| Moderate (10 to 30% affected) | 15 to 30% slower | 20 to 40% higher | Medium |
| High (30 to 60% affected) | 30 to 50% slower | 40 to 80% higher | High |
| Critical (over 60% affected) | 50 to 75% slower | 80 to 150% higher | Very High |
⚠️ The velocity measurement challenge: Most organizations do not measure velocity in ways that make debt costs visible. Story points completed per sprint, features shipped per quarter, and lines of code written are all metrics that hide debt costs because they measure output without accounting for the increasing cost of producing that output. An organization shipping 20 features per quarter from a clean codebase and an organization shipping 20 features per quarter from a heavily indebted codebase are not in the same position — but their velocity metrics look identical.
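One way to make that difference visible is to divide engineering spend by output. The sketch below is illustrative only: the two headcounts are hypothetical, and the $200,000 fully loaded cost is the figure the article uses elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Org:
    name: str
    engineers: int                                # hypothetical headcount
    features_per_quarter: int
    loaded_cost_per_engineer: float = 200_000.0   # annual, per the article's assumption

    def cost_per_feature(self) -> float:
        """Quarterly engineering spend divided by features shipped that quarter."""
        quarterly_spend = self.engineers * self.loaded_cost_per_engineer / 4
        return quarterly_spend / self.features_per_quarter


# Same output metric (20 features per quarter), different cost to produce it.
clean = Org("clean codebase", engineers=40, features_per_quarter=20)
indebted = Org("indebted codebase", engineers=70, features_per_quarter=20)

for org in (clean, indebted):
    print(f"{org.name}: ${org.cost_per_feature():,.0f} per feature")
```

Both organizations report the same feature count, but under these assumed headcounts the cost per feature differs by 75 percent. Tracking a cost-normalized metric alongside throughput is one way to surface that gap.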
The Hidden Costs That Do Not Appear in Technical Debt Estimates
Every technical debt estimate underestimates the true cost because it captures only the direct remediation cost — the engineering hours required to fix the debt. The indirect costs are typically larger and almost never measured.
The recruitment and retention cost:
Engineers leave organizations where they cannot do their best work. Codebases with heavy technical debt are environments where talented engineers feel their skills are being wasted. The 2023 Stack Overflow Developer Survey found that working with legacy code and technical debt was among the top five factors contributing to developer dissatisfaction and job search activity.
The cost of replacing a senior engineer is estimated at 1.5 to 2x their annual salary when recruitment costs, onboarding time, productivity ramp, and institutional knowledge loss are included. For a 100-engineer organization losing 15 percent of its senior engineers per year — a realistic turnover rate in a high-debt environment — the annual turnover cost attributable to technical debt runs into the millions before a single line of code has been touched.
The incident cost:
The relationship between technical debt and production incidents is well-documented. Systems with heavy architectural and code quality debt have more incidents, incidents that are harder to diagnose, and incidents that cascade more widely when they occur. The direct cost of an incident — engineer time spent resolving it, customer support volume, SLA penalties — is typically the smallest component. The indirect costs include customer churn, reputation damage, and the productivity loss of every engineer who context-switched into incident response.
The opportunity cost:
This is the largest component and the hardest to measure. When your team spends 33 percent of their time on debt-related work, that time is not available for product development. The question is not just what the debt service costs — it is what those engineering hours would have produced if the debt did not exist.
For a product company where a new feature can generate $500,000 in annual recurring revenue, every 10 features delayed by one quarter due to debt-slowed velocity represents $1.25 million in delayed revenue. Not lost — delayed. But in competitive markets, delayed revenue often becomes lost revenue when customers find an alternative that ships faster.
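The delayed-revenue arithmetic above can be sketched as a small function. The function name and parameters are illustrative, and it assumes revenue accrues evenly across the year.

```python
def delayed_revenue(features: int, arr_per_feature: float, quarters_delayed: float) -> float:
    """Revenue pushed out when each feature ships `quarters_delayed` quarters late.

    Assumes annual recurring revenue accrues evenly, so one quarter of delay
    defers one quarter (25 percent) of a feature's first-year revenue.
    """
    return features * arr_per_feature * (quarters_delayed / 4)


deferred = delayed_revenue(features=10, arr_per_feature=500_000, quarters_delayed=1)
print(f"${deferred:,.0f}")
```

With the article's inputs (10 features, $500,000 each, one quarter late) this reproduces the $1.25 million figure.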
The compounding problem:
Technical debt compounds. A system that is 30 percent slower to change becomes 45 percent slower as new code written in that environment inherits the patterns of the existing code. Shortcuts taken because "we are already in a hurry" accumulate faster in high-debt codebases because the psychological environment normalizes shortcuts. The debt that is not addressed does not stay the same size — it grows.
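A minimal sketch of this compounding, assuming the share of time spent on debt service grows by a fixed percentage each year. The 20 percent starting share and 15 percent growth rate are the inputs used in the calculation that follows; the function name is illustrative.

```python
def project_debt_share(start_share: float, annual_growth: float, years: int) -> list[float]:
    """Return the fraction of engineering time spent on debt service, year by year.

    Assumes unaddressed debt grows by `annual_growth` each year and that the
    time spent servicing it grows in proportion, capped at 100 percent.
    """
    shares = [start_share]
    for _ in range(years):
        shares.append(min(1.0, shares[-1] * (1 + annual_growth)))
    return shares


shares = project_debt_share(start_share=0.20, annual_growth=0.15, years=3)
for year, share in enumerate(shares):
    print(f"Year {year}: {share:.1%} on debt service, {1 - share:.1%} productive capacity")
```

Run over three years, the share rises from 20 percent to roughly 30 percent, matching the rounded figures in the calculation below.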
💡 The compounding calculation:
Year 0: 100 engineers, 20% time on debt service = 80% productive capacity
Year 1: Debt grows 15% if not addressed
100 engineers, 23% time on debt service = 77% productive capacity
Year 2: Debt grows another 15%
100 engineers, 26% time on debt service = 74% productive capacity
Year 3: Debt grows another 15%
100 engineers, 30% time on debt service = 70% productive capacity
The organization has effectively lost the equivalent of 30 engineers
without adding a single person or eliminating a single position.
The headcount stayed flat. The output did not.
How to Measure Technical Debt in Your Organization
Most organizations that claim to be managing technical debt are actually just acknowledging its existence. Measurement requires specific metrics that make the cost visible in terms that translate to business impact.
The four measurement dimensions:
Delivery lead time by system area.
The time from a developer starting work on a change to that change being deployed to production is one of the most sensitive indicators of technical debt. High-debt areas of a codebase have consistently longer lead times because changes require more investigation, more careful testing, and more coordination with other teams to avoid unexpected breakage.
Tracking lead time by system area over time reveals which parts of the codebase are accumulating debt and at what rate. A module where lead time has increased from 3 days to 12 days over six months is telling you something specific and actionable about the cost of work in that area.
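As a sketch of how lead time might be tracked per area, the snippet below computes the median lead time in days from a list of change records. The record layout and the sample data are hypothetical; in practice these timestamps would come from your issue tracker and deployment pipeline.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical change records: (system area, work started, deployed to production)
changes = [
    ("payments", datetime(2024, 1, 2), datetime(2024, 1, 14)),
    ("payments", datetime(2024, 1, 5), datetime(2024, 1, 15)),
    ("search", datetime(2024, 1, 3), datetime(2024, 1, 6)),
    ("search", datetime(2024, 1, 8), datetime(2024, 1, 10)),
]


def median_lead_time_days(records):
    """Group changes by system area and return the median lead time in days."""
    by_area = defaultdict(list)
    for area, started, deployed in records:
        by_area[area].append((deployed - started).total_seconds() / 86_400)
    return {area: median(days) for area, days in by_area.items()}


lead_times = median_lead_time_days(changes)
for area, days in sorted(lead_times.items()):
    print(f"{area}: median lead time {days:.1f} days")
```

The median is used here rather than the mean so that a single unusually slow change does not dominate the picture. Comparing the per-area result month over month shows the trend described above.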
Change failure rate by system area.
The percentage of deployments that result in incidents, rollbacks, or hotfixes is a direct measure of the quality and testability of a codebase. High-debt areas fail more often because the code is harder to reason about, the tests are inadequate to catch regressions, and the coupling between components means changes have unexpected side effects.
DORA research consistently shows that elite engineering organizations have change failure rates below 5 percent. Organizations with heavy technical debt typically see rates of 15 to 30 percent in affected system areas.
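Change failure rate by area can be computed from a deployment log in a few lines. This is a sketch under assumed inputs: the log format and the sample entries are hypothetical.

```python
from collections import Counter

# Hypothetical deployment log:
# (system area, True if the deploy led to an incident, rollback, or hotfix)
deployments = [
    ("payments", True), ("payments", False), ("payments", False), ("payments", True),
    ("search", False), ("search", False), ("search", False), ("search", False),
    ("search", True),
]


def change_failure_rate(log):
    """Return the fraction of failed deployments for each system area."""
    total, failed = Counter(), Counter()
    for area, did_fail in log:
        total[area] += 1
        failed[area] += int(did_fail)
    return {area: failed[area] / total[area] for area in total}


rates = change_failure_rate(deployments)
for area, rate in sorted(rates.items()):
    print(f"{area}: {rate:.0%} change failure rate")
```

Areas whose rate sits well above the rest of the codebase are candidates for closer inspection.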
Time spent on unplanned work.
The ratio of planned work to unplanned work — bug fixes, incident response, performance firefighting, emergency patches — is one of the clearest indicators of technical debt's operational impact. A healthy engineering organization spends 15 to 20 percent of capacity on unplanned work. An organization with significant technical debt typically spends 35 to 50 percent on unplanned work, leaving less than half of engineering capacity available for intentional product development.
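The unplanned-work ratio can be sketched as follows. The work-type labels and the sample sprint are hypothetical, and the classification of which types count as unplanned is an assumption that each team would adapt to its own tracker.

```python
# Work types treated as unplanned (an assumed classification).
UNPLANNED = {"bug_fix", "incident", "firefighting", "emergency_patch"}


def unplanned_share(hours_by_type: dict[str, float]) -> float:
    """Return the fraction of logged hours that went to unplanned work."""
    total = sum(hours_by_type.values())
    if total == 0:
        return 0.0
    unplanned = sum(h for kind, h in hours_by_type.items() if kind in UNPLANNED)
    return unplanned / total


# Hypothetical sprint: 400 logged hours in total.
sprint = {"feature": 260, "bug_fix": 70, "incident": 50, "emergency_patch": 20}
share = unplanned_share(sprint)
print(f"Unplanned work: {share:.0%} of capacity")
```

In this made-up sprint, 140 of 400 hours are unplanned, which lands at the bottom of the 35 to 50 percent band the article associates with significant debt.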
Engineer-perceived debt severity.
Quarterly surveys that ask engineers to rate the debt severity in their area of the codebase, the impact on their daily productivity, and their confidence in the stability of the systems they own are a leading indicator that predicts the lagging metrics above. Engineers know where the debt is before the metrics show it. Capturing that knowledge systematically makes it actionable before it becomes a crisis.
Sample quarterly debt survey questions:
1. How much of your time this quarter was spent working around
existing technical debt rather than building new capabilities?
(0 to 25%) (25 to 50%) (50 to 75%) (over 75%)
2. How confident are you in making changes to the systems you own
without causing unexpected breakage?
(Very confident) (Somewhat confident) (Not confident) (Avoid changes when possible)
3. If you could address one technical debt item in your area
in the next quarter, what would it be and what would it enable?
(Open text)
4. How has the amount of time you spend on debt-related work
changed over the past six months?
(Decreased significantly) (Decreased slightly) (No change)
(Increased slightly) (Increased significantly)
✅ The measurement principle: You cannot manage what you cannot measure. Organizations that measure technical debt with the same rigor they apply to revenue, costs, and customer metrics make better prioritization decisions, build stronger business cases for remediation investment, and catch debt accumulation trends before they become crises.
The Financial Framework: Technical Debt as a Balance Sheet Item
The most effective reframe for organizations that struggle to prioritize technical debt remediation is treating it explicitly as a financial liability with a measurable carrying cost and a calculable return on remediation investment.
Calculating the carrying cost of technical debt:
Inputs required:
Engineering team size: 80 engineers
Average fully loaded cost per engineer: $200,000 per year
Total annual engineering spend: $16,000,000
Estimated percentage of time on debt service: 28%
Annual debt carrying cost: $16,000,000 x 0.28 = $4,480,000
Additional inputs:
Average annual engineer turnover rate: 18%
Annual turnover attributable to debt environment: 40% of total
Average replacement cost per engineer: $280,000
Annual debt-attributable turnover cost: 80 x 0.18 x 0.40 x $280,000 = $1,612,800
Average major incidents per month: 3.2
Estimated percentage attributable to debt: 50%
Average cost per major incident: $85,000
Annual debt-attributable incident cost: 3.2 x 12 x 0.50 x $85,000 = $1,632,000
Total estimated annual debt cost: $7,724,800
Calculating the return on remediation investment:
Proposed remediation: Rebuild the payment processing module
Estimated remediation cost: 4 engineers x $100,000 per engineer for 6 months = $400,000
Expected outcomes:
Velocity improvement in payment area: 40%
Annual value of velocity improvement:
Payment team is 8 engineers
8 x $200,000 x 0.40 velocity gain = $640,000 per year
Incident reduction in payment area: 60%
Current payment-related incidents: 1.4 per month
Current payment incident cost: 1.4 x 12 x $85,000 = $1,428,000 per year
Expected incident reduction savings: $1,428,000 x 0.60 = $856,800 per year
Total expected annual benefit: $1,496,800
Remediation investment: $400,000
Return on investment: 374% gross return in year one ($1,496,800 / $400,000), or 274% net of the investment
Payback period: 3.2 months
🔑 The business case insight: When technical debt remediation is presented as a technical problem, it competes for prioritization against features, which always have a more tangible business case. When it is presented as a financial investment with a calculated ROI and payback period, it competes on the same terms as any other capital allocation decision, and the ROI on well-targeted debt remediation is typically extraordinary.
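The carrying-cost and remediation-return arithmetic above can be reproduced with a short script, which makes it easy to substitute your own inputs. The function and parameter names are illustrative. The script reports both the benefit-to-cost ratio (the 374 percent figure) and the return net of the investment, since the two are often confused.

```python
def carrying_cost(engineers, loaded_cost, debt_time_share,
                  turnover_rate, turnover_debt_share, replacement_cost,
                  incidents_per_month, incident_debt_share, incident_cost):
    """Annual cost of debt: debt service plus attributable turnover and incidents."""
    debt_service = engineers * loaded_cost * debt_time_share
    turnover = engineers * turnover_rate * turnover_debt_share * replacement_cost
    incidents = incidents_per_month * 12 * incident_debt_share * incident_cost
    return debt_service + turnover + incidents


def remediation_return(investment, annual_benefit):
    """First-year return measures for a remediation project."""
    return {
        "benefit_to_cost": annual_benefit / investment,
        "net_roi": (annual_benefit - investment) / investment,
        "payback_months": investment / (annual_benefit / 12),
    }


# Inputs taken from the worked example above.
total = carrying_cost(80, 200_000, 0.28, 0.18, 0.40, 280_000, 3.2, 0.50, 85_000)

velocity_gain = 8 * 200_000 * 0.40           # payment team of 8, 40% faster
incident_savings = 1.4 * 12 * 85_000 * 0.60  # 60% fewer payment incidents
result = remediation_return(4 * 100_000, velocity_gain + incident_savings)

print(f"Annual debt cost: ${total:,.0f}")
print(f"Benefit-to-cost: {result['benefit_to_cost']:.0%}")
print(f"Net ROI: {result['net_roi']:.0%}")
print(f"Payback: {result['payback_months']:.1f} months")
```

Every input here is an estimate, so it is worth rerunning the script with pessimistic values (for example, half the expected velocity gain) before presenting the result as a business case.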
The Debt Management Framework: Prevention, Reduction, and Tolerance
Not all technical debt should be eliminated. Some debt is worth carrying because the interest cost is lower than the remediation cost. Effective debt management requires a framework for deciding which debt to prevent, which to actively reduce, and which to tolerate.
The debt quadrant:
| | High Business Impact Area | Low Business Impact Area |
|---|---|---|
| High Change Frequency | Remediate immediately | Remediate when touched |
| Low Change Frequency | Monitor and plan remediation | Tolerate indefinitely |
Code that is changed frequently in high-impact areas has the highest debt carrying cost because the slowdown affects every change in a critical path. This debt should be remediated aggressively.
Code that is never changed and sits in a low-impact area has nearly zero debt carrying cost regardless of its quality. Rewriting working code purely for aesthetic reasons is a poor use of engineering time.
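The quadrant translates directly into a lookup. In this sketch the function name is illustrative, and how an organization decides that change frequency or business impact counts as "high" is left as an assumption to be set locally.

```python
def debt_action(high_change_frequency: bool, high_business_impact: bool) -> str:
    """Map a debt item to the quadrant's recommended action."""
    if high_change_frequency and high_business_impact:
        return "Remediate immediately"
    if high_change_frequency:
        return "Remediate when touched"
    if high_business_impact:
        return "Monitor and plan remediation"
    return "Tolerate indefinitely"


# Example: a frequently changed module on a critical path.
print(debt_action(high_change_frequency=True, high_business_impact=True))
```

Applying the same function to every item in a debt inventory gives a first-pass triage that can then be reviewed by the owning teams.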
The Boy Scout Rule as a prevention mechanism:
The most effective prevention strategy for code quality debt is the Boy Scout Rule — always leave the code in better shape than you found it. When a developer touches a file to make a change, they also address the nearest and most impactful piece of debt in that file. No special project required. No coordination overhead. Debt is paid down incrementally as a natural consequence of normal development work.
The Boy Scout Rule is most effective when it is institutionalized in the engineering culture and code review process rather than left to individual discretion. Code review criteria that include debt reduction as an explicit expectation — not just functional correctness — change the default behavior without requiring separate remediation projects.
Allocating dedicated debt remediation capacity:
Engineering organizations that treat technical debt seriously allocate a fixed percentage of engineering capacity to debt remediation in every sprint — typically 15 to 20 percent. This capacity is protected from feature work pressure the same way that on-call rotation time is protected. It is not a slush fund that gets raided when a deadline approaches. It is a standing investment in the maintainability of the codebase.
Teams that do not protect this capacity consistently find that technical debt remediation happens only in the brief windows after a major release. Remediation is then rare and episodic while debt accumulates continuously, and the math works out in the debt's favor, not the team's.
The debt budget — setting an acceptable ceiling:
Debt budget framework:
Define acceptable debt levels by category:
Code quality debt: under 15% of codebase has quality issues
Test coverage debt: over 80% branch coverage on critical paths
Dependency debt: no dependency more than 2 major versions behind
Architectural debt: no coupling violations in defined module boundaries
When debt exceeds budget ceiling in any category:
Trigger remediation sprint or dedicated remediation capacity increase
Pause new debt accumulation in that category pending remediation
Report debt level to engineering leadership with timeline for resolution
Review debt budget quarterly:
Measure current debt levels against budget
Adjust remediation capacity allocation based on trajectory
Update budget ceilings based on changing business priorities
The Rewrite Question: When Remediation Becomes Replacement
Every organization with significant technical debt eventually faces the question of whether to remediate the existing system or replace it entirely. This is one of the most consequential and most frequently mishandled engineering decisions an organization can make.
The case against rewrites — the data is not flattering:
The Standish Group found that large software rewrites have a failure rate exceeding 70 percent. Joel Spolsky's famous 2000 essay argued that rewriting working software was the single worst strategic mistake a software company could make — destroying years of accumulated bug fixes and edge case handling along with the bad code.
The risks are real and specific. Rewrites take longer than estimated, consistently and by large margins. During the rewrite, the old system continues to accumulate debt while the new system is being built. When the rewrite is complete, the new system immediately begins accumulating debt of its own — often the same debt that prompted the rewrite in the first place, because the organizational culture that created the original debt has not changed.
The case for strategic replacement — when the data supports it:
The data against rewrites applies most powerfully to like-for-like replacements — rebuilding the same system with the same architecture in a newer technology. The data is more favorable for strategic replacements that change the fundamental architecture of a system in ways that are impossible to achieve incrementally.
The strangler fig pattern provides the safest path to strategic replacement. Rather than replacing the old system in one step, new functionality is built in the new system and old functionality is progressively migrated until the old system handles nothing and can be decommissioned.
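The mechanism that makes this possible is a routing layer in front of both systems. The sketch below is one possible shape for it, not a prescribed implementation: the class name, the destination labels, and the per-method percentage rollout are all assumptions made for illustration.

```python
import hashlib


class StranglerRouter:
    """Decides, per request, whether the new service or the old monolith handles it."""

    def __init__(self) -> None:
        # Percentage (0 to 100) of traffic per payment method sent to the new service.
        self.rollout: dict[str, int] = {}

    def set_rollout(self, method: str, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.rollout[method] = percent

    def route(self, method: str, customer_id: str) -> str:
        # Methods that have not been migrated stay on the monolith.
        percent = self.rollout.get(method, 0)
        # Stable hash, so a given customer always falls in the same bucket
        # and stays on the new service as the percentage increases.
        digest = hashlib.sha256(f"{method}:{customer_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return "new_service" if bucket < percent else "legacy_monolith"


router = StranglerRouter()
router.set_rollout("bnpl", 100)  # launched only in the new service
router.set_rollout("card", 10)   # early migration: 10% of card traffic
print(router.route("bnpl", "customer-42"))
print(router.route("bank_transfer", "customer-42"))
```

Raising a method's percentage moves more customers to the new service without moving anyone back, and setting it to 0 returns all of that method's traffic to the monolith, which is what keeps the fallback path available during migration.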
Strangler Fig Pattern — Payment Processing Example:
Phase 1 (Month 1 to 3):
New payment service built alongside old monolith
New payment method (Buy Now Pay Later) launched exclusively in new service
Old payment methods remain in monolith
Phase 2 (Month 4 to 6):
Card payment processing migrated to new service
Old card processing remains in monolith as fallback
Traffic routing: 10% new service, 90% old monolith
Gradually increase to 100% new service as confidence builds
Phase 3 (Month 7 to 9):
All remaining payment methods migrated to new service
Old monolith payment code still present but receives no traffic
New payment service handles 100% of volume
Phase 4 (Month 10):
Old payment code removed from monolith
Migration complete — no big bang cutover
No period where the system was unavailable or degraded
⚠️ The rewrite decision framework:
Consider remediation when:
The core architecture is sound but the implementation is poor
The domain model is correct but the code quality is low
Incremental improvement is technically feasible
The team that built the original system is still present
Consider strategic replacement when:
The core architecture creates constraints that cannot be removed incrementally
The domain model is fundamentally wrong for current requirements
Incremental improvement would take longer than replacement
The system creates security or compliance risks that remediation cannot address
Never justify replacement on the grounds that:
The code is old
The technology is unfashionable
A new engineer does not understand the existing system
The team wants to use a newer technology stack
The Organizational Playbook: Managing Technical Debt at the Leadership Level
Technical debt is ultimately an organizational problem, not a technical one. The organizations that manage it well do so through explicit governance, not through the heroic efforts of individual engineers.
Making debt visible to leadership:
Engineering leaders who want executive support for debt remediation investment need to present the business case in business terms. A presentation that leads with lines of code affected or cyclomatic complexity scores will not move the needle. A presentation that leads with annual carrying cost, velocity impact measured in feature throughput, incident rate correlation, and projected ROI of targeted remediation will.
The quarterly debt review:
Agenda for quarterly technical debt review:
1. Debt inventory update (15 minutes)
Current debt levels by category versus budget ceiling
Trajectory since last quarter — improving or deteriorating
Newly identified debt items since last review
2. Carrying cost update (10 minutes)
Updated estimate of annual debt carrying cost
Cost attribution by system area
Trend versus previous quarter
3. Remediation progress (15 minutes)
Debt items addressed in the quarter
Impact measured — velocity improvement, incident reduction
ROI of completed remediation work
4. Prioritized remediation roadmap (20 minutes)
Top 5 debt items by carrying cost and remediation ROI
Resource requirements and timeline
Decision required: capacity allocation for next quarter
The engineering culture dimension:
Governance frameworks and measurement systems are necessary but not sufficient. The organizations that sustain low debt levels over time do so because their engineering culture treats code quality as a first-class value — not a luxury that can be sacrificed when deadlines approach.
This requires explicit leadership behavior. When engineering leaders consistently prioritize quality alongside speed in their decision-making, in their code review participation, and in how they respond to missed deadlines, the message reaches every engineer on the team. When leaders consistently sacrifice quality for speed and frame it as pragmatism, the debt accumulates in alignment with that signal.
💡 The culture observation: The most debt-laden codebases in the industry were not created by engineers who wanted to write bad code. They were created by engineers who were told, explicitly or implicitly, that shipping fast was more important than shipping well — and who responded rationally to the incentives they were given. Changing the debt trajectory requires changing the incentives, not just the tooling or the process.
The True Cost, Summarized
The analysis leads to a conclusion that is simple to state and expensive to ignore.
Technical debt is a liability that carries interest. The interest rate is not fixed — it increases as the debt grows and as the organization scales. At moderate levels it is a manageable drag on velocity. At high levels it becomes the primary determinant of what an engineering organization can and cannot accomplish.
The organizations that treat it as a financial liability — measuring it, budgeting for it, calculating the ROI of remediation, and protecting capacity for continuous debt reduction — consistently outperform organizations that treat it as an inevitable background condition of software development.
The data-driven conclusions:
The average organization spends 23 to 42 percent of engineering capacity on technical debt service
In the worked examples above, debt service alone runs from about $4.5 million a year for an 80-engineer organization to $6.6 million for a 100-engineer organization at $200,000 per engineer, before turnover and incident costs are added
The ROI on well-targeted debt remediation is typically 200 to 400 percent in year one
Organizations that protect 15 to 20 percent of engineering capacity for debt remediation consistently have lower debt levels than those that address debt reactively
Debt levels above 40 percent of the codebase affected correlate with engineer turnover rates 35 to 60 percent higher than low-debt environments
Technical debt is not the cost of moving fast. It is the cost of moving fast without a plan to address the consequences, and the interest rate on that cost is higher than most organizations have calculated.
The data is clear. The framework exists. The only remaining variable is whether leadership treats this as the financial liability it demonstrably is, or continues to carry it as the invisible cost it has always been.
