What a Comment-to-Code Ratio Tells You About a Legacy App

Written by DeeDee Walsh | Jun 7, 2026 2:01:46 AM

It isn't a quality score. On a twenty-year-old codebase, the comment-to-code ratio is one of the cheapest risk signals you can read if you know what you're looking at.

Every legacy modernization estimate is a bet on how much you understand the code. The comment-to-code ratio is one of the few numbers that tells you, before you've read a single line, how much of that understanding is still recoverable.

It's also one of the most misread metrics in software. The instinct is to treat it as a report card: more comments good, fewer comments bad. That's often the wrong way to look at it. A high ratio can be a graveyard of commented-out code. A low ratio can be clean, self-documenting code, or it can be twenty years of tribal knowledge that walked out the door with the people who wrote it.

Here's what the number is, why the common reading of it is often wrong, and how we read it when we're scoping a modernization.

What the ratio measures

In its plainest form it's comment lines divided by code lines, or comment lines over the sum of code plus comments. Pick a convention and stick to it; the absolute value matters less than what you compare it against.
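To make the two conventions concrete, here is a minimal sketch. The function names and the sample counts are ours, chosen for illustration; the point is only that the same file yields two different numbers depending on the denominator.

```python
def ratio_to_code(comment_lines: int, code_lines: int) -> float:
    """Convention 1: comment lines divided by code lines."""
    return comment_lines / code_lines if code_lines else 0.0


def ratio_to_total(comment_lines: int, code_lines: int) -> float:
    """Convention 2: comment lines divided by code plus comment lines."""
    total = code_lines + comment_lines
    return comment_lines / total if total else 0.0


# Same hypothetical file: 150 comment lines, 850 code lines.
print(round(ratio_to_code(150, 850), 3))   # 0.176
print(round(ratio_to_total(150, 850), 3))  # 0.15
```

Neither convention is more correct. Mixing them across reports is what produces comparisons that mean nothing.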

The important caveat is baked into the word lines. A counter counts lines, not meaning. It cannot tell a load-bearing "here's why we do this insane thing or the nightly batch fails" from a "' increment the counter" sitting uselessly above i = i + 1. That semantic gap is the central limitation of this metric, and it's why the raw number is a starting point rather than an answer.

ByteInsight will hand you the precise inputs: comment lines, blank lines, and non-blank content lines. In the technical report it splits lines into actual code, designer code, HTML, JavaScript, comments, and blanks. The reason that breakdown matters becomes clear the moment you try to interpret a single aggregate number.

Why the single headline number lies

Comments aren't homogeneous. A count treats all of these as the same thing, and they are not:

"Why" comments are gold. Intent, rationale, the business rule behind the weird branch, the constraint nobody would guess from the code.
"What" comments are redundant narration of code that already says what it does. Noise.
Commented-out code is dead weight, and a tell (more on that below).
Banner and license headers are fixed overhead that scales with file count, not with complexity.
TODO / FIXME / HACK / XXX are the confession log.

A 15% ratio made of "why" comments and a 15% ratio made of commented-out code are opposite situations wearing the same number.
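A rough first pass at separating these kinds can be automated. The sketch below is a heuristic, not a parser: the regexes, the bucket names, and the VB-flavored keywords are all our assumptions, and it expects the comment text with the comment marker already stripped. Note what it cannot do: it has no way to tell a "why" comment from a "what" comment, so both land in the prose bucket and still need a human read.

```python
import re
from collections import Counter

# Heuristic patterns (assumptions for illustration; tune per language).
TAG = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b")
BANNER = re.compile(r"^[\s*=#\-_/]*$|copyright|license", re.IGNORECASE)
CODE_LIKE = re.compile(
    r"[;{})]\s*$"                      # ends like a statement or block
    r"|^\s*(Dim|Call)\s+\w"            # VB-style declaration or call
    r"|^\s*If\b.*\bThen\s*$"           # VB-style If ... Then
    r"|^\s*[\w.]+\s*=\s*\S"            # starts with an assignment
)


def classify(comment_text: str) -> str:
    """Bucket one comment line (marker already stripped)."""
    if TAG.search(comment_text):
        return "tag"
    if BANNER.search(comment_text):
        return "banner"
    if CODE_LIKE.search(comment_text):
        return "commented_out"
    return "prose"  # "why" and "what" comments are indistinguishable here


samples = [
    "TODO: remove after Q3 cutover",
    "Copyright (c) 2004 Example Corp",
    "=====================",
    "total = total + tax",
    "If qty > 0 Then",
    "Rounding happens before tax or the nightly batch fails",
    "increment the counter",
]
counts = Counter(classify(s) for s in samples)
print(counts["prose"], counts["commented_out"], counts["banner"], counts["tag"])
```

Expect false positives in both directions. The goal is to shrink the pile a person has to read, not to replace the reading.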

Generated code wrecks the denominator. Designer files, scaffolding, and other machine-written output carry near-zero comments and a lot of lines. They drag the apparent ratio down on an app that may be perfectly well-commented everywhere a human actually typed. You have to separate hand-written from generated before the ratio means anything. This is the same reason an honest line count excludes generated lines from the effort estimate in the first place.
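The denominator effect is easy to see with a small, made-up inventory. In this sketch the file names, line counts, suffix list, and folder list are all assumptions for illustration; a real codebase will need its own list of what counts as generated.

```python
from pathlib import PurePosixPath

# Assumed markers of machine-written or build output; adjust per codebase.
GENERATED_SUFFIXES = (".designer.vb", ".designer.cs", ".g.cs")
EXCLUDED_DIRS = {"bin", "obj"}


def is_hand_written(path: str) -> bool:
    """True if the path is outside build folders and not a generated file."""
    p = PurePosixPath(path.replace("\\", "/"))
    if any(part.lower() in EXCLUDED_DIRS for part in p.parts[:-1]):
        return False
    return not p.name.lower().endswith(GENERATED_SUFFIXES)


def ratio(rows):
    """Comment lines over code plus comment lines, across all rows."""
    comments = sum(c for _, c, _ in rows)
    code = sum(k for _, _, k in rows)
    return comments / (comments + code) if comments + code else 0.0


# (path, comment_lines, code_lines): hypothetical numbers.
files = [
    ("src/Pricing.vb", 120, 680),
    ("src/Form1.Designer.vb", 5, 1995),
    ("obj/Debug/Temp.g.cs", 0, 200),
]
owned = [row for row in files if is_hand_written(row[0])]

print(round(ratio(files), 3))  # 0.042 with generated code included
print(round(ratio(owned), 3))  # 0.15 for the hand-written file alone
```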

The average hides the distribution, and the distribution is the point. A 12% average can be a calm, uniform 12%, or it can be a bimodal mess: the boring CRUD screens sitting at 30% and the gnarly pricing engine at 0%. The second case is the one that hurts you in delivery, and the average erases it completely.
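A small worked example shows how two very different apps collapse to the same headline figure. The file names and line counts below are invented for illustration.

```python
def aggregate(files):
    """Overall ratio: total comment lines over total lines."""
    return sum(c for _, c, _ in files) / sum(t for _, _, t in files)


def per_file(files):
    """Ratio for each file separately."""
    return {name: c / t for name, c, t in files}


# (name, comment_lines, total_lines): hypothetical numbers.
uniform = [(f"Screen{i}.vb", 120, 1000) for i in range(5)]
bimodal = [
    ("Customers.vb", 300, 1000),
    ("Orders.vb", 300, 1000),
    ("PricingEngine.vb", 0, 3000),
]

for label, files in (("uniform", uniform), ("bimodal", bimodal)):
    ratios = per_file(files)
    spread = max(ratios.values()) - min(ratios.values())
    print(label, aggregate(files), round(spread, 2))
# uniform 0.12 0.0
# bimodal 0.12 0.3
```

Both apps report 12%. Only the per-file view shows that the largest file in the second one has no comments at all.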

The two readings that matter

Near-zero comments on an old app is a knowledge-risk flag

When code is decades old and barely commented, the why was never written down. It lived in the heads of people who are gone. You can recover what the code does by reading it carefully. You usually cannot recover why it does it that way: which downstream system it's protecting, which constraint it's honoring, which 1999 bug it's quietly working around. That reverse-engineering is the expensive, slow, human part of a modernization, and a comment-free complex module is where it concentrates. Zero comments on a hard module isn't clean code. It's a black box you'll be paying to open.

Suspiciously high comments are often rot, not documentation

Walk into the high-ratio files expecting commented-out code, because that's frequently what's there. Someone used the comment character as version control ("I'll just comment this out in case we need it") and never came back. Read as a signal, a large volume of commented-out code is a fear index: it tells you the team didn't trust itself to delete, which usually means it didn't fully understand the system either. Either way, it's dead weight you don't carry to the new stack.

The TODO/FIXME/HACK confession log

Tag density across these markers is a free map of the regions the original team already knew were broken, fragile, or temporary-that-became-permanent. It costs nothing to read and it points straight at where the bodies are buried.
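Counting the markers takes a few lines. This sketch scans every line of a file rather than only comment lines, on the assumption that these tags rarely appear outside comments; the sample source and the per-thousand-lines normalization are ours.

```python
import re
from collections import Counter

TAGS = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b")


def tag_density(source: str):
    """Return (count per tag, tags per 1000 lines) for one file's text."""
    lines = source.splitlines()
    hits = Counter(m.group(1) for line in lines for m in TAGS.finditer(line))
    per_kloc = 1000 * sum(hits.values()) / len(lines) if lines else 0.0
    return hits, per_kloc


sample = "\n".join([
    "' HACK: vendor feed sends dates as text",
    "d = CDate(raw)",
    "' TODO remove once feed is fixed",
    "' FIXME: fails on leap day",
    "Call Post(d)",
])
hits, per_kloc = tag_density(sample)
print(dict(hits), per_kloc)
```

Run per file and sorted descending, the densities give a ranked list of the places the original team flagged themselves.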

Stale comments are worse than none

A comment is the cheapest documentation to write and the first to go stale. In a twenty- or thirty-year-old application, comments are often the only surviving documentation, and a meaningful fraction of them describe behavior that changed two rewrites ago. A wrong comment is worse than a missing one, because it actively misleads the person, or the agent, trying to understand the code. So "more comments" is not automatically "more asset." Treat old comments as claims to verify, not facts to trust.

Why this matters in an AI modernization

AI translation is genuinely good at the what: it reads code structure and produces equivalent code on the target stack. What no translator can read is intent that was never written down anywhere. The "why" comments are exactly the signal an automated pipeline can use to make a sound decision instead of a literal one.

Which makes comment quality a rough leading indicator of how much of your application sits above or below the automation line, what we call the 70% wall. A legacy app where intent is captured in good comments is more automatable. A comment-barren one forces reverse-engineering, and reverse-engineering is the manual, human, expensive work that lives on the far side of the wall. The ratio doesn't change how many lines you migrate. It changes how many of those lines are cheap.

How to read your own number

You can't read a distribution you don't have. An aggregate ratio from memory or a quick grep won't separate generated from hand-written, won't give you per-file granularity, and won't survive a folder tree with bin, obj, and a pile of third-party libraries dragging the numbers around.

ByteInsight gives you the raw material: comment lines, blank lines, and content lines per file, with designer and generated code split out, exported straight to CSV for Excel, Power BI, or Tableau. From there the read is straightforward:

Sort files by comment ratio and look hard at both tails.
Cross the low tail against file size and complexity. The big, comment-free files are your risk budget.
Eyeball the high tail for commented-out code and back it out of the count.
Exclude generated folders so the denominator is code you actually own and intend to modernize.
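The four steps above can be sketched against a per-file CSV. Everything specific here is an assumption: the column names (path, comment_lines, content_lines) are hypothetical and not ByteInsight's actual export schema, content_lines is assumed to exclude comment lines, and the thresholds are arbitrary starting points rather than recommendations.

```python
import csv
import io

# Hypothetical export; real column names and values will differ.
SAMPLE_CSV = """path,comment_lines,content_lines
src/PricingEngine.vb,4,3200
src/Customers.vb,210,700
src/Legacy/Billing.vb,900,1100
obj/Debug/App.g.vb,0,5000
src/Utils.vb,3,40
"""

EXCLUDED_DIRS = {"bin", "obj"}


def load_rows(csv_text):
    """Parse rows, drop excluded folders, return rows sorted by ratio."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        folders = rec["path"].split("/")[:-1]
        if EXCLUDED_DIRS & set(folders):
            continue  # not code you own and intend to modernize
        comments = int(rec["comment_lines"])
        code = int(rec["content_lines"])
        total = comments + code
        ratio = comments / total if total else 0.0
        rows.append({"path": rec["path"], "code": code, "ratio": ratio})
    return sorted(rows, key=lambda r: r["ratio"])


rows = load_rows(SAMPLE_CSV)

# Low tail crossed with size: big files with almost no comments.
low_tail_risk = [r["path"] for r in rows if r["ratio"] < 0.02 and r["code"] >= 1000]
# High tail: candidates to inspect by eye for commented-out code.
high_tail_review = [r["path"] for r in rows if r["ratio"] > 0.40]

print(low_tail_risk)     # ['src/PricingEngine.vb']
print(high_tail_review)  # ['src/Legacy/Billing.vb']
```

Size stands in for complexity here only because it is the one proxy a line inventory provides. If a complexity measure is available, cross the low tail against that instead.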

It's free, read-only, and runs offline. No source code leaves the building. The number it hands you isn't a grade. It's a map of where the understanding still lives in your codebase and where it has quietly gone missing. On a modernization, that map is the difference between a quote you can stand behind and a surprise six weeks in.

Point ByteInsight at a root folder and get the inventory. Then read your own distribution before anyone asks you to commit to a number. Download ByteInsight or book a call to walk through the report with our migration engineers.
