Readmissions Dropped 8%. Heart Failure Deaths Rose 19%. The Program Called It Progress.

In 1966, Brigadier General Edward Lansdale reviewed Robert McNamara's list of metrics for measuring progress in Vietnam. Body count. Weapons captured. Territory held. Sorties flown.

Lansdale told him the list needed one more item — an "x-factor" for the feelings of the Vietnamese people.

McNamara wrote it down and asked what it was. Told it couldn't be measured, he erased it.

The war lasted nine more years. Fifty-eight thousand Americans and an estimated 3.4 million Vietnamese died. The United States won every statistical battle and lost the war. In 1995, McNamara published his memoir: "We were wrong, terribly wrong." In 2003, in the documentary The Fog of War, he said it more simply: "Rationality will not save us."

What McNamara demonstrated in that erasure — the moment where an unmeasurable reality gets deleted from consideration because no number represents it — is not a personal failure. It is a structural one. And it has killed people in at least eight domains since.

Mechanism #24

Proxy Cannibalization

When a system attaches consequences to a metric, rational actors optimize the metric. The underlying outcome degrades or is destroyed. The metric shows improvement. The system calls it progress.
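The mechanism can be sketched as a toy model. All payoffs below are hypothetical: give a rational actor a fixed effort budget, make gaming cheaper per metric point than genuine work, and the dashboard improves while the outcome degrades.

```python
# Hypothetical payoff model: one unit of effort spent "genuinely" adds 1
# point to both the metric and the true outcome; one unit spent "gaming"
# adds 2 points to the metric but subtracts 0.5 from the true outcome.
GENUINE_METRIC, GAMING_METRIC = 1.0, 2.0
GENUINE_OUTCOME, GAMING_OUTCOME = 1.0, -0.5

def best_response(budget=1.0):
    """A metric-maximizing actor puts all effort where metric yield is highest."""
    if GAMING_METRIC > GENUINE_METRIC:
        return 0.0, budget          # (genuine, gaming)
    return budget, 0.0

def evaluate(genuine, gaming):
    metric = GENUINE_METRIC * genuine + GAMING_METRIC * gaming
    outcome = GENUINE_OUTCOME * genuine + GAMING_OUTCOME * gaming
    return metric, outcome

metric_before, outcome_before = evaluate(1.0, 0.0)        # no stakes: real work
metric_after, outcome_after = evaluate(*best_response())  # stakes attached

assert metric_after > metric_before    # the dashboard shows progress
assert outcome_after < outcome_before  # the underlying outcome degrades
```

The point of the sketch is that nothing irrational happens anywhere: each actor responds correctly to the incentives, and the outcome is destroyed anyway.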

The Descent

In 1972, the sociologist Daniel Yankelovich described the progression McNamara had embodied. He called it McNamara's Fallacy and mapped it as a four-step decline:

Step One

"Measure whatever can be easily measured. This is OK as far as it goes."

Step Two

"Disregard that which can't be easily measured or give it an arbitrary quantitative value. This is artificial and misleading."

Step Three

"Presume that what can't be easily measured really isn't important. This is blindness."

Step Four

"Say that what can't be easily measured really doesn't exist. This is suicide."

Charles Goodhart formalized the principle in 1975: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." Donald Campbell, independently, in 1976: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

Both were ignored. Both were right. Here are eight domains that proved them.

Step One: The Reasonable Metric

In 2012, Medicare began penalizing hospitals for excessive 30-day readmission rates under the Hospital Readmissions Reduction Program. The logic was defensible: high readmission rates suggested poor discharge planning. Hospitals could lose up to 3% of Medicare inpatient revenue — averaging $217,000 per facility. The metric was clear, measurable, and tied to real consequences.

The metric worked. Readmission rates for heart failure dropped from 20.0% to 18.4%.

The outcome didn't. Thirty-day mortality for heart failure increased from 7.2% to 8.6%. One-year mortality rose from 31.3% to 36.3%. Wadhera et al., analyzing 8.3 million hospitalizations in JAMA (2018), estimated 5,200 to 10,400 excess deaths annually among heart failure patients alone. The editorial response from Fonarow was blunt: "No evidence patients have benefited from the HRRP."

The structural misalignment was elementary: readmission penalties exceeded mortality penalties under value-based purchasing. Hospitals optimized for the metric they were penalized most for. Patients who might have been readmitted were managed in observation units or sent home. Total 30-day hospital revisits — including observation stays and ED returns — actually increased.
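The misalignment is plain expected-value arithmetic. A sketch with made-up penalty weights (the numbers below are illustrative, not CMS's actual formulas) shows why a penalty-minimizing hospital prefers the observation unit even at higher mortality risk:

```python
# Illustrative only: hypothetical penalty weights, not the actual CMS formulas.
# The text's claim: readmission penalties under HRRP exceeded mortality
# penalties under value-based purchasing, so a penalty-minimizing hospital
# steers marginal patients away from inpatient readmission.
READMIT_PENALTY = 10_000   # hypothetical expected cost of one counted readmission
MORTALITY_PENALTY = 1_000  # hypothetical expected cost of one excess death

def expected_penalty(p_readmit, p_death):
    return p_readmit * READMIT_PENALTY + p_death * MORTALITY_PENALTY

# Option A: readmit the borderline patient (safer for the patient, but counted).
admit = expected_penalty(p_readmit=1.0, p_death=0.05)
# Option B: observation stay / send home (uncounted, riskier for the patient).
observe = expected_penalty(p_readmit=0.0, p_death=0.10)

assert observe < admit  # the penalty structure rewards the riskier choice
```

Under any weighting where the readmission penalty dominates, the "rational" choice and the clinically safer choice diverge.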

Three camps now debate what happened. Khera et al. argue heart failure mortality was already rising before HRRP, with no significant change in slope. MedPAC's 2022-23 report concluded the program succeeded. A third analysis by Ody, Dafny, and Cutler found that 48% of the apparent readmission reduction was an artifact of coding changes — meaning HRRP may have had no meaningful effect on either readmissions or mortality. The debate is unresolved. The program continues.

In New York and Pennsylvania, Dranove et al. (Journal of Political Economy, 2003) documented the surgical version: mandatory surgeon-level mortality reporting for cardiac bypass caused hospitals to shift away from operating on the sickest patients, using less effective alternatives not covered by the report cards. Net result: higher resource use and worse health outcomes, concentrated among the patients who needed care most. Sixty-three percent of New York cardiologists reported increased difficulty finding surgeons willing to operate on severely ill patients.

Step Two: The Invisible

The VA hospital system tied manager bonuses to a 14-day appointment target. Official data showed average waits of 24 days. Veterans actually waited nearly four months. Staff at 65% of the nation's 216 VA facilities were instructed to falsify wait-time data — secret electronic waiting lists, requests entered and deleted, paper records shredded. The VA inspector general's office had issued 18 reports citing wait-time manipulation over nine years before the scandal broke publicly in 2014.

New York's CompStat system produced the same pathology in policing. Officer Adrian Schoolcraft recorded superiors demanding data manipulation. Felonies were downgraded to misdemeanors. Reports went unfiled. Victims were told "too late" or "nothing can be done." A 2014 study in Justice Quarterly found statistical evidence of systematic data manipulation. National homicide clearance rates fell from 72% in 1980 to 61% in 2024. Rape clearance: 49% to 27%. The metric improved. The clearance of actual crime did not.

Step Three: The Unmeasurable Is Unimportant

When Wells Fargo tied employee compensation to a cross-selling target of eight accounts per customer, the metric optimized beautifully — 5.9 accounts per customer in 2011, 6.1 by 2015. What couldn't be measured — customer trust, employee wellbeing, institutional integrity — was treated as immaterial. Over five years, employees created 3.5 million unauthorized accounts. The total revenue from the fraud: $5 million. The total in fines and settlements: over $3 billion.

Employees described frequent crying, panic attacks, vomiting from stress. Calls to the ethics hotline resulted in termination of the caller. The bank fired 5,300 employees for unauthorized accounts between 2011 and 2016 — it knew, and chose to fire individuals rather than fix the system. As of 2025, employees report that management now calls sales goals "Behaviors" and "Outcomes." The metric was renamed. The pressure was not.

In education, the National Research Council's nine-year study concluded that high-stakes standardized testing "yielded little learning progress but caused significant harm." What couldn't be measured — curriculum breadth, intrinsic motivation, deep understanding — was eliminated to make room for test preparation. David Berliner called the result "apartheid education": affluent students got richer curricula while disadvantaged students got test prep. The metric was supposed to close the equity gap. It widened it.

In universities, Columbia rose to #2 in U.S. News rankings on self-reported data that mathematician Michael Thaddeus proved was faulty. On accurate data, it dropped to #18. At Temple, the former dean of the Fox School of Business was convicted of fraud after fabricating GMAT scores to reach #1 in online MBA rankings. The ranking was the product. The education was unmeasured.

Step Four: The Erasure

Vietnam was the terminal case. McNamara imposed enemy body count as the primary metric of progress. The optimization was extraordinary. Official Department of Defense figures: 950,765 communist forces killed in action between 1965 and 1974. The department's own internal assessment: the figures were inflated by roughly 30%. Historian Guenter Lewy calculated that about one-third of reported "enemy KIA" were actually civilians — approximately 220,000 misclassified civilian deaths.

Gaming was systemic. Body-part arithmetic: four limbs equaled four kills. Inter-unit competitions awarded prizes for the highest counts. Free-fire zones classified all casualties as enemy. At the Battle of Dak To, after losing 78 men and finding only 10 enemy bodies, Westmoreland's command reported the count as 475.

When Douglas Kinnard surveyed 173 generals who had commanded in Vietnam, 61% said the body count was "often inflated." Only 2% considered it a valid measure of progress.

What couldn't be measured — political legitimacy, popular sentiment, the enemy's willingness to absorb unlimited losses for a cause they believed in — was erased from the system. CIA analyst Sam Adams found that MACV was estimating 280,000 enemy fighters when the real figure exceeded 600,000. Westmoreland forbade the CIA from publishing any count over 399,000, because acknowledging more fighters would have revealed the insurgency's popular support — a fact the metric system had no field for.

"The body count was the most corrupt — and corrupting — measure of progress."

— Lewis Sorley, military historian

The Evidence Across Eight Domains

| Domain | Metric | Metric improved | Outcome degraded |
| --- | --- | --- | --- |
| Hospital readmissions | 30-day readmission rate | 20.0% → 18.4% | HF mortality 7.2% → 8.6%; est. 10,000 excess deaths/yr |
| VA wait times | 14-day appointment target | Official avg: 24 days | Actual avg ~115 days; 65% of facilities falsified data |
| Wells Fargo | Accounts per customer | 5.9 → 6.1 | 3.5M fake accounts; $3B+ fines from $5M revenue |
| Education | Standardized test scores | Scores improved | Curriculum narrowed; equity gap widened |
| University rankings | U.S. News self-reported data | Columbia: #2 | On real data: #18; Temple dean convicted of fraud |
| Policing (CompStat) | Reported crime statistics | Reported crime ↓ | Clearance rates collapsed; homicide 72% → 61% |
| Surgical report cards | Surgeon mortality rates | Published mortality ↓ | Sickest patients denied surgery (Dranove et al. 2003) |
| Vietnam | Enemy body count | 950,765 "enemy KIA" | ~220K civilian deaths miscounted; war lost |

When Can Metrics Work?

The nihilist reading is that all metrics fail. The evidence says otherwise — but only under specific structural conditions.

After the Bristol Royal Infirmary scandal (1984-1995), where 30-35 excess pediatric cardiac deaths went undetected for a decade, the UK mandated surgeon-level mortality reporting for cardiac surgery. Unlike the NY/PA report cards, the UK system continuously improved its risk adjustment to account for patient complexity — and gave clinicians professional ownership of the data rather than attaching punitive consequences. Result: a genuine 25% reduction in mortality without the catastrophic cherry-picking seen in New York.

The structural variable is what I'll call Goodhart resistance — the degree to which faking the metric costs more than genuinely improving the outcome:

| Verdict | Case | Structural features |
| --- | --- | --- |
| Reform worked | UK cardiac surgery | Hard endpoint (death); professional ownership; evolving risk adjustment |
| Partial reform | VA wait times | Decoupled from bonuses; expanded capacity instead |
| Unproven | Education (ESSA) | Multi-measure dashboards; too early to assess |
| Failed | ESG scores | No agreed definition; subjective ratings; massive gaming incentives |

The harder the endpoint, the more reformable the system. Death is hard to fake. Wait times are easy. ESG impact has no agreed-upon definition. Goodhart resistance is not binary — it exists on a spectrum, and the position on that spectrum is predictable before the system is implemented.
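One way to make that spectrum concrete is to treat Goodhart resistance as a cost ratio: what faking the metric costs relative to genuinely improving the outcome. The cases below reuse the article's examples, but the cost figures are illustrative guesses, not measurements.

```python
# Sketch of "Goodhart resistance" as a ratio of costs. A ratio above 1.0
# means faking the metric is more expensive than real improvement, so the
# metric is reformable; below 1.0, gaming is the rational strategy.
# Cost figures are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class MetricSystem:
    name: str
    cost_to_fake: float     # effort/risk required to game the metric
    cost_to_improve: float  # effort required to move the real outcome

    @property
    def goodhart_resistance(self) -> float:
        return self.cost_to_fake / self.cost_to_improve

systems = [
    MetricSystem("UK cardiac mortality", cost_to_fake=50.0, cost_to_improve=10.0),
    MetricSystem("VA wait times", cost_to_fake=1.0, cost_to_improve=20.0),
    MetricSystem("ESG scores", cost_to_fake=0.5, cost_to_improve=30.0),
]

for s in sorted(systems, key=lambda s: s.goodhart_resistance, reverse=True):
    verdict = "reformable" if s.goodhart_resistance > 1.0 else "will be gamed"
    print(f"{s.name}: resistance={s.goodhart_resistance:.2f} ({verdict})")
```

Death is expensive to fake, so the ratio lands above 1.0; a wait-time log entry is nearly free to fake, so it lands far below. The useful property of framing it this way is that the ratio can be estimated before a metric system is deployed.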

Goodhart published his law in 1975; Campbell, independently, in 1976. Jerry Muller's The Tyranny of Metrics documented the pathology across institutions in 2018. Every domain that attached consequences to a quantitative metric without designing for Goodhart resistance reproduced the same failure — not because the warnings didn't exist, but because the institutions implementing metrics didn't read them.

The mechanism isn't that metrics always fail. It's that attaching consequences to metrics without structural safeguards predictably destroys the underlying outcome — and the destruction was predicted half a century ago by people whose work is freely available and almost universally unread by the people designing the systems.

"Rationality will not save us."

— Robert McNamara, The Fog of War, 2003