Case File
In November 2023, Nature Human Behaviour published a paper titled "High replicability of newly discovered social-behavioural findings is achievable." The paper claimed that when researchers followed rigour-enhancing practices — preregistration, large samples, methodological transparency — replication rates jumped to 86%. It was celebrated as proof that the replication crisis had a solution.
Among the authors was Brian Nosek, executive director of the Center for Open Science, the person who had done more than anyone alive to build the infrastructure for preregistration and open science reform.
In October 2024, the paper was retracted. The reason: it was not preregistered. The authors had selected outcome measures with knowledge of the data. The paper proving the cure worked had failed to follow the cure.
This is not a story about a field that doesn't know it's broken. Psychology knows exactly how it's broken. It has quantified the failure rate. It has invented the fix. It has proven the fix works. And after a decade of institutional reform, 93% of the field still doesn't use it.
Mechanism #18 in the Taxonomy of Knowledge Failure: Diagnosed Paralysis — when a system achieves complete self-knowledge of its failure and still cannot change.
The Diagnosis
Psychology's self-examination has been ruthless. The numbers it generated about its own dysfunction are among the most replicated findings in all of meta-science.
- **96%** of standard psychology papers report positive, statistically significant results.
- **44%** of Registered Reports — where journals commit before results are known — report positive results.

Same source, same field, different incentive structure.
That 52-percentage-point gap is not subtle. It implies roughly half of all positive findings in the standard literature are artifacts of selective reporting — researchers running multiple analyses, dropping inconvenient conditions, reframing hypotheses after seeing the data. The field's own diagnostic tools proved this. The Reproducibility Project in 2015 attempted to replicate 100 studies from top journals. Only 36% replicated. Effect sizes averaged half their original magnitude. Social psychology replicated at just 25%.
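The "roughly half" figure is simple arithmetic. A back-of-envelope sketch, assuming the Registered Reports rate (44%) approximates the true positive rate and the standard literature's 96% is inflated by selective reporting:

```python
# Share of positive results in each publication format (from the figures above).
standard_positive = 0.96  # standard psychology papers
rr_positive = 0.44        # Registered Reports (results unknown at acceptance)

# The gap is the excess of positives attributable to selective reporting.
gap = standard_positive - rr_positive            # 0.52, i.e. 52 percentage points

# Of all positives in the standard literature, the fraction that are artifacts:
artifact_share = gap / standard_positive         # ~0.54

print(f"Gap: {gap:.0%}")                          # Gap: 52%
print(f"Implied artifact share: {artifact_share:.0%}")  # Implied artifact share: 54%
```

Under that assumption, a bit more than half of the standard literature's positive findings are artifacts of selective reporting, which is what the text summarizes as "roughly half."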
This has been known, in progressively sharper resolution, for decades. Sterling et al. measured 95.6% positive results in 1995. Fanelli found 91.5% in 2010. Psychology didn't discover it had a problem in 2015. It confirmed, with great fanfare, what it had already documented and ignored.
The Casualties
These are not obscure findings. They were bestselling books, TED talks, corporate training programs, and undergraduate textbooks.
| Finding | Cultural Reach | Replication |
|---|---|---|
| Ego depletion — willpower is a depletable resource | 3,000+ citations. Foundation of self-control research for a decade. | Near-zero effect in Vohs et al. 2021 multisite test |
| Power posing — standing in an expansive pose raises testosterone, lowers cortisol | 70M+ views on Amy Cuddy's TED talk. Corporate workshops worldwide. | No hormonal effects in Ranehill et al. 2015 |
| Social priming — reading "elderly" words makes you walk slower | Cornerstone of Kahneman's Thinking, Fast and Slow. He later called it a "train wreck." | Failed in Doyen et al. 2012. Kahneman acknowledged the failure publicly. |
| Growth mindset — believing intelligence is malleable improves academic outcomes | School curricula in 40+ countries. Multi-million-dollar training industry. | Effect exists but is ~0.05 SD — negligible in practice |
Only 12% of post-failure citations acknowledge that these findings didn't replicate. The other 88% cite them as if they still hold. The textbooks update slowly, when they update at all.
The Proven Fix
Registered Reports — where journals review and accept papers before data collection — eliminate publication bias almost entirely. Over 300 journals have adopted the format. When used, positive result rates drop from 96% to 44%, meaning the remaining positive findings are far more likely to be real. Preregistration, where researchers commit to methods and analysis plans in advance, has infrastructure through the Open Science Framework, with over 12,000 registrations per year by 2017.
The cure works. This is not contested. When researchers precommit to their analysis, the selective reporting that inflates positive results disappears. The question is not whether the cure works. It is why almost no one takes it.
The Paralysis
Here is what a decade of reform has produced:
- **7% of psychology studies are preregistered** (Hardwicke et al. 2024), up from ~3% in 2014.
- Adoption runs higher at top journals: better, but still not a majority even there.
- **1 of 244 preregistered studies fully adhered to its plan** (Crede & Sotola 2024). Most deviations went undisclosed. Preregistration without compliance is theater.
The adoption curve is not exponential. It is not even linear. From 3% to 7% in eight years. At this rate, majority adoption is decades away — if it comes at all.
Why Knowing Doesn't Help
The incentive structure that created the crisis is the same one that blocks the cure.
Papers that fail to replicate receive 153 more citations than papers that do replicate. Not 153% more: one hundred and fifty-three additional citations, roughly 16 extra citations per year. A researcher who preregisters, runs a large sample, and reports a null result does the right thing. They also publish a paper that nobody cites, that doesn't help their tenure case, that doesn't generate media coverage, and that doesn't lead to speaking invitations or grant renewals.
A 2025 analysis of 240,355 psychology articles found that researchers at the highest-ranked universities publish articles with the weakest p-values — statistical evidence clustered just below the significance threshold. The career ladder rewards fragile findings. Robust methods are individually irrational — they produce less publishable, less citable, less career-advancing work.
The field is not failing because it lacks self-awareness. It is failing because the incentive structure that created the problem also makes the solution a career risk. Everyone knows the building is on fire. The exits are marked. But the fire is warm and the exits lead to the cold.
The Mechanism
Mechanism #18: Diagnosed Paralysis
When a knowledge system correctly identifies its own failure mode, develops a proven cure, and cannot adopt the cure at scale because the incentive structure that caused the failure also makes the cure individually irrational.
This is different from a field that doesn't know it's broken (Mechanism #3). Different from a field where the cure and the disease are the same technology (Mechanism #12). Here, the system has achieved complete self-knowledge. It can quantify its own failure rate to the decimal. It has built the tools. It has proven they work. And it still can't stop — because individual incentives overwhelm collective rationality.
Psychology is not unique in this. It is simply the field that has generated the most precise documentation of its own ongoing dysfunction. The question it answers is not "Can science self-correct?" but something more uncomfortable: What happens when a system diagnoses its disease, proves the cure, and the disease is more rewarding than the cure?
The answer, so far, is that it publishes papers about the problem. Those papers get cited. And the crisis continues.