Rigor, Reproducibility, and Trust in Science
A deep dive into Gold Standard Science and the NIH perspective
As an epidemiologist, I live in a world obsessed with bias. We are trained to ask what could be wrong, what might have slipped through, and how results could be misleading. It makes us both the best and the worst co-authors. We insist on a long limitations section that catalogues every possible flaw. Rigor and reproducibility were drilled into me from the beginning, and at my center, rigor and high-quality work are values we’ve always upheld. To me, this is part of the fabric of science.
When you write NIH grants, rigor and reproducibility are key pieces of the application. Reviewers scrutinize them, and plenty of grants get dinged because the case isn’t strong enough or simply not described well. At study section, it’s always a point of discussion. Of course, bias still creeps in, sometimes through deference to “famous” scientists (see my earlier post on The Amplifier Model), but generally, the expectation of rigor is built into the system.
So where is all this renewed focus on rigor and reproducibility at NIH coming from?
A Political Backdrop
In Bhattacharya's interviews and NIH's new Gold Standard Science framework, the emphasis is clear: more retractions, irreproducible findings in high-profile journals, and declining public trust in science.
The Trump administration framed it this way: “We must restore the American people’s faith in the scientific enterprise and institutions that create and apply scientific knowledge in service of the public good.”
I won’t dive into the politics here. But I agree that public trust in science is shaken, and in this moment, perception is reality.
Retractions and Reproducibility
Let’s look at the reality. Retractions are increasing. A few recent papers highlight the trends:
Who retracts? Scientists with younger publication age, higher self-citation rates, and larger publication volumes.
How common? Roughly 4% of top-cited scientists have at least one retraction. More than half are in medicine and life sciences, with especially high rates in alternative medicine, cancer, and pharmacology.
Where? Most retractions come from China and other developing countries…relatively few from the U.S.
Figures are from this paper: https://www.nature.com/articles/d41586-025-00455-y
And what about reproducibility? A Science blog analyzed replication efforts in cancer biology:
Effect sizes were, on average, 85% smaller than originally reported.
Negative results replicated 80% of the time, but positive results only 40%.
Out of 53 studies, only 5 were fully reproducible.
These aren’t abstract problems. When findings can’t be reproduced, they send entire fields down the wrong track.
Why Findings Fail to Reproduce
A Nature blog outlined six major factors:
Limited access to data, methods, materials
Problems with biological materials (misidentified or contaminated cell lines)
Challenges with complex datasets
Poor research practices and design
Cognitive bias (confirmation, reporting, selection bias)
Competitive research culture (incentives for novelty over rigor)
None of this will surprise practicing scientists, but codifying it helps frame solutions.
Side note: I’ll never forget one project where an extra minus sign in the code multiplied a variable by –1. I checked the code repeatedly, had a co-author check too, and still missed it until late in the review process. It could have been my first retraction, all over a stray keystroke. Luckily, we caught it in time. But it’s a reminder: responsibility is real, and even small errors can have a significant impact on results.
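A minimal sketch of how a bug like this hides, with made-up variable names (not the actual project code): a single stray "-" reverses the direction of every estimate, and a hand-computed sanity check is often the only thing that catches it.

```python
# Hypothetical illustration of a sign-flip bug and the sanity check that catches it.
def weight_change(baseline_kg, followup_kg):
    """Change from baseline; negative means weight loss."""
    # The buggy version once read: return -(followup_kg - baseline_kg)
    # One extra keystroke flips the sign of every result in the analysis.
    return followup_kg - baseline_kg

# Sanity checks against hand-computed cases would flag the flipped version:
assert weight_change(80.0, 75.0) == -5.0  # losing weight must be negative
assert weight_change(70.0, 72.0) == 2.0   # gaining weight must be positive
```

A couple of assertions like these, run automatically with the analysis, turn a silent sign error into a loud failure.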
The NIH Response: Gold Standard Science
The new NIH document is dense: 16 pages packed with links, programs, and initiatives. It reads like someone just cleaned out a messy closet: lots of organizing, clarifying what NIH has been doing, and adding structure and new programs.
Nine pillars of Gold Standard Science:
Reproducible
Transparent
Communicative of error and uncertainty
Collaborative and interdisciplinary
Skeptical of findings and assumptions
Structured for falsifiability
Subject to unbiased peer review
Accepting of negative results
Without conflicts of interest
To me, these read like an epidemiologist’s to-do list. Logical, foundational, and familiar.
New and notable initiatives (find links to these initiatives in the PDF document below):
Simplified peer review – these criteria changed in January 2025 (“rigor and feasibility” is one of three pillars of the review now)
Transparency push – new public access rules, which went live July 1, 2025, make publications viewable by the public immediately upon publication
Replication Initiative – targeted funding will be set aside for replication studies (Request for Applications on the website)
RIGOR program for dietary supplements – the first of a series of planned programs to create guidance and training resources on these topics
ARRIVE guidelines – This is a checklist for publications (and grants) for animal/in vivo experiments (we have had this in epidemiology and clinical trials for almost two decades – all of my trainees are very aware of the STROBE guidelines)
Protecting academic freedom – This enhances the focus on the importance of academic freedom and extends the policy to intramural research at NIH (the research done internally at NIH). It also narrows NIH’s review of the scientific content researchers produce to a regulatory review only.
Team science – Enhanced focus on team science, with props given to the Accelerating Medicines Partnership (AMP), a collaboration between NIH and the Foundation for the NIH that brings in industry support for foundational studies in chronic diseases (I’m an AMP investigator, so I was excited to see this!). The goal is to do more work like this, including more cross-HHS collaborations, and the document describes some that are kicking off: real-world data network development (mostly in autism right now, a partnership with CMS), a nutrition science program, and human-based research technologies.
Science of Science Scholars Program – This is a new program to allow scientists to utilize NIH administrative data for rigorous policy analyses (Request for Applications on the website)
Preprint pilot – encouraging preprints, including negative results, to be posted so that they can be searched and identified as a part of study planning.
Modernized and agreed-upon terminology – A new NIH/FDA glossary for clinical research terms was published, allowing study design, analysis, and other study-related terms to be used consistently across all groups.
Conflict of interest – new disclosures and training for NIH staff and researchers
This is a massive amount of work already and it creates a framework for where the NIH is headed.
Lessons from Industry
One observation: when working with industry, their rigor in documenting every step is far greater than in academia. It’s maddening at times… endless questions, delays, paperwork … but it forces assumptions to be tested constantly. In academia, we rely more on trust, particularly with trainees, which leaves space for errors to creep in. More documentation could feel heavy, but it would also improve reproducibility and even strengthen teaching.
The Costs
Data sharing, standardization, replication, and all the other pieces outlined cost money. Prepping data for sharing requires people and resources. NIH has not hinted at increases to grant budgets … and at the same time is adjusting indirects downward. That tension is real and unsolved at the moment.
Where Do We Go From Here?
The NIH says it wants to build a “culture of constructive skepticism.” I agree. The actual principles aren’t new, but the emphasis is worth paying attention to.
For me, the takeaways are:
We need to communicate better with the public about rigor and reproducibility.
We should keep training students to live these principles.
We could adapt some practices from industry to strengthen rigor and transparency.
NIH is moving quickly with lots of changes so I’m sure there’s more to discuss!
At the end of the day, skepticism is healthy and necessary. Science is stronger when we’re skeptical and ask probing questions.
Overall, I’m hopeful this suggests that we will keep moving forward and strengthening science at the NIH. (And more headlines already to address for another time…)
References:
Leading in Gold Standard Science: an NIH Implementation Plan (“the PDF” mentioned above): www.nih.gov/sites/default/files/2025-08/2025-gss.pdf
White House Announcement about Gold Standard Science: https://www.whitehouse.gov/presidential-actions/2025/05/restoring-gold-standard-science/
Majority of retractions are from high impact journals (an oldie but goodie): https://pmc.ncbi.nlm.nih.gov/articles/PMC3187237/
Record number of retractions in 2023: https://www.nature.com/articles/d41586-023-03974-8
Who is retracting their papers?: https://pmc.ncbi.nlm.nih.gov/articles/PMC11781634/
Nature analysis of retractions including the two figures above: https://www.nature.com/articles/d41586-025-00455-y
How much of that great paper is real? Science Blog https://www.science.org/content/blog-post/how-much-great-new-paper-real
Factors affecting reproducibility: https://www.nature.com/articles/d42473-019-00004-y
Common Fund Replication Initiative: https://commonfund.nih.gov/replication-initiative
Modernizing Research and Evidence Consensus Definitions: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2835400
ARRIVE guidelines: https://arriveguidelines.org/arrive-guidelines
STROBE guidelines and all of the others that have been around forever: https://www.equator-network.org/reporting-guidelines/strobe/
Promoting academic freedom: https://www.nih.gov/about-nih/nih-director/statements/nih-reviews-policies-promote-academic-freedom
The Supreme Court ruling last week:
https://www.nature.com/articles/d41586-025-02721-5
https://www.scotusblog.com/2025/08/supreme-court-allows-trump-administration-to-terminate-783-million-in-nih-grants-linked-to-dei-initiatives/
My disclosures:
I am an academic rheumatologist, epidemiologist, and mom. My research is funded by the NIH, private foundations, pharmaceutical companies, and philanthropy. I consult for and work with pharmaceutical companies in my research. I am co-founder of a non-profit organization and CEO/founder of Research Pathfinder, LLC. My thoughts are my own and not reflective of my employer.