Awesome Reproducible Research

<meta property="og:image" content="https://github.com/leipzig/awesome-reproducible-research/blob/master/rrlogo2.png?raw=true"/> <img src="https://github.com/leipzig/awesome-reproducible-research/blob/master/rrlogo2tr.png?raw=true" align="right" width="200">

A curated list of reproducible research case studies, projects, tutorials, and media

Contents

Case studies

The term "case studies" is used here in a general sense to describe any study of reproducibility. A reproduction is an attempt to arrive at comparable results with identical data, using the computational methods described in a paper. A refactor restructures the original code into workflow frameworks and other reproducibility best practices while preserving the original data. A replication involves generating new data and applying existing methods to achieve comparable results. A robustness test applies various protocols, workflows, statistical models, or parameters to a given data set to study their effect on results, either as a follow-up to an existing study or as a "bake-off". A census is a high-level tabulation conducted by a third party. A survey is a questionnaire sent to practitioners. A case narrative is an in-depth first-person account. An independent discussion enlists a secondary, independent author to interpret the results of a study as a means of improving inferential reproducibility.

<table id="case_studies"> <tbody> <tr> <td> <p> Study <meta property="embeddedDate"> </p> </td> <td> <p> Field </p> </td> <td> <p> Approach </p> </td> <td> <p> Size </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1136/bmj.39590.732037.47"><span title="What is missing from descriptions of treatment in trials and reviews?">Glasziou et al <meta property="datePublished" content="2008-06-26">2008</span></a> </p> </td> <td> <p> Medicine </p> </td> <td> <p> Census </p> </td> <td> <p> 80 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1214/09-AOAS291"><span title="DERIVING CHEMOSENSITIVITY FROM CELL LINES: FORENSIC BIOINFORMATICS AND REPRODUCIBLE RESEARCH IN HIGH-THROUGHPUT BIOLOGY">Baggerly &amp; Coombes <meta property="datePublished" content="2009-09-01">2009</span></a> </p> </td> <td> <p> Cancer biology </p> </td> <td> <p> Refactor </p> </td> <td> <p> 8 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1002/bimj.200900154"><span title="Biometrical Journal and Reproducible Research">Hothorn et al. <meta property="datePublished" content="2009-08-17">2009</span></a> </p> </td> <td> <p> Biostatistics </p> </td> <td> <p> Census </p> </td> <td> <p> 56 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/ng.295"><span title="Repeatability of published microarray gene expression analyses">Ioannidis et al <meta property="datePublished" content="2009-01-28">2009</span></a> </p> </td> <td> <p> Genetics </p> </td> <td> <p> Reproduction </p> </td> <td> <p> 18 studies </p> </td> </tr> <tr> <td> <p> <a href="http://www.uio.no/studier/emner/matnat/ifi/INF5700/h11/undervisningsmateriale/Anda.Sj%C3%B8berg.Mockus.TSE.May.2009.pdf">Anda et al <meta property="datePublished" content="2009-07-30">2009</a> </p> </td> <td> <p> Software engineering </p> </td> <td> <p> Replication </p> </td> <td> <p> 4 companies </p> </td> </tr> <tr> <td> <p> <a href="https://infoscience.epfl.ch/record/136640/files/VandewalleKV09.pdf">Vandewalle et al <meta property="datePublished" content="2009-04-22">2009</a> </p> </td> <td> <p> Signal processing </p> </td> <td> <p> Census </p> </td> <td> <p> 134 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/nrd3439-c1">Prinz <meta property="datePublished" content="2011-08-31">2011</a> </p> </td> <td> <p> Biomedical sciences </p> </td> <td> <p> Survey </p> </td> <td> <p> 23 PIs </p> </td> </tr> <tr> <td> <p> <a href="https://academic.oup.com/bib/article/12/3/288/258098/Case-studies-in-reproducibility">Hothorn &amp; Leisch <meta property="datePublished" content="2011-01-28">2011</a> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> Census </p> </td> <td> <p> 100 studies </p> </td> </tr> <tr> <td> <p> <a href="https://www.nature.com/nature/journal/v483/n7391/full/483531a.html">Begley &amp; Ellis <meta property="datePublished" content="2012-03-29">2012</a> </p> </td> <td> <p> Cancer biology </p> </td> <td> <p> Replication </p> </td> <td> <p> 53 studies </p> </td> </tr> <tr> <td> <p> <a href="http://reproducibility.cs.arizona.edu/tr.pdf">Collberg et al <meta property="datePublished" content="2014-03-21">2014</a><br/><a href="https://sci-hub.se/10.1145/2812803">Collberg &amp; Proebsting 2016</a> </p> </td> <td> <p> Computer science </p> </td> <td> <p> Census </p> </td> <td> <p> 613 papers </p> </td> </tr> <tr> <td> <p> <a href="http://science.sciencemag.org/content/349/6251/aac4716">OSC <meta property="datePublished" content="2015-08-28">2015</a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> Replication 
</p> </td> <td> <p> 100 studies </p> </td> </tr> <tr> <td> <p> <a href="https://f1000research.com/articles/4-134/v2">Bandrowski et al <meta property="datePublished" content="2015-05-29">2015</a> </p> </td> <td> <p> Biomedical sciences </p> </td> <td> <p> Census </p> </td> <td> <p> 100 papers </p> </td> </tr> <tr> <td> <p> <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4555355/">Patel et al <meta property="datePublished" content="2015-06-06">2015</a> </p> </td> <td> <p> Epidemiology </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 417 variables </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.17016/FEDS.2015.083">Chang et al <meta property="datePublished" content="2015-10-05">2015</a> </p> </td> <td> <p> <span title="Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say &quot;Usually Not&quot;">Economics</span> </p> </td> <td> <p> Reproduction </p> </td> <td> <p> 67 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.1002333">Iqbal et al <meta property="datePublished" content="2016-01-04">2016</a> </p> </td> <td> <p> <span title="Reproducible Research Practices and Transparency across the Biomedical Literature">Biomedical sciences</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 441 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/533452a"><span title="1,500 scientists lift the lid on reproducibility">Baker <meta property="datePublished" content="2016-05-26">2016</span></a> </p> </td> <td> <p> Science </p> </td> <td> <p> <span title="More than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. Those are some of the telling figures that emerged from Nature's survey of 1,576 researchers who took a brief online questionnaire on reproducibility in research.">Survey</span> </p> </td> <td> <p> 1,576 researchers </p> </td> </tr> <tr> <td> <p> <a href="https://pdfs.semanticscholar.org/edd7/e68711955cbbdb6dd6866db2ec8a6ff18278.pdf">Névéol et al <meta property="datePublished" content="2016-11-05">2016</a> </p> </td> <td> <p> NLP </p> </td> <td> <p> Replication </p> </td> <td> <p> 3 studies </p> </td> </tr> <tr> <td> <p> <a href="https://elifesciences.org/articles/23383#abstract">Reproducibility Project <meta property="datePublished" content="2017-01-19">2017</a> </p> </td> <td> <p> Cancer biology </p> </td> <td> <p> Replication </p> </td> <td> <p> 9 studies </p> </td> </tr> <tr> <td> <p> <a href="https://peerj.com/articles/3208/">Vasilevsky et al <meta property="datePublished" content="2017-04-25">2017</a> </p> </td> <td> <p> Biomedical sciences </p> </td> <td> <p> Census </p> </td> <td> <p> 318 journals </p> </td> </tr> <tr> <td> <p> <a href="http://www.practicereproducibleresearch.org/">Kitzes et al <meta property="datePublished" content="2017-10-17">2017</a> </p> </td> <td> <p> Science </p> </td> <td> <p> Case narrative </p> </td> <td> <p> 31 PIs </p> </td> </tr> <tr> <td> <p> <a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005755">Barone et al <meta property="datePublished" content="2017-10-19">2017</a> </p> </td> <td> <p> Biological sciences <p> </td> <td> <p> Survey </p> </td> <td> <p> 704 PIs </p> </td> </tr> <tr> <td> <p> <a href="https://www.biorxiv.org/content/biorxiv/early/2017/10/31/143503.full.pdf?%3Fcollection=">Kim &amp; Dumas <meta property="datePublished" content="2017-10-31">2017</a> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> 
Refactor </p> </td> <td> <p> 1 study </p> </td> </tr> <tr> <td> <p> <a href="http://science.sciencemag.org/content/351/6280/1433">Camerer <meta property="datePublished" content="2016-03-25">2016</a> </p> </td> <td> <p> Economics </p> </td> <td> <p> Replication </p> </td> <td> <p> 18 studies </p> </td> </tr> <tr> <td> <p> <a href="https://openreview.net/pdf?id=By4l2PbQ-">Olorisade <meta property="datePublished" content="2017-08-06">2017</a> </p> </td> <td> <p> Machine learning </p> </td> <td> <p> Census </p> </td> <td> <p> 30 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1515/opar-2017-0019">Strupler &amp; Wilkinson <meta property="datePublished" content="2017-11-14">2017</a> </p> </td> <td> <p> Archaeology </p> </td> <td> <p> Case narrative </p> </td> <td> <p> 1 survey </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/1801.05042">Danchev et al <meta property="datePublished" content="2018-01-15">2018</a> </p> </td> <td> <p> Comparative toxicogenomics </p> </td> <td> <p> Census </p> </td> <td> <p> 51,292 claims in 3,363 papers </p> </td> </tr> <tr> <td> <p> <a href="https://www.researchgate.net/publication/326450530_State_of_the_Art_Reproducibility_in_Artificial_Intelligence">Gundersen &amp; Kjensmo <meta property="datePublished" content="2018-04-25">2018</a> </p> </td> <td> <p> Artificial intelligence </p> </td> <td> <p> Census </p> </td> <td> <p> 400 papers </p> </td> </tr> <tr> <td> <p> <a href="https://www.nature.com/articles/d41586-018-02108-9">Gertler et al <meta property="datePublished" content="2018-02-21">2018</a> </p> </td> <td> <p> Economics </p> </td> <td> <p> Census </p> </td> <td> <p> 203 papers </p> </td> </tr> <tr> <td> <p> <a href="http://www.pnas.org/content/115/11/2584">Stodden et al <meta property="datePublished" content="2018-03-13">2018</a> </p> </td> <td> <p> Computational science </p> </td> <td> <p> Reproduction </p> </td> <td> <p> 204 papers, 180 authors </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41562-018-0399-z">Camerer et al <meta property="datePublished" content="2018-08-27">2018</a> </p> </td> <td> <p> Social sciences </p> </td> <td> <p> Replication </p> </td> <td> <p> 21 papers </p> </td> </tr> <tr> <td> <p> <a href="http://doi.org/10.1177/2515245917747646">Silberzahn et al <meta property="datePublished" content="2018-08-23">2018</a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> Robustness test </p> </td> <td> <p> One data set, 29 analyst teams </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1002/bimj.201700243">Boulesteix et al <meta property="datePublished" content="2018-08-01">2018</a> </p> </td> <td> <p> Medicine and health sciences </p> </td> <td> <p> Census </p> </td> <td> <p> 30 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.7554/eLife.34364">Eaton et al <meta property="datePublished" content="2018-10-03">2018</a> </p> </td> <td> <p> Microbiome immuno oncology </p> </td> <td> <p> Replication </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p> <a href="https://www.biorxiv.org/content/early/2018/11/08/463927">Vaquero-Garcia et al <meta property="datePublished" content="2018-11-08">2018</a> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> Refactor and test of robustness </p> </td> <td> <p> 1 paper </p> </td> 
</tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.2006930">Wallach et al <meta property="datePublished" content="2018-11-20">2018</a> </p> </td> <td> <p> Biomedical Sciences </p> </td> <td> <p> Census </p> </td> <td> <p> 149 papers </p> </td> </tr> <tr> <td> <p> <a href="https://broad.io/ASHG2018">Miller et al <meta property="datePublished" content="2018-10-18">2018</a> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> Synthetic replication & refactor </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1080/13658816.2018.1508687">Konkol et al <meta property="datePublished" content="2018-04-09">2018</a> </p> </td> <td> <p> Geosciences </p> </td> <td> <p> Survey, Reproduction </p> </td> <td> <p> 146 scientists, 41 papers </p> </td> </tr> <tr> <td> <p> <a href="http://amid.fish/reproducing-deep-rl">Rahtz <meta property="datePublished" content="2018-04-06">2018</a> </p> </td> <td> <p> Reinforcement Learning </p> </td> <td> <p> Reproduction, case narrative </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1145/3214239.3214242">Stodden et al <meta property="datePublished" content="2018-06-12">2018</a> </p> </td> <td> <p> <span title="Enabling the Verification of Computational Results">Computational physics</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 306 papers </p> </td> </tr> <tr> <td> <p> <a href="https://peerj.com/articles/cs-163/">AlNoamany & Borghi <meta property="datePublished" content="2018-09-17">2018</a> </p> </td> <td> <p> Science & Engineering </p> </td> <td> <p> Survey </p> </td> <td> <p> 215 participants </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1155/2018/4789035">Li et al <meta property="datePublished" content="2018-09-27">2018</a> </p> </td> <td> <p> Nephrology </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p><a href="https://dash.harvard.edu/bitstream/handle/1/38811561/CHEN-SENIORTHESIS-2018.pdf?sequence=3">Chen <meta property="datePublished" content="2018-06-29">2018</a> </p> </td> <td> <p> Social sciences & other </p> </td> <td> <p> Census </p> </td> <td> <p> 810 Dataverse studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41597-022-01143-6">Trisovic et al <meta property="datePublished" content="2021-03-23">2021</a> </p> </td> <td> <p> <span title="A large-scale study on research code quality and execution">Social sciences & other</span> </p> </td> <td> <p> Census, Survey </p> </td> <td> <p> 2109 replication datasets </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.7717/peerj.5072">Nüst et al <meta property="datePublished" content="2018-07-13">2018</a> </p> </td> <td> <p> <span title="Reproducible research and GIScience: an evaluation using AGILE conference papers">GIScience/Geoinformatics</span> </p> </td> <td> <p> Census, Survey </p> </td> <td> <p> 32 papers, 22 participants </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41467-018-05627-1">Raman et al <meta property="datePublished" content="2018-08-07">2018</a> </p> </td> <td> <p> <span title="Apparent bias toward long gene misregulation in MeCP2 syndromes disappears after controlling for baseline variations">Genomics</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 8 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/sdata.2019.30">Stagge et al <meta property="datePublished" content="2019-02-26">2019</a> </p> </td> <td> <p> Geosciences </p> </td> <td> <p> Survey </p> </td> <td> 
<p> 360 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pcbi.1006269">Bizzego et al <meta property="datePublished" content="2019-03-27">2019</a> </p> </td> <td> <p> Deep learning </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 1 analysis </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pone.0213013">Madduri et al <meta property="datePublished" content="2019-04-11">2019</a> </p> </td> <td> <p> Genomics </p> </td> <td> <p> Case narrative </p> </td> <td> <p> 1 analysis </p> </td> </tr> <tr> <td> <p> <span title="Creating reproducible pharmacogenomic analysis pipelines"><a href="10.1038/s41597-019-0174-7">Mammoliti et al <meta property="datePublished" content="2019-09-03">2019</a></span> </p> </td> <td> <p> Pharmacogenomics </p> </td> <td> <p> Case narrative </p> </td> <td> <p> 2 analyses </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1371/journal.pbio.3000246"><span title="Open science challenges, benefits and tips in early career and beyond">Allen & Mehler <meta property="datePublished" content="2019-05-01">2019</span></a> </p> </td> <td> <p> Biomedical sciences and Psychology </p> </td> <td> <p> Census </p> </td> <td> <p> 127 registered reports </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1109/MSR.2019.00077"><span title="A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks">Pimentel et al <meta property="datePublished" content="2019-05-07">2019</span></a> </p> </td> <td> <p> All </p> </td> <td> <p> Census </p> </td> <td> <p> 1,159,166 Jupyter notebooks </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1016/j.omto.2019.05.004"><span title="Assessing the Completeness of Reporting in Preclinical Oncolytic Virus Therapy Studies">Fergusson et al <meta property="datePublished" content="2019-05-20">2019</span></a> </p> </td> <td> <p> Virology </p> </td> <td> <p> Census </p> </td> <td> <p> 236 papers </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1016/j.bja.2019.01.012 "><span title="Hypnotic depth and postoperative death: a Bayesian perspective and an Independent Discussion of a clinical trial">Vlisides et al <meta property="datePublished" content="2019-01-22">2019</span></a><br/><a href="https://doi.org/10.1016/j.bja.2018.12.021"><span title="Depth of sedation as an interventional target to reduce postoperative delirium: mortality and functional outcomes of the Strategy to Reduce the Incidence of Postoperative Delirium in Elderly Patients randomised clinical trial">Sieber et al 2019</span></a> </p> </td> <td> <p> Anaesthesia </p> </td> <td> <p> Independent discussion </p> </td> <td> <p> 1 study </p> </td> </tr> <tr> <td> <p><a href="https://osf.io/7yt8u/ "><span title="Replication Oxley et al. (2008, Science)">Bakker et al <meta property="datePublished" content="2019-06-19">2019</span></a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> Replication </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1016/j.cels.2019.06.005 "><span title="A Multi-center Study on the Reproducibility of Drug-Response Assays in Mammalian Cell Lines">Niepel et al <meta property="datePublished" content="2019-07-10">2019</span></a> </p> </td> <td> <p> Cell pharmacology </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 5 labs </p> </td> </tr> <tr> <td> <p><a href="https://arxiv.org/abs/1907.06902v1"><span title=" Are We Really Making Much Progress? 
A Worrying Analysis of Recent Neural Recommendation Approaches">Dacrema et al <meta property="datePublished" content="2019-07-16">2019</span></a> </p> </td> <td> <p> Machine learning </p> </td> <td> <p> Reproduction </p> </td> <td> <p> 18 conference papers </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1016/j.jasrep.2019.102002"><span title="Experimental replication shows knives manufactured from frozen human feces do not work">Eran et al <meta property="datePublished" content="2019-10-09">2019</span></a> </p> </td> <td> <p> Experimental archaeology </p> </td> <td> <p> Replication </p> </td> <td> <p> 1 theory </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1101/763730"><span title="Reproducible and Transparent Research Practices in Published Neurology Research">Rauh et al <meta property="datePublished" content="2019-09-16">2019</span></a> </p> </td> <td> <p> Neurology </p> </td> <td> <p> Census </p> </td> <td> <p> 202 papers </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.31234/osf.io/dkg53"><span title="Failed pre-registered replication of mortality salience effects in traditional and novel measures">Sætrevik & Sjåstad <meta property="datePublished" content="2019-09-20">2019</span></a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> Replication </p> </td> <td> <p> 2 experiments </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1038/s41559-019-0972-5"><span title="A checklist for maximizing reproducibility of ecological niche models">Feng et al. <meta property="datePublished" content="2019-09-23">2019</span></a> </p> </td> <td> <p> Ecology and Evolution </p> </td> <td> <p> Census </p> </td> <td> <p> 163 papers </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1101/843193"><span title="Variability in the analysis of a single neuroimaging dataset by many teams">Botvinik-Nezer et al. <meta property="datePublished" content="2019-11-15">2019</span></a> </p> </td> <td> <p> Neuroimaging </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 1 data set, 70 teams </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.31234/osf.io/vef2c"><span title="Many Labs 4: Failure to Replicate Mortality Salience Effect With and Without Original Author Involvement">Klein et al. <meta property="datePublished" content="2019-12-11">2019</span></a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> <span title="Many Labs 4: Interpreting a failure to replicate is complicated by the fact that the failure could be due to the original finding being a false positive, unrecognized moderating influences between the original and replication procedures, or faulty implementation of the procedures in the replication. One strategy to maximize replication quality is involving the original authors in study design. We (N = 21 Labs and N = 2,220 participants) experimentally tested whether original author involvement improved replicability of a classic finding from Terror Management Theory (Greenberg et al., 1994). Our results were non-diagnostic of whether original author involvement improves replicability because we were unable to replicate the finding under any conditions. 
This suggests that the original finding was either a false positive or the conditions necessary to obtain it are not yet understood or no longer exist.">Replication</span> </p> </td> <td> <p> 1 experiment, 21 labs, 2,220 participants </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.31234/osf.io/fk8vh"><span title="Analysis of Open Data and Computational Reproducibility in Registered Reports in Psychology">Obels et al. <meta property="datePublished" content="2019-05-23">2019</span></a> </p> </td> <td> <p> Psychology </p> </td> <td> <p> <span title="Ongoing technological developments have made it easier than ever before for scientists to share their data, materials, and analysis code. Sharing data and analysis code makes it easier for other researchers to re-use or check published research. However, these benefits will only emerge if researchers can reproduce the analysis reported in published articles and if data is annotated well enough so that it is clear what all variables mean. Because most researchers are not trained in computational reproducibility, it is important to evaluate current practices to identify practices that can be improved. We examined data and code sharing for Registered Reports published in the psychological literature between 2014 and 2018, and attempted to independently computationally reproduce the main results in each article. Of the main results from 62 articles that met our inclusion criteria, data were available for 41 articles, and analysis scripts for 37 articles. For the main results in 36 articles that shared both data and code we could run the scripts for 31 analyses, and reproduce the main results for 21 articles. Although the articles that shared both data and code (36 out of 62, or 58%) and articles for which main results could be computationally reproduced (21 out of 36, or 58%) was relatively high compared to other studies, there is clear room for improvement. We provide practical recommendations based on our observations and link to examples of good research practices in the papers we reproduced.">Census</span> </p> </td> <td> <p> 62 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1001/jamaoncol.2019.2564">Wayant et al <meta property="datePublished" content="2019-09-05">2019</a> </p> </td> <td> <p> <span title="Evaluation of Reproducible Research Practices in Oncology Systematic Reviews With Meta-analyses Referenced by National Comprehensive Cancer Network Guidelines">Oncology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 154 meta-analyses </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1101/2020.01.30.924092"><span title="Factorial study of the RNA-seq computational workflow identifies biases as technical gene signatures">Simoneau et al. <meta property="datePublished" content="2020-01-30">2020</span></a> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> <span title="RNA-seq is a modular experimental and computational approach that aims in identifying and quantifying RNA molecules. The modularity of the RNA-seq technology enables adaptation of the protocol to develop new ways to explore RNA biology, but this modularity also brings forth the importance of methodological thoroughness. Liberty of approach comes with the responsibility of choices, and such choices must be informed. 
Here, we present an approach that identifies gene group specific quantification biases in currently used RNA-seq software and references by processing sequenced datasets using a wide variety of RNA-seq computational pipelined, and by decomposing these expression datasets using an independent component analysis matrix factorisation method. By exploring the RNA-seq pipeline using a systemic approach, we highlight the yet inadequately characterized central importance of genome annotations in quantification results. We also show that the different choices in RNA-seq methodology are not independent, through interactions between genome annotations and quantification software. Genes were mainly found to be affected by differences in their sequence, by overlapping genes and genes with similar sequence. Our approach offers an explanation for the observed biases by identifying the common features used differently by the software and references, therefore providing leads for the betterment of RNA-seq methodology.">Robustness test</span> </p> </td> <td> <p> 1 data set </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1186/s13041-020-0552-2"><span title="No raw data, no science: another possible source of the reproducibility crisis">Miyakawa <meta property="datePublished" content="2020-02-20">2020</span></a> </p> </td> <td> <p> Neurobiology </p> </td> <td> <p> <span title="A reproducibility crisis is a situation where many scientific studies cannot be reproduced. Inappropriate practices of science, such as HARKing, p-hacking, and selective reporting of positive results, have been suggested as causes of irreproducibility. In this editorial, I propose that a lack of raw data or data fabrication is another possible cause of irreproducibility. As an Editor-in-Chief of Molecular Brain, I have handled 180 manuscripts since early 2017 and have made 41 editorial decisions categorized as Revise before review, requesting that the authors provide raw data. Surprisingly, among those 41 manuscripts, 21 were withdrawn without providing raw data, indicating that requiring raw data drove away more than half of the manuscripts. I rejected 19 out of the remaining 20 manuscripts because of insufficient raw data. Thus, more than 97% of the 41 manuscripts did not present the raw data supporting their results when requested by an editor, suggesting a possibility that the raw data did not exist from the beginning, at least in some portions of these cases. Considering that any scientific study should be based on raw data, and that data storage space should no longer be a challenge, journals, in principle, should try to have their authors publicize raw data in a public database or journal site upon the publication of the paper to increase reproducibility of the published results and to increase public trust in science.">Census</span> </p> </td> <td> <p> 41 papers </p> </td> </tr><tr> <td> <p> <a href="https://doi.org/10.1371/journal.pone.0229578">Thelwall et al <meta property="datePublished" content="2020-02-21">2020</a> </p> </td> <td> <p> <span title="Is useful research data usually shared? 
An investigation of genome-wide association study summary statistics">Genetics</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 1799 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pone.0233107">Maassen et al <meta property="datePublished" content="2020-05-27">2020</a> </p> </td> <td> <p> <span title="Reproducibility of individual effect sizes in meta-analyses in psychology">Psychology</span> </p> </td> <td> <p> Reproduction </p> </td> <td> <p> 33 meta-analyses </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2020.05.11.088021">Riedel et al <meta property="datePublished" content="2020-05-13">2020</a> </p> </td> <td> <p> <span title="ODDPub – a Text-Mining Algorithm to Detect Data Sharing in Biomedical Publications">Biomedical science</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 792 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.3000763">Culina et al <meta property="datePublished" content="2020-07-28">2020</a> </p> </td> <td> <p> <span title="Low availability of code in ecology: A call for urgent action">Ecology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 346 papers </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2008.05414">Clementi & Barba <meta property="datePublished" content="2020-08-12">2020</a> </p> </td> <td> <p> <span title="Reproducible Validation and Replication Studies in Nanoscale Physics">Physics</span> </p> </td> <td> <p> Replication </p> </td> <td> <p> 2 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1016/j.fertnstert.2020.05.020">Kemper et al <meta property="datePublished" content="2020-08-06">2020</a> </p> </td> <td> <p> <span title="Reproducible research practices and transparency in reproductive endocrinology and infertility articles">Reproductive endocrinology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 222 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.7554/eLife.55133">Marqués et al <meta property="datePublished" content="2020-08-11">2020</a> </p> </td> <td> <p> <span title="Imaging methods are vastly underreported in biomedical research">Biomedical sciences</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 240 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1016/j.envsoft.2020.104873">Janssen et al <meta property="datePublished" content="2020-09-16">2020</a> </p> </td> <td> <p> <span title="On code sharing and model documentation of published individual and agent-based models">Environmental Modelling</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 7500 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1136/heartjnl-2020-316519">Anderson et al <meta property="datePublished" content="2020-08-21">2020</a> </p> </td> <td> <p> <span title="Evaluation of indicators supporting reproducibility and transparency within cardiology literature">Cardiology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 532 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.4230/LIPIcs.GIScience.2021.II.2">Ostermann et al <meta property="datePublished" content="2021-09-14">2021</a> </p> </td> <td> <p> <span title="Reproducible Research and GIScience: an evaluation using GIScience conference papers">GIS</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 75 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/581033">Samota & Davey <meta property="datePublished" content="2020-12-09">2020</a> </p> </td> <td> <p> <span title="Knowledge and attitudes 
among life scientists towards reproducibility within journal articles: a research survey">Life Sciences</span> </p> </td> <td> <p> Survey </p> </td> <td> <p> 251 researchers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2020.10.27.20220939">Bedford & Tzovaras <meta property="datePublished" content="2020-10-29">2020</a> </p> </td> <td> <p> <span title="Re-analysis of genetic risks for Chronic Fatigue Syndrome from 23andMe data finds few remain">Genetics</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.3389/fgene.2020.610798">Krassowski et al <meta property="datePublished" content="2020-12-10">2020</a> <a href="https://github.com/krassowski/multi-omics-state-of-the-field">(repo)</a> </p> </td> <td> <p> <span title="State of the Field in Multi-Omics Research: From Computational Needs to Data Mining and Sharing">Life Sciences</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 3377 articles </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pcbi.1008725">Boudreau et al <meta property="datePublished" content="2021-02-11">2021</a> </p> </td> <td> <p> <span title="On the open-source landscape of PLOS Computational Biology">Computational Biology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 622 papers </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2104.07651">Heumos et al <meta property="datePublished" content="2021-04-15">2021</a> </p> </td> <td> <p> <span title="mlf-core: a framework for deterministic machine learning">Computational Biology</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 6 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31219/osf.io/f73a6">Hrynaszkiewicz et al <meta property="datePublished" content="2021-04-28">2021</a> </p> </td> <td> <p> <span title="A survey of code sharing practice and policy in computational biology">Computational Biology</span> </p> </td> <td> <p> Survey </p> </td> <td> <p> 214 researchers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2021.01.04.424681">Päll et al <meta property="datePublished" content="2021-01-04">2021</a> </p> </td> <td> <p> <span title="A field-wide assessment of differential high throughput sequencing reveals widespread bias">Differential expression</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 2109 GEO submissions </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2021.09.06.459114">Wijesooriya et al <meta property="datePublished" content="2021-09-08">2021</a> </p> </td> <td> <p> <span title="Guidelines for reliable and reproducible functional enrichment analysis">Computational biology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 186 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31234/osf.io/c8a5z">Weisberg et al <meta property="datePublished" content="2021-09-06">2021</a> </p> </td> <td> <p> <span title="Evaluating the Effects of a Programming Error on a Virtual Environment Measure of Spatial Navigation Behavior">Psychology</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 1 study </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2021.04.12.439408">Vanderaa & Gatto <meta property="datePublished" content="2021-04-12">2021</a> </p> </td> <td> <p> <span title="Replication of single-cell proteomics data reveals important computational challenges">Proteomics</span> </p> </td> <td> <p> Refactor </p> </td> <td> <p> 1 analysis </p> </td> </tr> <tr> <td> <p> <a 
href="https://doi.org/10.31222/osf.io/cd5j9">Breznau et al <meta property="datePublished" content="2021-03-24">2021</a> </p> </td> <td> <p> <span title="Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty">Social science</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 73 teams </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s42256-021-00307-0">Roberts et al <meta property="datePublished" content="2021-03-15">2021</a> </p> </td> <td> <p> <span title="Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans">Radiology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 62 studies </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1126/scitranslmed.abb1655">McDermott et al <meta property="datePublished" content="2021-03-24">2021</a> </p> </td> <td> <p> <span title="Reproducibility in machine learning for health research: Still a ways to go">Clinical ML</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 511 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41597-021-00981-0">Tedersoo et al <meta property="datePublished" content="2021-07-27">2021</a> </p> </td> <td> <p> <span title="Data sharing practices and data availability upon request differ across scientific disciplines">9 Fields</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 875 articles </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1016/j.jclinepi.2022.05.019">Gabelica et al <meta property="datePublished" content="2022-05-30">2022</a> </p> </td> <td> <p> <span title="Many researchers were not compliant with their published data sharing statement: mixed-methods study">Life Sciences</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 3556 papers </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2209.04308">Samuel & Mietchen <meta property="datePublished" content="2022-09-09">2022</a> </p> </td> <td> <p> <span title="Computational reproducibility of Jupyter notebooks from biomedical publications">Biomedical Sciences</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 9625 Jupyter notebooks </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1002/ijc.33946">Zaorsky et al <meta property="datePublished" content="2022-01-31">2022</a> </p> </td> <td> <p> <span title="Retrospective comparative effectiveness research: Will changing the analytical methods change the results?">Radiation oncology</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 300k models </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31222/osf.io/sjyp3">Kohrt et al <meta property="datePublished" content="2022-10-07">2022</a> </p> </td> <td> <p> <span title="Replication of the natural selection of bad science">Behavioral sciences</span> </p> </td> <td> <p> Refactor </p> </td> <td> <p> <a href="https://doi.org/10.1098/rsos.160384">One study</a> </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1186/s12916-022-02644-2">Hamilton et al <meta property="datePublished" content="2022-11-09">2022</a> </p> </td> <td> <p> <span title="How often do cancer researchers make their data and code available and what factors are associated with sharing?">Cancer biology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 306 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.3389/fcomm.2022.1048896">Motoki & Iseki <meta property="datePublished" content="2022-10-28">2022</a> </p> </td> <td> <p> <span 
title="Evaluating replicability of ten influential research on sensory marketing">Marketing</span> </p> </td> <td> <p> Replication </p> </td> <td> <p> 10 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1128/mbio.01607-23">Gihawi et al <meta property="datePublished" content="2023-10-09">2023</a> </p> </td> <td> <p> <span title="Major data analysis errors invalidate cancer microbiome findings">Bioinformatics</span> </p> </td> <td> <p> Refactor </p> </td> <td> <p> 1 paper </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.32942/X2GG62">Gould et al <meta property="datePublished" content="2023-10-03">2023</a> </p> </td> <td> <p> <span title="Same data, different analysts: variation in effect sizes due to analytical decisions in ecology and evolutionary biology">Ecology</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 2 datasets, 174 teams </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41562-023-01749-9">Protzko et al <meta property="datePublished" content="2023-11-09">2023</a> </p> </td> <td> <p> <span title="High replicability of newly discovered social-behavioural findings is achievable">Psychology</span> </p> </td> <td> <p> Replication </p> </td> <td> <p> 16 findings </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.5070/g6011239">Bochynska et al <meta property="datePublished" content="2023-11-09">2023</a> </p> </td> <td> <p> <span title="Reproducible research practices and transparency across linguistics">Linguistics</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 600 articles </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pone.0300333">Kambouris et al <meta property="datePublished" content="2024-03-13">2024</a> </p> </td> <td> <p> <span title="Computationally reproducing results from meta-analyses in ecology and evolutionary biology using shared code and data">Ecology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 177 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.3002562">Standvoss et al <meta property="datePublished" content="2024-04-02">2024</a> </p> </td> <td> <p> <span title="Shortcut citations in the methods section: Frequency, problems, and strategies for responsible reuse">Biology</span> </p> </td> <td> <p> Census </p> </td> <td> <p> 750 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.2139/ssrn.4790780">Brodeur et al <meta property="datePublished" content="2024-04-11">2024</a> </p> </td> <td> <p> <span title="Mass Reproducibility and Replicability: A New Hope">Economics</span> </p> </td> <td> <p> Robustness test </p> </td> <td> <p> 110 papers </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41597-024-03786-z">Hughes-Noehrer et al <meta property="datePublished" content="2024-08-23">2024</a> </p> </td> <td> <p> <span title="UK Reproducibility Network open and transparent research practices survey dataset">Research</span> </p> </td> <td> <p> Survey </p> </td> <td> <p> 2500 researchers </p> </td> </tr> <!--study_placeholder--> </tbody> </table>

Ad-hoc reproductions

These are one-off, unpublished attempts to reproduce individual studies.

<table id="ad_hoc"> <tbody> <tr> <td> <p> Reproduction </p> </td> <td> <p> Original study </p> </td> </tr> <tr> <td> <p> https://rdoodles.rbind.io/2019/06/reanalyzing-data-from-human-gut-microbiota-from-autism-spectrum-disorder-promote-behavioral-symptoms-in-mice/ and https://notstatschat.rbind.io/2019/06/16/analysing-the-mouse-autism-data/ </p> </td> <td> <p> Sharon, G. et al. Human Gut Microbiota from Autism Spectrum Disorder Promote Behavioral Symptoms in Mice. Cell 2019, 177 (6), 1600–1618.e17. </p> </td> </tr> <tr> <td> <p> https://github.com/sean-harrison-bristol/CCR5_replication </p> </td> <td> <p> Wei, X.; Nielsen, R. CCR5-∆32 Is Deleterious in the Homozygous State in Humans. Nat. Med. 2019 DOI: 10.1038/s41591-019-0459-6. (retracted) </p> </td> </tr> <tr> <td> <p> https://github.com/leipzig/placenta </p> </td> <td> <p> Leiby et al "Lack of detection of a human placenta microbiome in samples from preterm and term deliveries" https://doi.org/10.1186/s40168-018-0575-4 </p> </td> </tr> <tr> <td> <p> <a href="https://www.dropbox.com/s/mpdgpraal1sqhtv/CTADposter_LP105A_RigorReplication_Simufilam_SF2022.pdf?dl=0">Heilbut et al "Rigor and Replication in Alzheimer’s Therapeutic Development: A Case Study"</a> </p> </td> <td> <p> <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0266627">Wang et al "Retraction: High-Affinity Naloxone Binding to Filamin A Prevents Mu Opioid Receptor–Gs Coupling Underlying Opioid Tolerance and Dependence"</a> </p> </td> </tr> </tbody> </table>

Theory papers

<table> <tbody> <tr> <td> <p> Authors/Date </p> </td> <td> <p> Title </p> </td> <td> <p> Field </p> </td> <td> <p> Type </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1080/09332480.2019.1579573">Ioannidis <meta property="datePublished" content="2005-08-30">2005</a> </p> </td> <td> <p> <span title="There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.">Why most published research findings are false</span> </p> <td> <p> Science </p> </td> <td> <p> Statistical reproducibility </p> </td> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pcbi.1000424">Noble <meta property="datePublished" content="2009-07-31">2005</a> </p> </td> <td> <p> <span title="Most bioinformatics coursework focuses on algorithms, with perhaps some components devoted to learning programming skills and learning how to use existing bioinformatics software. Unfortunately, for students who are preparing for a research career, this type of curriculum fails to address many of the day-to-day organizational challenges associated with performing computational experiments. In practice, the principles behind organizing and documenting computational experiments are often learned on the fly, and this learning is strongly influenced by personal predilections as well as by chance interactions with collaborators or colleagues. The purpose of this article is to describe one good strategy for carrying out computational experiments. I will not describe profound issues such as how to formulate hypotheses, design experiments, or draw conclusions. Rather, I will focus on relatively mundane issues such as organizing files and directories and documenting progress. These issues are important because poor organizational choices can lead to significantly slower research progress. I do not claim that the strategies I outline here are optimal. 
These are simply the principles and practices that I have developed over 12 years of bioinformatics research, augmented with various suggestions from other researchers with whom I have discussed these issues.">A Quick Guide to Organizing Computational Biology Projects</span> </p> <td> <p> Bioinformatics </p> </td> <td> <p> Best practices </p> </td> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pcbi.1000424">Sandve et al <meta property="datePublished" content="2013-10-24">2013</a> </p> </td> <td> <p> <span title="Replication is the cornerstone of a cumulative science [1]. However, new tools and technologies, massive amounts of data, interdisciplinary approaches, and the complexity of the questions being asked are complicating replication efforts, as are increased pressures on scientists to advance their research [2]. As full replication of studies on independently collected data is often not feasible, there has recently been a call for reproducible research as an attainable minimum standard for assessing the value of scientific claims [3]. This requires that papers in experimental science describe the results and provide a sufficiently clear protocol to allow successful repetition and extension of analyses based on original data [4].">Ten Simple Rules for Reproducible Computational Research</span> </p> <td> <p> Computational science </p> </td> <td> <p> Best practices </p> </td> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.1002165"><span title="The Economics of Reproducibility in Preclinical Research">Freedman et al <meta property="datePublished" content="2015-06-09">2015</span></a> </p> </td> <td> <p> <span title="Abstract">The Economics of Reproducibility in Preclinical Research</span> </p> </td> <td> <p> Preclinical research </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31234/osf.io/jqw35">Yarkoni <meta property="datePublished" content="2019-11-21">2019</a> </p> </td> <td> <p> <span title="Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned—that is, that the two must refer to roughly the same set of hypothetical observations. Here I argue that most inferential statistical tests in psychology fail to meet this basic condition. I demonstrate how foundational assumptions of the 'random effects' model used pervasively in psychology impose far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints dramatically inflates false positive rates and routinely leads researchers to draw sweeping verbal generalizations that lack any meaningful connection to the statistical quantities they are putatively based on. 
I argue that the routine failure to consider the generalizability of one's conclusions from a statistical perspective lies at the root of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.">The Generalizability Crisis</span> </p> </td> <td> <p> Psychology </p> </td> <td> <p> Statistical reproducibility </p> </td> </tr> <tr> <td> <p> <a href="http://proceedings.mlr.press/v97/bouthillier19a.html">Bouthillier et al <meta property="datePublished" content="2019-06-09">2019</a> </p> </td> <td> <p> <span title="The apparent contradiction in the title is a wordplay on the different meanings attributed to the word reproducible across different scientific fields. What we imply is that unreproducible findings can be built upon reproducible methods. Without denying the importance of facilitating the reproduction of methods, we deem important to reassert that reproduction of findings is a fundamental step of the scientific inquiry. We argue that the commendable quest towards easy deterministic reproducibility of methods and numerical results should not have us forget the even more important necessity of ensuring the reproducibility of empirical findings and conclusions by properly accounting for essential sources of variations. We provide experiments to exemplify the brittleness of current common practice in the evaluation of models in the field of deep learning, showing that even if the results could be reproduced, a slightly different experiment would not support the findings. We hope to help clarify the distinction between exploratory and empirical research in the field of deep learning and believe more energy should be devoted to proper empirical research in our community. This work is an attempt to promote the use of more rigorous and diversified methodologies. It is not an attempt to impose a new methodology and it is not a critique on the nature of exploratory research.">Unreproducible Research is Reproducible</span> </p> </td> <td> <p> Machine Learning </p> </td> <td> <p> Methodology </p> </td> </tr> <tr> <td> <p><a href="https://doi.org/10.1038/s41567-019-0780-5">Milton & Possolo <meta property="datePublished" content="2019-06-09">2019</a> </td><td> <p><span title="Lack of reproducibility is not necessarily bad news; it may herald new discoveries and signal scientifc progress">Trustworthy data underpin reproducible research</span></p> </td> <td> <p> Physics </p> </td> <td> <p> Scientific philosophy </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pone.0216125"><span title="Scientific discovery in a model-centric framework: Reproducibility, innovation, and epistemic diversity">Devezer et al <meta property="datePublished" content="2019-05-15">2019</span></a> </p> </td> <td> <p> <span title="Consistent confirmations obtained independently of each other lend credibility to a scientific result. We refer to results satisfying this consistency as reproducible and assume that reproducibility is a desirable property of scientific discovery. Yet seemingly science also progresses despite irreproducible results, indicating that the relationship between reproducibility and other desirable properties of scientific discovery is not well understood. These properties include early discovery of truth, persistence on truth once it is discovered, and time spent on truth in a long-term scientific inquiry. 
We build a mathematical model of scientific discovery that presents a viable framework to study its desirable properties including reproducibility. In this framework, we assume that scientists adopt a model-centric approach to discover the true model generating data in a stochastic process of scientific discovery. We analyze the properties of this process using Markov chain theory, Monte Carlo methods, and agent-based modeling. We show that the scientific process may not converge to truth even if scientific results are reproducible and that irreproducible results do not necessarily imply untrue results. The proportion of different research strategies represented in the scientific population, scientists’ choice of methodology, the complexity of truth, and the strength of signal contribute to this counter-intuitive finding. Important insights include that innovative research speeds up the discovery of scientific truth by facilitating the exploration of model space and epistemic diversity optimizes across desirable properties of scientific discovery.">Scientific discovery in a model-centric framework: Reproducibility, innovation, and epistemic diversity</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Statistical reproducibility </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2002.11626"><span title="A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility">Tierney et al <meta property="datePublished" content="2020-02-06">2020</span></a> </p> </td> <td> <p> <span title="Data makes science possible. Sharing data improves visibility, and makes the research process transparent. This increases trust in the work, and allows for independent reproduction of results. However, a large proportion of data from published research is often only available to the original authors. Despite the obvious benefits of sharing data, and scientists' advocating for the importance of sharing data, most advice on sharing data discusses its broader benefits, rather than the practical considerations of sharing. This paper provides practical, actionable advice on how to actually share data alongside research. The key message is sharing data falls on a continuum, and entering it should come with minimal barriers.">A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2003.00898"><span title="The importance of transparency and reproducibility in artificial intelligence research">Haibe-Kains et al <meta property="datePublished" content="2020-02-28">2020</span></a> </p> </td> <td> <p> <span title="In their study, McKinney et al. showed the high potential of artificial intelligence for breast cancer screening. However, the lack of detailed methods and computer code undermines its scientific value. 
We identify obstacles hindering transparent and reproducible AI research as faced by McKinney et al and provide solutions with implications for the broader field.">The importance of transparency and reproducibility in artificial intelligence research</span> </p> </td> <td> <p> Artificial Intelligence </p> </td> <td> <p> Critique </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.3000691"><span title="What is replication?">Nosek & Errington <meta property="datePublished" content="2020-03-27">2020</span></a> </p> </td> <td> <p> <span title="None">What is replication?</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Scientific philosophy </p> </td> </tr> <tr> <td> <p> <a href="https://ecoevorxiv.org/h5r6n/"><span title="A Beginner’s Guide to Conducting Reproducible Research">Alston & Rick <meta property="datePublished" content="2020-05-03">2020</span></a> </p> </td> <td> <p> <span title="Reproducible research is widely acknowledged as an important tool for improving science and reducing harm from the \"replication crisis\", yet research in most fields within biology remains largely irreproducible. In this article, we make the case for why all research should be reproducible, explain why research is often not reproducible, and offer a simple framework that researchers can use to make their research more reproducible. Researchers can increase the reproducibility of their work by improving data management practices, writing more readable code, and increasing use of the many available platforms for sharing data and code. While reproducible research is often associated with a set of advanced tools for sharing data and code, reproducibility is just as much about maintaining work habits that are already widely acknowledged as best practices for research. Increasing reproducibility will increase rigor, trustworthiness, and transparency while benefiting both practitioners of reproducible research and their fellow researchers.">A Beginner’s Guide to Conducting Reproducible Research</span> </p> </td> <td> <p> Ecology </p> </td> <td> <p> Best Practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.20944/preprints202006.0002.v1"><span title="Realistic and Robust Reproducible Research for Biostatistics">Hejblum et al <meta property="datePublished" content="2020-06-03">2020</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;The complexity of analysis pipelines in biomedical sciences poses a severe challenge for the transparency and reproducibility of results. Researchers are increasingly incorporating software development technologies and methods into their analyses, but this is a quickly evolving landscape and teams may lack the capabilities to set up their own complex IT infrastructure to aid reproducibility. Basing a reproducible research strategy on readily available solutions with zero or low set-up costs whilst maintaining technological flexibility to incorporate domain-specific software tools is therefore of key importance. We outline a practical approach for robust reproducibility of analysis results. In our examples, we rely exclusively on established open-source tools and free services. 
Special emphasis is put on the integration of these tools with best practices from software development and free online services for the biostatistics domain.&lt;/jats:p&gt;">Realistic and Robust Reproducible Research for Biostatistics</span> </p> </td> <td> <p> Biostatistics </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1007/s13222-019-00317-8"><span title="A Link is doi.org/10.1007/s13222-019-00317-8 – Reproducibility of Data">Pawlik et al <meta property="datePublished" content="2019-06-13">2019</span></a> </p> </td> <td> <p> <span title="Abstract">A Link is not Enough – Reproducibility of Data</span> </p> </td> <td> <p> Databases </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1038/s41597-020-0524-5"><span title="COVID-19 pandemic reveals the peril of ignoring metadata standards">Schriml et al <meta property="datePublished" content="2020-06-19">2020</span></a> </p> </td> <td> <p> <span title="Abstract">COVID-19 pandemic reveals the peril of ignoring metadata standards</span> </p> </td> <td> <p> Virology </p> </td> <td> <p> Critique </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2007.08708"><span title="Principles for data analysis workflows">Stoudt et al <meta property="datePublished" content="2020-07-17">2020</span></a> </p> </td> <td> <p> <span title="Traditional data science education often omits training on research workflows: the process that moves a scientific investigation from raw data to coherent research question to insightful contribution. In this paper, we elaborate basic principles of a reproducible data analysis workflow by defining three phases: the Exploratory, Refinement, and Polishing Phases. Each workflow phase is roughly centered around the audience to whom research decisions, methodologies, and results are being immediately communicated. Importantly, each phase can also give rise to a number of research products beyond traditional academic publications. Where relevant, we draw analogies between principles for data-intensive research workflows and established practice in software development. The guidance provided here is not intended to be a strict rulebook; rather, the suggestions for practices and tools to advance reproducible, sound data-intensive analysis may furnish support for both students and current professionals.">Principles for data analysis workflows</span> </p> </td> <td> <p> Data science </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2007.12210"><span title="Reproducible Research: A Retrospective">Peng & Hicks <meta property="datePublished" content="2020-07-23">2020</span></a> </p> </td> <td> <p> <span title="Rapid advances in computing technology over the past few decades have spurred two extraordinary phenomena in science: large-scale and high-throughput data collection coupled with the creation and implementation of complex statistical algorithms for data analysis. Together, these two phenomena have brought about tremendous advances in scientific discovery but have also raised two serious concerns, one relatively new and one quite familiar. The complexity of modern data analyses raises questions about the reproducibility of the analyses, meaning the ability of independent analysts to re-create the results claimed by the original authors using the original data and analysis techniques. 
While seemingly a straightforward concept, reproducibility of analyses is typically thwarted by the lack of availability of the data and computer code that were used in the analyses. A much more general concern is the replicability of scientific findings, which concerns the frequency with which scientific claims are confirmed by completely independent investigations. While the concepts of reproduciblity and replicability are related, it is worth noting that they are focused on quite different goals and address different aspects of scientific progress. In this review, we will discuss the origins of reproducible research, characterize the current status of reproduciblity in public health research, and connect reproduciblity to current concerns about replicability of scientific findings. Finally, we describe a path forward for improving both the reproducibility and replicability of public health research in the future.">Reproducible Research: A Retrospective</span> </p> </td> <td> <p> Public health </p> </td> <td> <p> Review </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1101/2020.06.30.178673"><span title="Streamlining Data-Intensive Biology With Workflow Systems">Reiter et al <meta property="datePublished" content="2020-07-02">2020</span></a> </p> </td> <td> <p> <span title="&lt;jats:title&gt;Abstract&lt;/jats:title&gt;&lt;jats:p&gt;As the scale of biological data generation has increased, the bottleneck of research has shifted from data generation to analysis. Researchers commonly need to build computational workflows that include multiple analytic tools and require incremental development as experimental insights demand tool and parameter modifications. These workflows can produce hundreds to thousands of intermediate files and results that must be integrated for biological insight. The maturation of data-centric workflow systems that internally manage computational resources, software, and conditional execution of analysis steps are reshaping the landscape of biological data analysis, and empowering researchers to conduct reproducible analyses at scale. Adoption of these tools can facilitate and expedite robust data analysis, but knowledge of these techniques is still lacking. Here, we provide a series of practices and strategies for leveraging workflow systems with structured project, data, and resource management to streamline large-scale biological analysis.&lt;/jats:p&gt;&lt;jats:sec&gt;&lt;jats:title&gt;Author Summary&lt;/jats:title&gt;&lt;jats:p&gt;We present a guide for workflow-enabled biological sequence data analysis, developed through our own teaching, training and analysis projects. We recognize that this is based on our own use cases and experiences, but we hope that our guide will contribute to a larger discussion within the open source and open science communities and lead to more comprehensive resources. 
Our main goal is to accelerate the research of scientists conducting sequence analyses by introducing them to organized workflow practices that not only benefit their own research but also facilitate open and reproducible science.&lt;/jats:p&gt;&lt;/jats:sec&gt;">Streamlining Data-Intensive Biology With Workflow Systems</span> </p> </td> <td> <p> Biology </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.7554/eLife.58237"><span title="Meta Research: Questionable research practices may have little effect on replicability">Ulrich & Miller <meta property="datePublished" content="2020-09-15">2020</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;This article examines why many studies fail to replicate statistically significant published results. We address this issue within a general statistical framework that also allows us to include various questionable research practices (QRPs) that are thought to reduce replicability. The analyses indicate that the base rate of true effects is the major factor that determines the replication rate of scientific results. Specifically, for purely statistical reasons, replicability is low in research domains where true effects are rare (e.g., search for effective drugs in pharmacology). This point is under-appreciated in current scientific and media discussions of replicability, which often attribute poor replicability mainly to QRPs.&lt;/jats:p&gt;">Meta Research: Questionable research practices may have little effect on replicability</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Statistical reproducibility </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.3000999"><span title="We need to keep a reproducible trace of facts, predictions, and hypotheses from gene to function in the era of big data">Kasif & Roberts <meta property="datePublished" content="2020-11-30">2020</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;How do we scale biological science to the demand of next generation biology and medicine to keep track of the facts, predictions, and hypotheses? These days, enormous amounts of DNA sequence and other omics data are generated. Since these data contain the blueprint for life, it is imperative that we interpret it accurately. The abundance of DNA is only one part of the challenge. Artificial Intelligence (AI) and network methods routinely build on large screens, single cell technologies, proteomics, and other modalities to infer or predict biological functions and phenotypes associated with proteins, pathways, and organisms. As a first step, how do we systematically trace the provenance of knowledge from experimental ground truth to gene function predictions and annotations? 
Here, we review the main challenges in tracking the evolution of biological knowledge and propose several specific solutions to provenance and computational tracing of evidence in functional linkage networks.&lt;/jats:p&gt;">We need to keep a reproducible trace of facts, predictions, and hypotheses from gene to function in the era of big data</span> </p> </td> <td> <p> Functional genomics </p> </td> <td> <p> Critique </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1093/gigascience/giab015"><span title="A research parasite's perspective on establishing a baseline to avoid errors in secondary analyses">Raman <meta property="datePublished" content="2021-03-12">2021</span></a> </p> </td> <td> <p> <span title="&lt;jats:title&gt;Abstract&lt;/jats:title&gt; &lt;jats:p&gt;To enhance reproducibility in scientific research, more and more datasets are becoming publicly available so that researchers can perform secondary analyses to investigate questions the original scientists had not posited. This increases the return on investment for the NIH and other funding bodies. These datasets, however, are not perfect, and a better understanding of the assumptions that shaped them is required. The 2020 Junior Research Parasite Award recognized our work that showed that the signal-to-noise ratio in a particular dataset had not been investigated, leading to an erroneous conclusion in the original research. In this commentary, I share the process that led to the identification of the problem and hopefully provide useful lessons for other research parasites.&lt;/jats:p&gt;">A research parasite&#x27;s perspective on establishing a baseline to avoid errors in secondary analyses</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1098/rsos.201925"><span title="The multiplicity of analysis strategies jeopardizes replicability: lessons learned across disciplines">Hoffmann et al <meta property="datePublished" content="2021-04-21">2021</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;For a given research question, there are usually a large variety of possible analysis strategies acceptable according to the scientific standards of the field, and there are concerns that this multiplicity of analysis strategies plays an important role in the non-replicability of research findings. Here, we define a general framework on common sources of uncertainty arising in computational analyses that lead to this multiplicity, and apply this framework within an overview of approaches proposed across disciplines to address the issue. 
Armed with this framework, and a set of recommendations derived therefrom, researchers will be able to recognize strategies applicable to their field and use them to generate findings more likely to be replicated in future studies, ultimately improving the credibility of the scientific process.&lt;/jats:p&gt;">The multiplicity of analysis strategies jeopardizes replicability: lessons learned across disciplines</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Critique </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1061/%28ASCE%29WR.1943-5452.0001368"><span title="Reproducible Results Policy">Rosenberg et al <meta property="datePublished" content="2020-11-30">2020</span></a> </p> </td> <td> <p> <span title="Abstract">Reproducible Results Policy</span> </p> </td> <td> <p> Water Resources </p> </td> <td> <p> Policy </p> </td> </tr> <tr> <td> <p> <a href="https://zenodo.org/record/6797657"><span title="10 Things for Curating Reproducible and FAIR Research">Clary et al <meta property="datePublished" content="2022-06-27">2022</span></a> </p> </td> <td> <p> <span title="This document, 10 Things for Curating Reproducible and FAIR Research, describes the key issues of curating reproducible and FAIR research (CURE-FAIR). It lists standards-based guidelines for ten practices, focusing primarily on research compendia produced by quantitative data-driven social science. The 10 CURE-FAIR Things are intended primarily for data curators and information professionals who are charged with publication and archival of FAIR and computationally reproducible research. Often the first re-users of the research compendium, they have the opportunity to verify that a computation can be executed and that it can reproduce pre-specified results. Secondarily, the 10 CURE-FAIR Things will be of interest to researchers, publishers, editors, reviewers, and others who have a stake in creating, using, sharing, publishing, or preserving reproducible research.">10 Things for Curating Reproducible and FAIR Research</span> </p> </td> <td> <p> Social sciences </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1148/ryai.2020200029"><span title="Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers">Mongan et al <meta property="datePublished" content="2020-03-25">2020</span></a> </p> </td> <td> <p> <span title="Abstract">Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers</span> </p> </td> <td> <p> Medical imaging </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1126/sciadv.abl4747"><span title="Generative and reproducible benchmarks for comprehensive evaluation of machine learning classifiers">Orzechowski & Moore <meta property="datePublished" content="2022-11-23">2022</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;Understanding the strengths and weaknesses of machine learning (ML) algorithms is crucial to determine their scope of application. Here, we introduce the Diverse and Generative ML Benchmark (DIGEN), a collection of synthetic datasets for comprehensive, reproducible, and interpretable benchmarking of ML algorithms for classification of binary outcomes. The DIGEN resource consists of 40 mathematical functions that map continuous features to binary targets for creating synthetic datasets.
These 40 functions were found using a heuristic algorithm designed to maximize the diversity of performance among multiple popular ML algorithms, thus providing a useful test suite for evaluating and comparing new methods. Access to the generative functions facilitates understanding of why a method performs poorly compared to other algorithms, thus providing ideas for improvement.&lt;/jats:p&gt;">Generative and reproducible benchmarks for comprehensive evaluation of machine learning classifiers</span> </p> </td> <td> <p> Machine Learning </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31219/osf.io/4pd9n"><span title="The five pillars of computational reproducibility: Bioinformatics and beyond">Ziemann et al <meta property="datePublished" content="2023-06-19">2023</span></a> </p> </td> <td> <p> <span title="&lt;p&gt;Computational reproducibility is a simple premise in theory, but is difficult to achieve in practice. Building upon past efforts and proposals to maximize reproducibility and rigor in bioinformatics, we present a framework called the five pillars of reproducible computational research. These include (1) code version control and sharing, (2) compute environment control, (3) literate programming, (4) documentation and (5) FAIR data sharing. These best practices will ensure that computational research work can be reproduced quickly and easily, long into the future. This guide is designed for bioinformatic data analysts, but should be relevant for other domains.&lt;/p&gt;">The five pillars of computational reproducibility: Bioinformatics and beyond</span> </p> </td> <td> <p> Bioinformatics </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1098/rsos.220346"><span title="Big little lies: a compendium and simulation of p-hacking strategies">Stefan & Schönbrodt <meta property="datePublished" content="2023-02-08">2023</span></a> </p> </td> <td> <p> <span title="In many research fields, the widespread use of questionable research practices has jeopardized the credibility of scientific results. One of the most prominent questionable research practices is p-hacking. Typically, p-hacking is defined as a compound of strategies targeted at rendering non-significant hypothesis testing results significant. However, a comprehensive overview of these p-hacking strategies is missing, and current meta-scientific research often ignores the heterogeneity of strategies. Here, we compile a list of 12 p-hacking strategies based on an extensive literature review, identify factors that control their level of severity, and demonstrate their impact on false-positive rates using simulation studies. We also use our simulation results to evaluate several approaches that have been proposed to mitigate the influence of questionable research practices. Our results show that investigating p-hacking at the level of strategies can provide a better understanding of the process of p-hacking, as well as a broader basis for developing effective countermeasures.
By making our analyses available through a Shiny app and R package, we facilitate future meta-scientific research aimed at investigating the ramifications of p-hacking across multiple strategies, and we hope to start a broader discussion about different manifestations of p-hacking in practice.">Big little lies: a compendium and simulation of p-hacking strategies</span> </p> </td> <td> <p> Research </p> </td> <td> <p> Statistical reproducibility </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1371/journal.pbio.3002345"><span title="Is N-Hacking Ever OK? The consequences of collecting more data in pursuit of statistical significance">Reinagel <meta property="datePublished" content="2023-11-01">2023</span></a> </p> </td> <td> <p> <span title="&lt;jats:p&gt;Upon completion of an experiment, if a trend is observed that is “not quite significant,” it can be tempting to collect more data in an effort to achieve statistical significance. Such sample augmentation or “&lt;jats:italic&gt;N&lt;/jats:italic&gt;-hacking” is condemned because it can lead to an excess of false positives, which can reduce the reproducibility of results. However, the scenarios used to prove this rule tend to be unrealistic, assuming the addition of unlimited extra samples to achieve statistical significance, or doing so when results are not even close to significant; an unlikely situation for most experiments involving patient samples, cultured cells, or live animals. If we were to examine some more realistic scenarios, could there be any situations where &lt;jats:italic&gt;N&lt;/jats:italic&gt;-hacking might be an acceptable practice? This Essay aims to address this question, using simulations to demonstrate how &lt;jats:italic&gt;N&lt;/jats:italic&gt;-hacking causes false positives and to investigate whether this increase is still relevant when using parameters based on real-life experimental settings.&lt;/jats:p&gt;">Is N-Hacking Ever OK? The consequences of collecting more data in pursuit of statistical significance</span> </p> </td> <td> <p> Biology </p> </td> <td> <p> Statistical reproducibility </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2401.03068"><span title="A how-to guide for code-sharing in biology">Abdill et al <meta property="datePublished" content="2024-01-05">2024</span></a> </p> </td> <td> <p> <span title="Computational biology continues to spread into new fields, becoming more accessible to researchers trained in the wet lab who are eager to take advantage of growing datasets, falling costs, and novel assays that present new opportunities for discovery even outside of the much-discussed developments in artificial intelligence. However, guidance for implementing these techniques is much easier to find than guidance for reporting their use, leaving biologists to guess which details and files are relevant. Here, we provide a set of recommendations for sharing code, with an eye toward guiding those who are comparatively new to applying open science principles to their computational work. Additionally, we review existing literature on the topic, summarize the most common tips, and evaluate the code-sharing policies of the most influential journals in biology, which occasionally encourage code-sharing but seldom require it. 
Taken together, we provide a user manual for biologists who seek to follow code-sharing best practices but are unsure where to start.">A how-to guide for code-sharing in biology</span> </p> </td> <td> <p> Biology </p> </td> <td> <p> Best practices </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.2139/ssrn.4801433"><span title="Characterising Reproducibility Debt in Scientific Software: A Systematic Literature Review">Hassan et al <meta property="datePublished" content="2024-04-20">2024</span></a> </p> </td> <td> <p> <span title="Abstract">Characterising Reproducibility Debt in Scientific Software: A Systematic Literature Review</span> </p> </td> <td> <p> Reproducible Research </p> </td> <td> <p> Review of reviews </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1016/j.cell.2024.07.045"><span title="The evolution of computational research in a data-centric world">Deshpande et al <meta property="datePublished" content="2024-08-22">2024</span></a> </p> </td> <td> <p> <span title="Abstract">The evolution of computational research in a data-centric world</span> </p> </td> <td> <p> Computational biology </p> </td> <td> <p> Review </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1186/s13059-024-03343-2"><span title="Genomic reproducibility in the bioinformatics era">Baykal et al <meta property="datePublished" content="2024-08-09">2024</span></a> </p> </td> <td> <p> <span title="&lt;jats:title&gt;Abstract&lt;/jats:title&gt;&lt;jats:p&gt;In biomedical research, validating a scientific discovery hinges on the reproducibility of its experimental results. However, in genomics, the definition and implementation of reproducibility remain imprecise. We argue that genomic reproducibility, defined as the ability of bioinformatics tools to maintain consistent results across technical replicates, is essential for advancing scientific knowledge and medical applications. Initially, we examine different interpretations of reproducibility in genomics to clarify terms. Subsequently, we discuss the impact of bioinformatics tools on genomic reproducibility and explore methods for evaluating these tools regarding their effectiveness in ensuring genomic reproducibility. Finally, we recommend best practices to improve genomic reproducibility.&lt;/jats:p&gt;">Genomic reproducibility in the bioinformatics era</span> </p> </td> <td> <p> Genomics </p> </td> <td> <p> Review </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2411.02283"><span title="Continuous Analysis: Evolution of Software Engineering and Reproducibility for Science">Malladi et al <meta property="datePublished" content="2024-11-04">2024</span></a> </p> </td> <td> <p> <span title="&lt;jats:title&gt;Abstract&lt;/jats:title&gt;&lt;jats:p&gt;Reproducibility in research remains hindered by complex systems involving data, models, tools, and algorithms. Studies highlight a reproducibility crisis due to a lack of standardized reporting, code and data sharing, and rigorous evaluation. This paper introduces the concept of Continuous Analysis to address the reproducibility challenges in scientific research, extending the DevOps lifecycle. Continuous Analysis proposes solutions through version control, analysis orchestration, and feedback mechanisms, enhancing the reliability of scientific results.
By adopting CA, the scientific community can ensure the validity and generalizability of research outcomes, fostering transparency and collaboration and ultimately advancing the field.&lt;/jats:p&gt;">Continuous Analysis: Evolution of Software Engineering and Reproducibility for Science</span> </p> </td> <td> <p> Science </p> </td> <td> <p> Best practices </p> </td> </tr> <!--theory_placeholder--> </tbody> </table>

Theses and dissertations

<table> <tbody> <tr> <td> <p> Authors/Date </p> </td> <td> <p> Title </p> </td> <td> <p> Institution </p> </td> </tr> <tr> <td> <p> <a href="https://newtraell.cs.uchicago.edu/files/phd_paper/quanpt.pdf">Pham, Quan <meta property="datePublished" content="2014-08-01">2014</a> </p> </td> <td> <p> <span title="In today’s world of publishing, reproducing research results has become challenging as scientific research has become inherently computational. Encoding a computation-based result in a text-based paper is nearly impractical, leading to the overarching research question. “Can computation-based research papers be reproducible?” The aim of this thesis is to describe frameworks and tools, which if provided to authors can aid in assessing computational reproducibility. Towards this aim, the thesis proposes a reproducible framework Science Object Linking and Embedding (SOLE) for creating descriptive and interactive publications by linking them with associated science objects, such as source codes, datasets, annotations, workflows, process and data provenance, and reexecutable software packages. To create science objects in a linkable representation for use within research papers, the thesis describes a set of tools as part of the framework. In particular, it focuses on Provenance-To-Use (PTU), an application virtualization tool that encapsulates source code, data, and all associated data and software dependencies into a package. We describe how by capturing data dependencies, PTU allows full and partial repeatability of the virtualized software; and by capturing software dependencies, PTU can be used for building and maintaining software pipelines. Finally, we show how PTU can be used to provide computational reproducibility in a distributed environment. We evaluate and validate the framework by applying it to several representative publications and determining the extent to which computational reproducibility is achievable.">A Framework for Reproducible Computational Research</span> </p> </td> <td> <p> University of Chicago </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/2454696322">Wallach, Joshua <meta property="datePublished" content="2016-11-01">2016</a> </p> </td> <td> <p> <span title="There is a growing movement to encourage reproducibility and transparency practices across all scientific fields, including public access to raw data and protocols and the conduct of replication studies. In this dissertation, I evaluate: the status and trends of reproducibility and transparency across all biomedical research fields; the reproducibility of subgroup analyses in randomized controlled trials; and the frequency, validity, and relevance of sex-based subgroup analyses in Cochrane reviews. Among a random sample of 441 biomedical journal articles published in 2000-2014, only one study provided a full protocol and none made raw data directly available. In general, replication studies were rare (n = 4). In a sample of 169 randomized controlled trials (from two datasets containing randomized controlled trials with at least one subgroup analysis), there were a total of 117 subgroup claims made in the abstracts of 64 articles. Of these claims, only 46 (39%) had evidence of statistically significant heterogeneity (P < 0.05 from an interaction test). Five (11%) of the 46 subgroup findings had at least one subsequent pure corroboration attempt by a meta-analysis or a randomized controlled trial. 
None of the corroboration attempts had a statistically significant P-value from an interaction test. Among 41 reviews, with at least one sex-treatment subgroup analysis in at least one forest plot, published in the Cochrane Database of Systematic Reviews, there were 109 separate treatment-outcome analyses (topics). Eight (7.3%) had a statistically significant sex-treatment interaction and only three were discussed by the Cochrane reviewers for a potential impact on different clinical management for males versus females. Overall, the current published biomedical literature lacks transparency. A minority of subgroup claims made in the abstracts of randomized controlled trials are supported by their own data, i.e., a significant interaction effect. Corroboration attempts of statistically significant subgroup differences are rare, and when done, the initially observed subgroup differences are not reproduced. Lastly, statistically significant sex-treatment interactions are only slightly more frequent than what would be expected by chance and there is little evidence of subsequent corroboration or clinical relevance of sex-treatment interactions.">Reproducible Research Practices, Scientific Transparency, and Subgroup Claims: A Meta-Research Dissertation</span> </p> </td> <td> <p> Stanford University </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.31237/osf.io/mcdrn">Konkol, Markus <meta property="datePublished" content="2019-04-05">2019</a> </p> </td> <td> <p> <span title="Open reproducible research (ORR) is the practice of publishing the source code and the datasets that are needed to produce the computational results reported in a paper. Since geoscientific articles often include geostatistical analyses and spatiotemporal data, reproducibility should be a cornerstone of the computational geosciences but is rarely realized. In addition, the current way of publishing scientific outcomes, i.e. as static PDFs, does not adequately report on computational aspects. Thus, readers cannot fully understand how the authors came to the conclusions and how robust these are to changes in the analysis. Consequently, it is difficult for reviewers to follow the analysis steps, and for other researchers to reuse existing materials. This dissertation has two overarching goals to tackle these issues: First, it aims at assisting authors in adhering to ORR principles to ensure high scientific standards. The second goal is to reap the benefits that come with papers supplemented by code and data. To achieve these objectives, this thesis reports on five complementary studies to collect and analyze qualitative (i.e. through interviews and a focus group) and quantitative data (i.e. through surveys and reproducibility studies). Based on these studies, this work provides four key contributions: First, it identifies obstacles that prevented geoscientists from publishing ORR. To overcome these barriers, this dissertation suggests concrete and directly applicable strategies. One of these strategies is the executable research compendium (ERC) which encapsulates the paper, code, data, and the entire software environment needed to produce the computational results. Based on that, this work presents the design and implementation of an ERC-based workflow. It allows authors to convey their computational methods and results by also providing interactive access to code and data, and readers to deeply investigate the computational analysis while reading the actual article, e.g. by changing the parameters of the analysis. 
Finally, this work describes the concept of a binding; a binding connects those code lines and data subsets that produce a specific result, e.g. a figure or number. By also considering user interface widgets (e.g. a slider), this approach allows readers to interactively manipulate the parameters of the analysis to see how these changes affect the result. To conclude, the contributions include (i) a set of obstacles which prevent geoscientists from publishing ORR, (ii) concepts and tools to overcome the identified barriers, (iii) incentives and opportunities that come with attached code and data, and (iv) solutions to realize the incentives which eventually result in a higher number of open and reproducible research.">Publishing Reproducible Geoscientific Papers: Status quo, benefits, and opportunities</span> </p> </td> <td> <p> University of Münster </p> </td> </tr> <tr> <td> <p> <a href="https://www.repository.cam.ac.uk/handle/1810/283607">Trisovic, Ana <meta property="datePublished" content="2018">2018</a> </p> </td> <td> <p> <span title="This dissertation presents the first study of data preservation and research reproducibility in data science at the Large Hadron Collider at CERN. In particular, provenance capture of the experimental data and the reproducibility of physics analyses at the LHCb experiment were studied. First, the preservation of the software and hardware dependencies of the LHCb experimental data and simulations was investigated. It was found that the links between the data processing information and the datasets themselves were obscure. In order to document these dependencies, a graph database was designed and implemented. The nodes in the graph represent the data with their processing information, software and computational environment, whilst the edges represent their dependence on the other nodes. The database provides a central place to preserve information that was previously scattered across the LHCb computing infrastructure. Using the developed database, a methodology to recreate the LHCb computational environment and to execute the data processing on the cloud was implemented with the use of virtual containers. It was found that the produced physics events were identical to the official LHCb data, meaning that the system can aid in data preservation. Furthermore, the developed method can be used for outreach purposes, providing a streamlined way for a person external to CERN to process and analyse the LHCb data. Following this, the reproducibility of data analyses was studied. A data provenance tracking service was implemented within the LHCb software framework \textsc{Gaudi}. The service allows analysts to capture their data processing configurations that can be used to reproduce a dataset within the dataset itself. Furthermore, to assess the current status of the reproducibility of LHCb physics analyses, the major parts of an analysis were reproduced by following methods described in publicly and internally available documentation. This study allowed the identification of barriers to reproducibility and specific points where documentation is lacking. With this knowledge, one can specifically target areas that need improvement and encourage practices that would improve reproducibility in the future. Finally, contributions were made to the CERN Analysis Preservation portal, which is a general knowledge preservation framework developed at CERN to be used across all the LHC experiments. 
In particular, the functionality to preserve source code from git repositories and Docker images in one central location was implemented.">Data preservation and reproducibility at the LHCb experiment at CERN</span> </p> </td> <td> <p> University of Cambridge </p> </td> </tr> <tr> <td> <p> <a href="https://arxiv.org/abs/2012.02570">Feger, Sebastian <meta property="datePublished" content="2020-11-10">2020</a> </p> </td> <td> <p> <span title="Reproducibility should be a cornerstone of science as it enables validation and reuse. In recent years, the scientific community and the general public became increasingly aware of the reproducibility crisis, i.e. the wide-spread inability of researchers to reproduce published work, including their own. Scientific research is increasingly focused on the creation, observation, processing, and analysis of large data volumes. On the one hand, this transition towards computational and data-intensive science poses new challenges for research reproducibility and reuse. On the other hand, increased availability and advances in computation and web technologies offer new opportunities to address the reproducibility crisis. This thesis reports on user-centered design research conducted at CERN, a key laboratory in data-intensive particle physics. In this thesis, we build a wider understanding of researchers' interactions with tools that support research documentation, preservation, and sharing. From a Human-Computer Interaction (HCI) perspective the following aspects are fundamental: (1) Characterize and map requirements and practices around research preservation and reuse. (2) Understand the wider role and impact of research data management (RDM) tools in scientific workflows. (3) Design tools and interactions that promote, motivate, and acknowledge reproducible research practices. Research reported in this thesis represents the first systematic application of HCI methods in the study and design of interactive tools for reproducible science. We advocate the unique role of HCI in supporting, motivating, and transforming reproducible research practices through the design of tools that enable effective RDM. This thesis paves new ways for interaction with RDM tools that support and motivate reproducible science.">Interactive Tools for Reproducible Science -- Understanding, Supporting, and Motivating Reproducible Science Practices</span> </p> </td> <td> <p> University of Munich </p> </td> </tr> <tr> <td> <p> <a href="https://zenodo.org/record/6403282">Leipzig, Jeremy <meta property="datePublished" content="2021-12-01">2021</a> </p> </td> <td> <p> <span title='Purpose: The purpose of this dissertation is to investigate the feasibility of using tests of robustness in peer review. This study involved selecting three high-impact papers which featured open data and utilized bioinformatic analyses but provided no source code and refactoring these to allow external survey participants to swap tools, parameters, and data subsets to evaluate the robustness and underlying validity of these analyses. Technical advances that have taken place in recent years - scientific computing infrastructure has matured to support the distribution of reproducible computational analyses - enable this approach. These advances, along with cultural shifts encompassing open data and open code initiatives, promise to address technical stumbling blocks that have contributed to the "reproducibility crisis." 
To take full advantage of these developments toward improving scientific quality, authors, reviewers, and publishers must integrate reproducible analysis into the peer review process. Seven existing major case study types - reproduction, replication, refactor, robustness test, survey, census, and case narrative - have been invaluable toward establishing reproducibility as a serious and independent area of research. Of particular interest are refactors, in which an existing analysis with abstract methods is reimplemented by a third party, and robustness tests, which involve the manipulation of tools, parameters, and data to assess the scientific validity of an analysis. This thesis describes efforts to test the feasibility of robustness testing in the context of in silico peer review. The contributions described are complemented with extensive source code. Design and Methods: A multi-method approach was employed for this study consisting of user surveys and tests of robustness - hands-on, self-directed software development exercises. Three high-impact genomics publications with open data, but no source code, were selected, refactored, and distributed to active study participants who acted as quasi-external reviewers. The process of the refactor was used to evaluate the limitations of reproducibility using conventional tools and to study how best to present analyses for peer review, and the tests of robustness were employed under the hypothesis this practice would help to evaluate the underlying validity of an analysis. Three different approaches were taken in these tests of robustness - a faithful reproduction of the original manuscript into a framework that could be manipulated by participants, a workflow-library approach in which participants were encouraged to employ modern "off-the-shelf" pre-built pipelines to triangulate tests, and an advisor-led approach in which senior experts suggested alternate tools to be implemented and I generated a report for their evaluation. Findings: The refactors and tests of robustness produced numerous discoveries both in terms of the underlying scientific content and, more importantly, into the strengths and weakness of the three robustness approaches (faithful/workflow-library/advisor-led) and pain points in the analytic stack, which may be addressed with appropriate software and metadata. The principal findings are that the faithful approach may often discourage aggressive robustness testing because of the inertia imposed by the existing framework, the workflow-library approach is efficient but can prove inconclusive, and the advisor-led approach may be most practical for journals but requires a higher level of communication to be effective. The vast majority of time in all these refactors was spent on sample metadata management, particularly organizing sample groups of biological and technical replicates to produce the numerous and varied tool input manifests. Practical Implications: Reproducibility-enabled in silico peer review is substantially more time-consuming than traditional manuscript peer review and will require economic, cultural, and technical change to bring to reality. 
The work presented here could contribute to developing new models to minimize the increased effort of this type of peer review while incentivizing reproducibility.Value: This study provides practical guidance toward designing the future of reproducibility-enabled in silico peer review, which is a logical extension of the computational reproducibility afforded by technical advances in dependency management, containerization, pipeline frameworks, and notebooks.'>Tests of Robustness in Peer Review</span> </p> </td> <td> <p> Drexel University </p> </td> </tr> <tr> <td> <p> <a href="https://zenodo.org/record/4768096">Nüst, Daniel <meta property="datePublished" content="2022-02-14">2022</a> </p> </td> <td> <p> <span title="Reproducibility of computational research, i.e., research based on code and data, poses enormous challenges to all branches of science. In this dissertation, technologies and practices are developed to increase reproducibility and to connect it better with the process of scholarly communication with a particular focus on geography, geosciences, and GIScience. Based on containerisation, this body of work creates a platform that connects existing academic infrastructures with a newly established executable research compendium (ERC). It is shown how the ERC can improve transparency, understandability, reproducibility, and reusability of research outcomes, e.g., for peer review, by capturing all parts of a workflow for computational research. The core part of the ERC platform is software that can automatically capture the computing environment, requiring authors only to create computational notebooks, which are digital documents that combine text and analysis code. The work further investigates how containerisation can be applied independent of ERCs to package complex workflows using the example of remote sensing, to support data science in general, and to facilitate diverse use cases within the R language community. Based on these technical foundations, the work concludes that functioning practical solutions exist for making reproducibility possible through infrastructure and making reproducibility easy through user experience. Several downstream applications built on top of ERCs provide novel ways to discover and inspect the next generation of publications. To understand why reproducible research has not been widely adopted and to contribute to the propagation of reproducible research practices, the dissertation continues to investigate the state of reproducibility in GIScience and develops and demonstrates workflows that can better integrate the execution of computational analyses into peer review procedures. We make recommendations for how to (re)introduce reproducible research into peer reviewing and how to make practices to achieve the highest possible reproducibility normative, rewarding, and, ultimately, required in science. These recommendations are rest upon over 100 GIScience papers which were assessed as irreproducible, the experiences from over 30 successful reproductions of workflows across diverse scientific fields, and the lessons learned from implementing the ERC. Besides continuing the development of the contributed concepts and infrastructure, the dissertation points out broader topics of future work, such as surveying practices for code execution during peer review of manuscripts, or reproduction and replication studies of the fundamental works in the considered scientific disciplines. 
The technical and social barriers to higher reproducibility are strongly intertwined with other transformations in academia, and, therefore, improving reproducibility meets similar challenges around culture change and sustainability. However, we clearly show that reproducible research is achievable today using the newly developed infrastructures and practices. The transferability of cross-disciplinary lessons facilitates the establishment of reproducible research practices and, more than other transformations, the movement towards greater reproducibility can draw from accessible and convincing arguments both for individual researchers as well as for their communities.">Infrastructures and Practices for Reproducible Research in Geography, Geosciences, and GIScience</span> </p> </td> <td> <p> University of Münster </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/2314809140">Melcher, Wiebke <meta property="datePublished" content="20190101">2019</a> </p> </td> <td> <p> <span title="abstract">Free will in psychological research : considerations on methodic procedure and reproducibility of results</span> </p> </td> <td> <p> Leuphana University </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/1913432747">Abang Ibrahim, Dayang <meta property="datePublished" content="20160101">2016</a> </p> </td> <td> <p> <span title="abstract">The exploitation of provenance and versioning in the reproduction of e-experiments</span> </p> </td> <td> <p> University of Newcastle Upon Tyne </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/2506658489">Henderson, Peter <meta property="datePublished" content="20180101">2018</a> </p> </td> <td> <p> <span title="abstract">Reproducibility and Reusability in Deep Reinforcement Learning</span> </p> </td> <td> <p> McGill University </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/2089435197">Drimer-Batca, Daniel <meta property="datePublished" content="20180101">2018</a> </p> </td> <td> <p> <span title="abstract">Reproducibility Crisis in Science: Causes and Possible Solutions</span> </p> </td> <td> <p> Boston University  </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/2564441231">Matheson, Granville <meta property="datePublished" content="20180101">2018</a> </p> </td> <td> <p> <span title="abstract">Reliability, Replicability and Reproducibility in Pet Imaging</span> </p> </td> <td> <p> Karolinska Institutet </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/1859059701">Patil, Prasad <meta property="datePublished" content="20160101">2016</a> </p> </td> <td> <p> <span title="abstract">Assessing reproducibility and value in genomic signatures</span> </p> </td> <td> <p> The Johns Hopkins University  </p> </td> </tr> <tr> <td> <p> <a href="https://www.proquest.com/docview/1827523160">Ahmad, MKH <meta property="datePublished" content="20160101">2016</a> </p> </td> <td> <p> <span title="abstract">Scientific workflow execution reproducibility using cloud-aware provenance</span> </p> </td> <td> <p> University of the West of England, Bristol </p> </td> </tr><tr> <td> <p> <a href="https://doi.org/10.22032/dbt.40396">Samuel, Sheeba <meta property="datePublished" content="20122019">2019</a> </p> </td> <td> <p> <span title="Understandability and reproducibility of scientific results are vital in every field of science. Several reproducibility measures are being taken to make the data used in the publications findable and accessible. 
However, there are many challenges faced by scientists from the beginning of an experiment to the end, in particular for data management. The explosive growth of heterogeneous research data and understanding how this data has been derived is one of the research problems faced in this context. Interlinking the data, the steps and the results from the computational and non-computational processes of a scientific experiment is important for the reproducibility. We introduce the notion of 'end-to-end provenance management' of scientific experiments to help scientists understand and reproduce the experimental results. The main contributions of this thesis are: (1) We propose a provenance model 'REPRODUCE-ME' to describe the scientific experiments using semantic web technologies by extending existing standards. (2) We study computational reproducibility and important aspects required to achieve it. (3) Taking into account the REPRODUCE-ME provenance model and the study on computational reproducibility, we introduce our tool, ProvBook, which is designed and developed to demonstrate computational reproducibility. It provides features to capture and store provenance of Jupyter notebooks and helps scientists to compare and track their results of different executions. (4) We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility) for the end-to-end provenance management. This collaborative framework allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational steps in an interoperable way. We apply our contributions to a set of scientific experiments in microscopy research projects.">A provenance-based semantic approach to support understandability, reproducibility, and reuse of scientific experiments</span> </p> </td> <td> <p> Friedrich-Schiller-Universität Jena </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.22032/dbt.40396">Wayant, Christian Cole <meta property="datePublished" content="20210924">2021</a> </p> </td> <td> <p> <span title="The burden of cancer in the United States and abroad is comprised of significant morbidity, mortality, and psychological or financial harms. There remains a concern that the influence of published research is not maximized because of bias, lack of reproducibility, and suboptimal transparency. This dissertation comprises 10 investigations of such shortcomings. As a result of these 10 studies we first found that oncology journal policies on reporting guidelines and trial registration could be improved to strengthen the transparency in published research. We found that key improvements to oncology interventions in trials could facilitate better translation of published results to daily clinical practice. An investigation of financial relationships between oncologist-authors of influential trials and pharmaceutical drug firms uncovered pervasive, large, often undisclosed conflicts of interest. In a cohort of published trials, we found that oncologist authors misrepresented or distorted their findings to highlight favorable findings, even if this meant downplaying patient-centered endpoint results. We evaluated the potential harm from the publication of interim trial reports before patient-centered endpoints have accrued the necessary events to be fully powered.
We reviewed a broad cohort of drug advertisements and found that drug firms omitted endpoints that were unfavorable, potentially compromising the integrity of the drug's advertised efficacy. We found that noninferiority trials, which are increasingly important in oncology research, were poorly designed and used statistical practices which may compromise their robustness. We turned to systematic reviews, finding that one's ability to reproduce the results of oncology meta-analyses was compromised by incomplete reporting of basic patient data. We found a significant risk of bias in systematic reviews cited by prominent cancer practice guidelines. We investigated prominent cancer practice guidelines and found that patient values and preferences were undervalued. Altogether, the results of these ten studies indicate that oncology research requires a number of major and minor improvements to maximize its ability to work fully for the patient's benefit.">Rigor and reproducibility of cancer medicine evidence</span> </p> </td> <td> <p> Oklahoma State University </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.5821/dissertation-2117-351121">Vilaró Pacheco, Marta <meta property="datePublished" content="20210715">2021</a> </p> </td> <td> <p> <span title="This PhD thesis is concerned with evaluating the long-term effects of interventions in the editorial peer review process of a biomedical journal. The growing need to increase the value of research and avoid waste motivates this work. We focus on evaluating the long-term impact on the number of citations (NC) of articles introduced in trials, as this will allow us to evaluate the effects of adding either a methodological expert (statistician) or reporting guidelines (RG) during the peer review process. In 2004 and 2009, two randomized trials were published in Medicina Clínica (Elsevier, Barcelona), in which these interventions were added during the editorial process. They showed a slightly positive effect on paper quality, as assessed by the Manuscript Quality Assessment Instrument (MQAI) of Goodman et al. In this work, we explore the effect on NC by collecting from Web of Science (WoS) the NC of each article that had been randomized in the two previous studies. This thesis presents different ways of analyzing count data involving time-to-event, first from a perspective of count data models and then from a perspective of recurrent events methods. Results show that including a methodological reviewer (for example, a senior statistician) who is dedicated to looking for missing RG items increases the NC by 40% (95% CI: 1% to 94%) when considering that citations are independent within articles, like count models. When considering the within-article correlation of citations using the frailty gamma model, we find that including a methodological reviewer during the peer-review process increases the possibility of receiving a citation by 41% (95% CI: 0% to 102%). Independently of the method used to analyze NC, and with different assumptions, the consistency of those results gives robustness to the findings. Therefore, as measured by the NC, with this work we show that randomized studies of using interventions in the peer review process to improve scientific impact are feasible.
Our proof of concept study opens the door for the development of confirmatory trials.">Long-term impact of an editorial intervention to improve paper transparency and reproducibility</span> </p> </td> <td> <p> Universitat Politècnica de Catalunya </p> </td> </tr> <tr> <td> <p> <a href="https://etda.libraries.psu.edu/files/final_submissions/29211">Sebastian, Aswathy <meta property="datePublished" content="20231013">2023</a> </p> </td> <td> <p> <span title="Bioinformatics research involves managing various types of data and employing diverse computational methods. Workflows consolidate the discrete data analytics tasks into a unified process. Automation, defined as using computational technologies with minimal human intervention, proves vital in addressing the increasing complexity of workflows and meeting scientific standards for reproducibility, scalability, and reusability. Nevertheless, integrating varied datasets and software tools remains challenging. This dissertation focuses on identifying suitable data analysis techniques, creating reusable and reproducible analysis pipelines, and using them for scientific discovery. The developed computational methods are applied explicitly to areas such as genome assembly, transcriptome analysis, and metabarcoding studies.">Advancing Genomic and Transcriptomic Knowledge Through Reproducible Bioinformatics Workflows</span> </p> </td> <td> <p> Penn State University </p> </td> </tr> <!--dissertation_placeholder--> </tbody> </table>

Tool reviews

<table> <tbody> <tr> <td> <p> Authors/Date </p> </td> <td> <p> Title </p> </td> <td> <p> Tools </p> </td> </tr> <tr> <td> <p> <a href="http://doi.org/10.1109/eScience.2019.00017">Isdahl &amp; Gundersen <meta property="datePublished" content="2019-09-01">2019</a> </p> </td> <td> <p> <span title="Even machine learning experiments that are fully conducted on computers are not necessarily reproducible. An increasing number of open source and commercial, closed source machine learning platforms are being developed that help address this problem. However, there is no standard for assessing and comparing which features are required to fully support reproducibility. We propose a quantitative method that alleviates this problem. Based on the proposed method we assess and compare the current state of the art machine learning platforms for how well they support making empirical results reproducible. Our results show that BEAT and Floydhub have the best support for reproducibility with Codalab and Kaggle as close contenders. The most commonly used machine learning platforms provided by the big tech companies have poor support for reproducibility.">Out-of-the-box Reproducibility: A Survey of Machine Learning Platforms</span> </p> </td> <td> <p> MLflow, Polyaxon, StudioML, Kubeflow, CometML, Sagemaker, GCPML, AzureML, Floydhub, BEAT, Codalab, Kaggle </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1145/3311955">Pimentel et al <meta property="datePublished" content="2019-06-19">2019</a> </p> </td> <td> <p> <span title="Scripts are widely used to design and run scientific experiments. Scripting languages are easy to learn and use, and they allow complex tasks to be specified and executed in fewer steps than with traditional programming languages. However, they also have important limitations for reproducibility and data management. As experiments are iteratively refined, it is challenging to reason about each experiment run (or trial), to keep track of the association between trials and experiment instances as well as the differences across trials, and to connect results to specific input data and parameters. Approaches have been proposed that address these limitations by collecting, managing, and analyzing the provenance of scripts. In this article, we survey the state of the art in provenance for scripts. We have identified the approaches by following an exhaustive protocol of forward and backward literature snowballing. Based on a detailed study, we propose a taxonomy and classify the approaches using this taxonomy.">A Survey on Collecting, Managing, and Analyzing Provenance from Scripts</span> </p> </td> <td> <p> Astro-Wise, CPL, CXXR, Datatrack, ES3, ESSW, IncPy, Lancet, Magni, noWorkflow, Provenance Curious, pypet, RDataTracker, Sacred, SisGExp, SPADE, StarFlow, Sumatra, Variolite, VCR, versuchung, WISE, YesWorkflow </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1016/j.patter.2021.100322">Leipzig et al <meta property="datePublished" content="2021-09-10">2021</a> (<a href="https://github.com/leipzig/metadata-in-rcr">supplemental</a>) </p> </td> <td> <p> <span title="Reproducible computational research (RCR) is the keystone of the scientific method for in silico analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, RCR has the capacity to significantly accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting RCR. 
Metadata provides context and provenance to raw data and methods and is essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described the relationship between metadata and RCR. This article employs a functional content analysis to identify metadata standards that support RCR functions across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our article provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.">The Role of Metadata in Reproducible Computational Research</span> </p> </td> <td> <p> CellML, CIF2, DATS, DICOM, EML, FAANG, GBIF, GO, ISO/TC 276, MIAME, NetCDF, OGC, ThermoML, CRAN, Conda, pip setup.cfg, EDAM, CodeMeta, Biotoolsxsd, DOAP, ontosoft, SWO, OBCS, STATO, SDMX, DDI, MEX, MLSchema, MLFlow, Rmd, CWL, CWLProv, RO-Crate, RO, WICUS, OPM, PROV-O, ReproZip, ProvOne, WES, BagIt, BCO, ERC, BEL, DC, JATS, ONIX, MeSH, LCSH, MP, Open PHACTS, SWAN, SPAR, PWO, PAV, Manubot, ReScience, PandocScholar </p> </td> </tr> <tr> <td> <p> <a href="https://doi.org/10.1186/s41073-020-00095-y">Konkol et al <meta property="datePublished" content="2020-06-15">2020</a> </p> </td> <td> <p> <span title="The trend toward open science increases the pressure on authors to provide access to the source code and data they used to compute the results reported in their scientific papers. Since sharing materials reproducibly is challenging, several projects have developed solutions to support the release of executable analyses alongside articles. We reviewed 11 applications that can assist researchers in adhering to reproducibility principles. The applications were found through a literature search and interactions with the reproducible research community. An application was included in our analysis if it (i) was actively maintained at the time the data for this paper was collected, (ii) supports the publication of executable code and data, (iii) is connected to the scholarly publication process. By investigating the software documentation and published articles, we compared the applications across 19 criteria, such as deployment options and features that support authors in creating and readers in studying executable papers. From the 11 applications, eight allow publishers to self-host the system for free, whereas three provide paid services. Authors can submit an executable analysis using Jupyter Notebooks or R Markdown documents (10 applications support these formats). All approaches provide features to assist readers in studying the materials, e.g., one-click reproducible results or tools for manipulating the analysis parameters. Six applications allow for modifying materials after publication.">Publishing computational research - a review of infrastructures for reproducible and transparent scholarly communication</span> </p> </td> <td> <p> Authorea, Binder, CodeOcean, eLife RDS, Galaxy Project, Gigantum, Manuscript, o2r, REANA, ReproZip, Whole Tale </p> </td> </tr> <!--tools_placeholder--> </tbody> </table>
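
The common thread across the platforms reviewed above is experiment tracking: runs, parameters, metrics, and artifacts are recorded so a result can be traced back to the code and inputs that produced it. The snippet below is a minimal, illustrative sketch using MLflow, one of the reviewed tools; the experiment, parameter, and metric names are hypothetical and are not drawn from any of the reviewed papers.

```python
# Minimal MLflow tracking sketch (illustrative only; all names are hypothetical).
# Requires: pip install mlflow
import mlflow

mlflow.set_experiment("reproducibility-demo")  # hypothetical experiment name

with mlflow.start_run():
    # Record the inputs that define this run
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("random_seed", 42)

    # ... training or analysis would happen here ...
    rmse = 0.78  # placeholder result

    # Record the outputs so the run can be compared and reproduced later
    mlflow.log_metric("rmse", rmse)
```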

Courses

Development Resources

Literature tools

Scientific Data Management Systems

Books

Databases

Data Repositories

All these repositories assign Digital Object Identifiers (DOIs) to data
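
Because each deposit receives a DOI, dataset metadata can also be retrieved programmatically. The sketch below uses standard DOI content negotiation against doi.org to fetch machine-readable (CSL JSON) metadata; the DOI shown is a placeholder, not a real deposit.

```python
# Resolve a dataset DOI to machine-readable metadata via DOI content negotiation.
# The DOI below is a placeholder; substitute a real one from any repository above.
import requests

doi = "10.5281/zenodo.1234567"  # hypothetical Zenodo-style DOI

resp = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
    timeout=30,
)
resp.raise_for_status()

meta = resp.json()
print(meta.get("title"), meta.get("publisher"))
```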

Exemplar Portals

Places to find papers with code or portals to host them

Runnable Papers

Experimental papers that have associated notebooks

Haibe-Kains lab

<table> <tbody> <tr> <th> Publication </th> <th> CodeOcean link </th> </tr> <tr> <td> <a href="https://doi.org/10.1101/471227">Mer AS et al. Integrative Pharmacogenomics Analysis of Patient Derived Xenografts</a> </td> <td> <a href="https://codeocean.com/capsule/0566399">codeocean.com/capsule/0566399</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/052910">Gendoo, Zon et al. MetaGxData: Clinically Annotated Breast, Ovarian and Pancreatic Cancer Datasets and their Use in Generating a Multi-Cancer Gene Signature</a> </td> <td> <a href="https://codeocean.com/capsule/6438633">codeocean.com/capsule/6438633</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1093/jamia/ocx062">Yao et al. Tissue specificity of in vitro drug sensitivity</a> </td> <td> <a href="https://codeocean.com/capsule/5502756">codeocean.com/capsule/5502756</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1038/s41467-017-02136-5">Safikhani Z et al. Gene isoforms as expression-based biomarkers predictive of drug response in vitro</a> </td> <td> <a href="https://codeocean.com/capsule/0002901">codeocean.com/capsule/0002901</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1158/0008-5472.CAN-17-0096">El-Hachem et al. Integrative cancer pharmacogenomics to infer large-scale drug taxonomy</a> </td> <td> <a href="https://codeocean.com/capsule/4252248">codeocean.com/capsule/4252248</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.12688/f1000research.9611.3">Safikhani Z et al. Revisiting inconsistency in large pharmacogenomic studies</a> </td> <td> <a href="https://codeocean.com/capsule/6276064">codeocean.com/capsule/6276064</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/355602">Sandhu V et al. Meta-analysis of 1,200 transcriptomic profiles identifies a prognostic model for pancreatic ductal adenocarcinoma</a> </td> <td> <a href="https://codeocean.com/capsule/2693620">codeocean.com/capsule/2693620</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1093/bib/bbab294">Sharifi-Noghabi et al Drug sensitivity prediction from cell line-based pharmacogenomics data: guidelines for developing machine learning models</a> </td> <td> <a href="https://codeocean.com/capsule/7358839">codeocean.com/capsule/7358839</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1016/j.phro.2021.04.001">Arrowsmith et al Automated detection of dental artifacts for large-scale radiomic analysis in radiation oncology</a> </td> <td> <a href="https://codeocean.com/capsule/2097894">codeocean.com/capsule/2097894</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1038/s41467-021-21233-0">Mer et al Biological and therapeutic implications of a unique subtype of NPM1 mutated AML</a> </td> <td> <a href="https://codeocean.com/capsule/8791617">codeocean.com/capsule/8791617</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/2020.09.08.287573">Ortmann et al KuLGaP: A Selective Measure for Assessing Therapy Response in Patient-Derived Xenografts</a> </td> <td> <a href="https://codeocean.com/capsule/2817911">codeocean.com/capsule/2817911</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1038/s41467-020-20830-9">Madani Tonekaboni et al Large organized chromatin lysine domains help distinguish primitive from differentiated cell populations</a> </td> <td> <a href="https://codeocean.com/capsule/6911149">codeocean.com/capsule/6911149</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1093/nar/gkaa421">Seo et al SYNERGxDB: an integrative pharmacogenomic portal to identify synergistic drug combinations for precision oncology</a> </td> <td> <a href="https://codeocean.com/capsule/6322807">codeocean.com/capsule/6322807</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1038/s41597-019-0174-7">Mammoliti et al Creating reproducible pharmacogenomic analysis pipelines</a> </td> <td> <a href="https://codeocean.com/capsule/6718332">codeocean.com/capsule/6718332</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1158/0008-5472.CAN-19-0179">Manem et al Modeling Cellular Response in Large-Scale Radiogenomic Databases to Advance Precision Radiotherapy</a> </td> <td> <a href="https://codeocean.com/capsule/1166221">codeocean.com/capsule/1166221</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/222562">Tonekaboni et al CREAM: Clustering of genomic REgions Analysis Method</a> </td> <td> <a href="https://codeocean.com/capsule/0002901">codeocean.com/capsule/0002901</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1093/bioinformatics/btz485">Madani Tonekaboni et al SIGN: similarity identification in gene expression</a> </td> <td> <a href="https://codeocean.com/capsule/0544852">codeocean.com/capsule/0544852</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1158/0008-5472.CAN-19-0349">Mer et al Integrative Pharmacogenomics Analysis of Patient-Derived Xenografts</a> </td> <td> <a href="https://codeocean.com/capsule/0566399">codeocean.com/capsule/0566399</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1007/978-981-32-9816-3_19">Sandhu et al Applications of Computational Systems Biology in Cancer Signaling Pathways</a> </td> <td> <a href="https://codeocean.com/capsule/0795540">codeocean.com/capsule/0795540</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1200/CCI.18.00102">Sandhu et al Meta-Analysis of 1,200 Transcriptomic Profiles Identifies a Prognostic Model for Pancreatic Ductal Adenocarcinoma</a> </td> <td> <a href="https://codeocean.com/capsule/7402260">codeocean.com/capsule/7402260</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/052910">Gendoo et al MetaGxData: Clinically Annotated Breast, Ovarian and Pancreatic Cancer Datasets and their Use in Generating a Multi-Cancer Gene Signature</a> </td> <td> <a href="https://codeocean.com/capsule/6438633">codeocean.com/capsule/6438633</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1093/jamia/ocx062">Yao et al Tissue specificity of in vitro drug sensitivity</a> </td> <td> <a href="https://codeocean.com/capsule/5502756">codeocean.com/capsule/5502756</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1038/s41467-017-01153-8">Safikhani et al Gene isoforms as expression-based biomarkers predictive of drug response in vitro</a> </td> <td> <a href="https://codeocean.com/capsule/0002901">codeocean.com/capsule/0002901</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1158/0008-5472.CAN-17-0096">El-Hachem et al Integrative Cancer Pharmacogenomics to Infer Large-Scale Drug Taxonomy</a> </td> <td> <a href="https://codeocean.com/capsule/4252248">codeocean.com/capsule/4252248</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.12688/f1000research.9611.3">Safikhani et al Revisiting inconsistency in large pharmacogenomic studies</a> </td> <td> <a href="https://codeocean.com/capsule/6276064">codeocean.com/capsule/6276064</a> </td> </tr> <!--codeocean_placeholder--> </tbody> </table>

Pachter lab

<table> <tr> <th> Publication </th> <th> Github link </th> </tr> <tr> <td> <a href="https://www.nature.com/articles/nmeth.4324">Pimentel et al 2017. Differential analysis of RNA-seq incorporating quantification uncertainty</a> </td> <td> <a href="https://github.com/pachterlab/sleuth_paper_analysis">sleuth_paper_analysis</a> </td> </tr> <tr> <td> <a href="https://www.biorxiv.org/content/10.1101/673285v1">Melsted et al 2019. Modular and efficient pre-processing of single-cell RNA-seq</a> </td> <td> <a href="https://github.com/pachterlab/MBGBLHGP_2019">MBGBLHGP_2019</a> </td> </tr> <tr> <td> <a href="https://doi.org/10.1101/2021.01.22.427844">Chari et al 2021. Whole Animal Multiplexed Single-Cell RNA-Seq Reveals Plasticity of Clytia Medusa Cell Types</a> </td> <td> <a href="https://github.com/pachterlab/CWGFLHGCCHAP_2021">CWGFLHGCCHAP_2021</a> </td> </tr> </table>

Siepel lab

<table> <tr> <th> Publication </th> <th> CodeOcean link </th> </tr> <tr> <td> <a href="https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-021-00949-x">Blumberg et al 2021. Characterizing RNA stability genome-wide through combined analysis of PRO-seq and RNA-seq data</a> </td> <td> <a href="https://codeocean.com/capsule/7351682">codeocean.com/capsule/7351682</a> </td> </tr> </table>

Journals

Ontologies

Minimal Standards

Organizations

Awesome Lists

Contribute

Contributions welcome! Read the contribution guidelines first. You may find my src/doi2md.py script useful for quickly generating entries from a DOI.
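
The actual script lives in the repository; as a rough illustration of the general approach (not the script's real implementation), one can ask doi.org for a formatted citation via content negotiation and wrap it in Markdown:

```python
# Rough illustration of turning a DOI into a Markdown list entry.
# This is NOT the real src/doi2md.py; it only sketches the general idea.
import sys

import requests


def doi_to_markdown(doi: str) -> str:
    """Fetch a formatted citation for `doi` and return a Markdown bullet."""
    resp = requests.get(
        f"https://doi.org/{doi}",
        headers={"Accept": "text/x-bibliography; style=apa"},
        timeout=30,
    )
    resp.raise_for_status()
    citation = resp.text.strip()
    return f"* [{citation}](https://doi.org/{doi})"


if __name__ == "__main__":
    # Usage: python doi2md_sketch.py 10.1000/example-doi
    print(doi_to_markdown(sys.argv[1]))
```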

License

CC0

To the extent possible under law, Jeremy Leipzig has waived all copyright and related or neighboring rights to this work.