A curated list of reproducible research case studies, projects, tutorials, and media
- Case studies
- Ad-hoc reproductions
- Courses
- Development Resources
- User tools
- Books
- Data Repositories
- Examples and exemplars
- Journals
- Ontologies
- Organizations
- Awesome Lists
The term "case studies" is used here in a general sense to describe any study of reproducibility. A reproduction is an attempt to arrive at comparable results with identical data, using the computational methods described in a paper. A refactor restructures existing code to follow frameworks and other reproducibility best practices while preserving the original data. A replication involves generating new data and applying existing methods to achieve comparable results. A robustness test applies various statistical models or parameters to a given data set to study their effect on results. A census is a high-level tabulation conducted by a third party. A survey is a questionnaire sent to practitioners. A case narrative is an in-depth first-person account. A theoretical case study measures global reproducibility using non-empirical evidence. An independent discussion uses a second, independent author to interpret the results of a study as a means to improve inferential reproducibility.
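The distinction between a reproduction, a replication, and a robustness test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, and `simulate_study`, `mean_difference`, and the `trim` parameter are hypothetical names that do not come from any study listed here.

```python
import random
import statistics


def mean_difference(treated, control, trim=0):
    """Difference in group means, dropping the `trim` smallest and largest values per group."""
    def trimmed(values):
        ordered = sorted(values)
        return ordered[trim:len(ordered) - trim]
    return statistics.mean(trimmed(treated)) - statistics.mean(trimmed(control))


def simulate_study(seed, n=50, effect=0.5):
    """Generate hypothetical data for one study (an assumption for illustration only)."""
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(effect, 1.0) for _ in range(n)]
    return treated, control


# The "original study": its data and the estimate it reported.
original_treated, original_control = simulate_study(seed=1)
published_estimate = mean_difference(original_treated, original_control)

# Reproduction: identical data, same method as described in the paper.
reproduced = mean_difference(original_treated, original_control)

# Replication: newly generated data, same method.
new_treated, new_control = simulate_study(seed=2)
replicated = mean_difference(new_treated, new_control)

# Robustness test: same data, varying an analysis parameter to see its effect.
robustness = {
    trim: mean_difference(original_treated, original_control, trim=trim)
    for trim in (0, 2, 5)
}

print("reproduced matches published:", reproduced == published_estimate)
print("replicated estimate:", replicated)
print("robustness estimates:", robustness)
```

In this toy setup the reproduction matches the published estimate exactly because the data and method are identical, while the replication and the robustness variants are expected to differ somewhat.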
| Study | Field | Approach | Size |
| --- | --- | --- | --- |
| | Science | Theoretical | (all studies) |
| | Medicine | Census | 80 studies |
| | Cancer biology | Refactor | 8 studies |
| | Biostatistics | Census | 56 studies |
| | Genetics | Reproduction | 18 studies |
| | Software engineering | Replication | 4 companies |
| | Signal processing | Census | 134 papers |
| | Biomedical sciences | Survey | 23 PIs |
| | Bioinformatics | Census | 100 studies |
| | Cancer biology | Replication | 53 studies |
| | Computer science | Census | 613 papers |
| | Psychology | Replication | 100 studies |
| | Biomedical sciences | Census | 100 papers |
| | Epidemiology | Robustness test | 417 variables |
| | NLP | Replication | 3 studies |
| | Cancer biology | Replication | 9 studies |
| | Biomedical sciences | Census | 318 journals |
| | Science | Case narrative | 31 PIs |
| | Biological sciences | Survey | 704 PIs |
| | Bioinformatics | Refactor | 1 study |
| | Economics | Replication | 18 studies |
| | Machine learning | Census | 30 studies |
| | Archaeology | Case narrative | 1 survey |
| | Comparative toxicogenomics | Census | 51,292 claims in 3,363 papers |
| | Artificial intelligence | Census | 400 papers |
| | Economics | Census | 203 papers |
| | Computational science | Reproduction | 204 articles, 180 authors |
| | Genomics | Case narrative | 1 study |
| | Social sciences | Replication | 21 papers |
| | Psychology | Robustness test | One data set, 29 analyst teams |
| | Medicine and health sciences | Census | 30 papers |
| | Microbiome immuno oncology | Replication | 1 paper |
| | Bioinformatics | Refactor and test of robustness | 1 paper |
| | Biomedical Sciences | Census | 149 papers |
| | Bioinformatics | Synthetic replication & refactor | 1 paper |
| | Geosciences | Survey, Reproduction | 146 scientists, 41 papers |
| | Reinforcement Learning | Reproduction, case narrative | 1 paper |
| | Science & Engineering | Survey | 215 participants |
| | Nephrology | Robustness test | 1 paper |
| | Social sciences & other | Census | 810 Dataverse studies |
| | Geosciences | Survey | 360 papers |
| | Deep learning | Robustness test | 1 analysis |
| | Genomics | Case narrative | 1 analysis |
| | Pharmacogenomics | Case narrative | 2 analyses |
| | Biomedical sciences and Psychology | Census | 127 registered reports |
| | All | Census | 1,159,166 Jupyter notebooks |
| | Anaesthesia | Independent discussion | 1 study |
| | Psychology | Replication | 1 paper |
| | Machine learning | Reproduction | 18 conference papers |
| | Experimental archaeology | Replication | 1 theory |
| | Neurology | Census | 202 papers |
| | Psychology | Replication | 2 experiments |
These are one-off, unpublished attempts to reproduce individual studies.
| Reproduction | Original study |
| --- | --- |
| https://rdoodles.rbind.io/2019/06/reanalyzing-data-from-human-gut-microbiota-from-autism-spectrum-disorder-promote-behavioral-symptoms-in-mice/ and https://notstatschat.rbind.io/2019/06/16/analysing-the-mouse-autism-data/ | Sharon, G. et al. Human Gut Microbiota from Autism Spectrum Disorder Promote Behavioral Symptoms in Mice. Cell 2019, 177 (6), 1600–1618.e17. |
| | Wei, X.; Nielsen, R. CCR5-∆32 Is Deleterious in the Homozygous State in Humans. Nat. Med. 2019 DOI: 10.1038/s41591-019-0459-6. |
- MOOCs
- Coursera Reproducible Research - Roger Peng et al., JHU. Very popular course.
- edX Principles, Statistical and Computational Tools for Reproducible Science - John Quackenbush et al., Harvard
- Online course content
- Tools for Reproducible Research - Karl Broman, UW; includes a resources page
- R for Reproducible Scientific Analysis - Software Carpentry workshop primer using Gapminder data
- R-DAVIS - Student-developed computer literacy and data course in R
- AMIA2019 - Pragmatic RR for Analysis, Dissemination and Publication
- R
- CRAN Task View - Reproducible Research - packages relevant to RCR in R
- liftr - persistent reproducible reporting through containerized R Markdown documents
- repo - provenance framework package
- Open With Binder for Chrome or Firefox - open the GitHub repository you are visiting using MyBinder.org
- DVC - tracks machine learning models and data sets
- Reproducible Research with R and RStudio 2013
- Implementing Reproducible Research 2014 - Describes projects: Sumatra, Vistrails, CDE, SOLE, JUMBO, CML, knitr. Content available on OSF.
- The Practice of Reproducible Research 2017 - 31 first-person case narratives and intro chapters
- Dynamic Documents with R and knitr 2015
All of these repositories assign Digital Object Identifiers (DOIs) to data.
- DataCite - 12M+ DOIs registered for 46 allocators. Offers APIs and a metadata schema.
- Data Dryad - curated, metadata-centric, focused on data associated with published articles, $120 submission fee (various waivers available)
- Figshare - 20 GB of free private space, unlimited public space, >2M articles, >5k projects
- OSF - Project-oriented system with access control and integration with popular tools. Unlimited storage for projects, but individual files are limited to 5 gigabytes (GB) each.
- Zenodo - Supports embargoed and restricted access, plus metadata. 50 GB limit.
- Jupyter Gallery - Gallery of interesting Jupyter notebooks
- Papers With Code - ML papers with code
- ReScience - Journal dedicated to in silico reproductions and tests of robustness, lives on GitHub.
- ReplicationWiki - Replication in the social sciences, particularly economics
- FAIRsharing - standards, databases, and policies
- BioPortal - 660 biomedical ontologies
- ResearchObject.org - RO specifications and publications
- BioCompute - BCO specs
- rOpenSci - Tools, conferences, and education
- Open Science Framework - Open source project management
- pyOpenSci - promotes open and reproducible research through peer-review of scientific Python packages
- Awesome Pipeline - So many pipeline frameworks
- Awesome Docker - Everything related to the Docker containerization system
- Awesome R - Section on RR tools
- Awesome Jupyter - Jupyter projects, libraries and resources
Contributions welcome! Read the contribution guidelines first.
To the extent possible under law, Jeremy Leipzig has waived all copyright and related or neighboring rights to this work.