This repository provides the data and code used for the preprint:
Najko Jahn, 2024. How open are hybrid journals included in transformative agreements? https://arxiv.org/abs/2402.18255
This repository is organized as a research compendium. A research compendium bundles the data, code, and text associated with a research project.
The analysis/ directory contains the manuscript written in R Markdown.
The R Markdown is rendered to a LaTeX document. See the rendered PDF here:
{renv} is used to create a reproducible environment for all the R packages used in the analysis.
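Restoring that environment could look roughly like the sketch below. It assumes the repository root contains an renv.lock file, which this README does not state explicitly, and restore_environment() is a hypothetical helper name used only for illustration.

```r
# Hypothetical helper: restore the package versions recorded by {renv}.
# Assumption: `project` points at the repository root and holds renv.lock.
restore_environment <- function(project = ".") {
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile)) {
    stop("No renv.lock found in ", normalizePath(project))
  }
  # Install {renv} itself if it is not available yet.
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }
  # Reinstall the packages listed in the lockfile into the project library.
  renv::restore(project = project)
}
```

Calling restore_environment() from the repository root should then install the recorded package versions before the manuscript is rendered.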
Data is openly available through an R data package, {hoaddata}. {hoaddata} contains not only the datasets used in the analysis, but also the code used to compile them by querying a cloud-based Google BigQuery data warehouse, into which scholarly big data from Crossref, OpenAlex and Unpaywall were imported. To increase computational reproducibility, data aggregation through {hoaddata} was carried out automatically using GitHub Actions.
The main dataset, providing article-level data about publications linked to transformative agreements and institutions, is available as a {hoaddata} release asset: https://github.com/subugoe/hoaddata/releases/download/v0.2.91/ta_oa_inst.csv.gz
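As a minimal sketch, the release asset can be read into R with base functions only. The helper names read_gz_csv() and fetch_ta_oa_inst() are hypothetical and not part of {hoaddata}. The file holds article-level data, so the download may take a while.

```r
# URL of the release asset, copied from this README.
ta_url <- "https://github.com/subugoe/hoaddata/releases/download/v0.2.91/ta_oa_inst.csv.gz"

# Hypothetical helper: read a gzip-compressed CSV into a data frame.
read_gz_csv <- function(path) {
  # gzfile() decompresses on the fly, so no manual unzip step is needed.
  read.csv(gzfile(path), stringsAsFactors = FALSE)
}

# Hypothetical helper: download the asset to a temporary file and read it.
fetch_ta_oa_inst <- function(url = ta_url) {
  tmp <- tempfile(fileext = ".csv.gz")
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" keeps the binary archive intact on Windows.
  download.file(url, destfile = tmp, mode = "wb")
  read_gz_csv(tmp)
}
```

With a network connection, ta_oa_inst <- fetch_ta_oa_inst() followed by str(ta_oa_inst) gives a first look at the columns.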
To install the {hoaddata} version used in the analysis, run in R:
library(remotes)
remotes::install_github("subugoe/hoaddata@v0.2.91", dependencies = "Imports")
All data underlying the figures can be found in the analysis/fig_data/ folder.
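The figure data could be loaded in one go along the following lines. This sketch assumes the files in that folder are plain CSV files, which this README does not specify, and that R runs from the repository root. read_fig_data() is a hypothetical helper.

```r
# Assumption: the working directory is the repository root.
fig_dir <- file.path("analysis", "fig_data")

# Hypothetical helper: read every CSV file in a folder into a named list.
read_fig_data <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  out <- lapply(files, read.csv, stringsAsFactors = FALSE)
  # Name each data frame after its file, without the extension.
  names(out) <- tools::file_path_sans_ext(basename(files))
  out
}
```

For example, fig_data <- read_fig_data(fig_dir) followed by names(fig_data) lists the available tables.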
Najko Jahn, Data Analyst, SUB Göttingen. najko.jahn@sub.uni-goettingen.de