post: additional data harvested from regensburg
cbroschinski committed Feb 6, 2024
1 parent 68197de commit 567ef54
Showing 5 changed files with 162 additions and 0 deletions.
110 changes: 110 additions & 0 deletions Rmd/2024-02-06-regensburg.Rmd
@@ -0,0 +1,110 @@
---
layout: post
author: Christoph Broschinski
author_lnk: https://github.com/cbroschinski
title: Additional APC data harvested from Regensburg University
date: 2024-02-06 07:00:00
summary:
categories: [general, openAPC]
comments: true
---


```{r, echo = FALSE}
knitr::opts_knit$set(base.url = "/")
knitr::opts_chunk$set(
comment = "#>",
collapse = TRUE,
warning = FALSE,
message = FALSE,
echo = FALSE,
fig.width = 9,
fig.height = 6
)
options(scipen = 1, digits = 2)
knitr::knit_hooks$set(inline = function(x) {
prettyNum(x, big.mark=" ")
})
```
An additional batch of APC data has been harvested from the [institutional repository](https://epub.uni-regensburg.de) of Regensburg University, completing the [previous ingestion](https://openapc.github.io/general/openapc/2024/02/02/regensburg/).

Regensburg University Library is in charge of the [central publication fund at University of Regensburg](https://epub.uni-regensburg.de/oa-publizieren.html), which received support from the Deutsche Forschungsgemeinschaft (DFG) under its [Open-Access Publishing Programme](https://www.dfg.de/en/research_funding/programmes/infrastructure/lis/open_access/infrastructure_funding/index.html#4) until 2019.

The contact person is [Dr. Gernot Deinzer](mailto:gernot.deinzer@bibliothek.uni-regensburg.de).

## Cost data

```{r, cache.lazy = TRUE}
#' Download an APC spreadsheet from GitHub (requires curl to be installed)
download_apc <- function(path = NULL, dir = "tmp", file = "apc_de.csv"){
if(is.null(path)) {
path <- "https://raw.githubusercontent.com/OpenAPC/openapc-de/master/data/apc_de.csv"
}
# create the target directory; stay silent if it already exists
dir.create(dir, showWarnings = FALSE)
destfile <- file.path(dir, file)
download.file(url = path, destfile = destfile, method = "curl")
read.csv(destfile, header = TRUE, sep = ",")
}
my.apc <- download_apc()
my.apc <- my.apc[my.apc$institution == "Regensburg U",]
my.apc <- droplevels(my.apc)
my.apc_new <- download_apc(c("https://raw.githubusercontent.com/OpenAPC/openapc-de/master/data/unireg/new_articles_2024_02_06_additional_set_enriched.csv"))
my.apc_new <- my.apc_new[my.apc_new$institution == "Regensburg U",]
my.apc_new <- my.apc_new[!is.na(my.apc_new$institution),]
my.apc_new <- droplevels(my.apc_new)
```

The new data covers publication fees for `r format(nrow(my.apc_new), big.mark =",")` articles published in Gold OA journals under transformative agreements. Total expenditure amounts to `r sum(my.apc_new$euro)`€ and the average fee is `r sum(my.apc_new$euro)/nrow(my.apc_new)`€.


```{r}
d_frame = data.frame(table(my.apc_new$publisher, dnn="Publisher"))
d_frame = d_frame[with(d_frame, order(-Freq, Publisher)), ]
my.apc_new$publisher <- factor(my.apc_new$publisher, levels = d_frame$Publisher)
df.summary <-cbind(tapply(my.apc_new$euro, my.apc_new$publisher, length),
tapply(my.apc_new$euro, my.apc_new$publisher, sum),
tapply(my.apc_new$euro, my.apc_new$publisher, mean))
colnames(df.summary) <- c("Articles", "Fees paid in EURO", "Mean Fee paid")
knitr::kable(as.data.frame(df.summary), digits = 2)
```
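The totals in the paragraph and the per-publisher table above follow the same pattern: count, sum and mean of `euro`, grouped by `publisher`. A minimal sketch on toy data, assuming a data frame with the same two columns. The name `toy` and all fee values are made up for illustration and are not taken from the Regensburg data set:

```r
# Toy stand-in for my.apc_new (hypothetical values)
toy <- data.frame(
  publisher = c("Publisher A", "Publisher A", "Publisher B"),
  euro = c(2000, 2500, 1500)
)

# Overall figures, as used in the inline expressions
n_articles <- nrow(toy)
total_fee <- sum(toy$euro)
mean_fee <- total_fee / n_articles  # equals mean(toy$euro) as long as no fee is NA

# Per-publisher figures: one tapply() call per column, as in the chunk above
by_publisher <- cbind(
  Articles = tapply(toy$euro, toy$publisher, length),
  Total = tapply(toy$euro, toy$publisher, sum),
  Mean = tapply(toy$euro, toy$publisher, mean)
)
print(by_publisher)
```

Because `tapply()` returns its results in factor-level order, reordering the `publisher` factor by frequency (as the chunk above does) also determines the row order of the table.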

## Overview

With the newly harvested data included, the overall APC data for Regensburg now looks as follows:

### Fees paid per publisher (in EURO)

```{r tree_regensburg_2024_02_06_full}
tt <- aggregate(my.apc$euro, by = list(my.apc$publisher), sum)
colnames(tt) <- c("Publisher", "Euro")
treemap::treemap(tt, index = c("Publisher"), vSize = "Euro", palette = "Paired")
```

### Average costs per year (in EURO)

```{r box_regensburg_2024_02_06_year_full, echo=FALSE, message = FALSE}
library(ggplot2)
# box plot of fees per funding period, with the individual payments overlaid
q <- ggplot(my.apc, aes(factor(period), euro)) + geom_boxplot() + geom_point()
q <- q + theme_bw(base_size = 18)
q + xlab("Funding period") + ylab("APC")
```

### Average costs per publisher (in EURO)

```{r box_regensburg_2024_02_06_publisher_full, echo = FALSE, message = FALSE}
library(ggplot2)
# keep only publishers with more than 10 articles
d_frame <- data.frame(table(my.apc$publisher, dnn = "Publisher"))
d_frame <- d_frame[with(d_frame, order(-Freq, Publisher)), ]
publishers <- as.character(d_frame$Publisher[d_frame$Freq > 10])
my.apc_reduced <- my.apc[my.apc$publisher %in% publishers, ]
q <- ggplot(my.apc_reduced, aes(publisher, euro)) + geom_boxplot() + geom_point()
q <- q + theme_bw(base_size = 18) + coord_flip()
q + xlab("Publisher (> 10 articles)") + ylab("APC")
```
52 changes: 52 additions & 0 deletions _posts/2024-02-06-regensburg.md
@@ -0,0 +1,52 @@
---
layout: post
author: Christoph Broschinski
author_lnk: https://github.com/cbroschinski
title: Additional APC data harvested from Regensburg University
date: 2024-02-06 07:00:00
summary:
categories: [general, openAPC]
comments: true
---



An additional batch of APC data has been harvested from the [institutional repository](https://epub.uni-regensburg.de) of Regensburg University, completing the [previous ingestion](https://openapc.github.io/general/openapc/2024/02/02/regensburg/).

Regensburg University Library is in charge of the [central publication fund at University of Regensburg](https://epub.uni-regensburg.de/oa-publizieren.html), which received support from the Deutsche Forschungsgemeinschaft (DFG) under its [Open-Access Publishing Programme](https://www.dfg.de/en/research_funding/programmes/infrastructure/lis/open_access/infrastructure_funding/index.html#4) until 2019.

The contact person is [Dr. Gernot Deinzer](mailto:gernot.deinzer@bibliothek.uni-regensburg.de).

## Cost data



The new data covers publication fees for 231 articles published in Gold OA journals under transformative agreements. Total expenditure amounts to 498 893€ and the average fee is 2 160€.




| | Articles| Fees paid in EURO| Mean Fee paid|
|:---------------|--------:|-----------------:|-------------:|
|Springer Nature | 174| 396534| 2279|
|Wiley-Blackwell | 55| 98718| 1795|
|IOP Publishing | 1| 2000| 2000|
|MDPI AG | 1| 1642| 1642|

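The rendered figures can be cross-checked with a few lines of R. This sketch re-enters the table values by hand (the object name `tab` is hypothetical) and recomputes the totals. The 1 € gap between the column sum and the stated total of 498 893 € is presumably due to the per-publisher sums being rounded for display:

```r
# Values copied by hand from the per-publisher table above
tab <- data.frame(
  publisher = c("Springer Nature", "Wiley-Blackwell", "IOP Publishing", "MDPI AG"),
  articles = c(174, 55, 1, 1),
  fees = c(396534, 98718, 2000, 1642)
)

sum(tab$articles)                # 231, matches the article count in the text
sum(tab$fees)                    # 498894, 1 EUR above the stated total
round(tab$fees / tab$articles)   # 2279 1795 2000 1642, matches the "Mean Fee paid" column
```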


## Overview

With the newly harvested data included, the overall APC data for Regensburg now looks as follows:

### Fees paid per publisher (in EURO)

![plot of chunk tree_regensburg_2024_02_06_full](/figure/tree_regensburg_2024_02_06_full-1.png)

### Average costs per year (in EURO)

![plot of chunk box_regensburg_2024_02_06_year_full](/figure/box_regensburg_2024_02_06_year_full-1.png)

### Average costs per publisher (in EURO)

![plot of chunk box_regensburg_2024_02_06_publisher_full](/figure/box_regensburg_2024_02_06_publisher_full-1.png)
Binary file added figure/box_regensburg_2024_02_06_year_full-1.png
Binary file added figure/tree_regensburg_2024_02_06_full-1.png