Unpaywall

This repository contains a Python wrapper for Unpaywall that downloads the metadata and raw PDF for a given DOI, plus a Bash wrapper that runs the s2orc-doc2json utility to parse PDFs into JSON.
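
To give a feel for what a DOI lookup involves, here is a minimal sketch against the public Unpaywall REST API (`/v2/<doi>?email=...`). It is not the wrapper shipped in this repository: the function names `build_lookup_url`, `pick_pdf_url`, and `fetch_record` are hypothetical, and the package's own interface may differ.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.unpaywall.org/v2"


def build_lookup_url(doi: str, email: str) -> str:
    """Return the Unpaywall REST URL for one DOI (hypothetical helper)."""
    # Keep "/" unescaped: DOIs such as 10.1038/nphys1170 contain a slash.
    quoted_doi = urllib.parse.quote(doi.strip(), safe="/")
    query = urllib.parse.urlencode({"email": email})
    return f"{API_ROOT}/{quoted_doi}?{query}"


def pick_pdf_url(record: dict):
    """Return the best open-access PDF link from a response, or None."""
    # "best_oa_location" is null when Unpaywall knows no open-access copy.
    location = record.get("best_oa_location") or {}
    return location.get("url_for_pdf")


def fetch_record(doi: str, email: str, timeout: float = 30.0) -> dict:
    """Download and decode the metadata record (needs network access)."""
    url = build_lookup_url(doi, email)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `pick_pdf_url(fetch_record(doi, email))` would then yield a PDF link, or `None` when no open-access copy is listed. Unpaywall asks for a real contact email in the `email` parameter.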

You need Python, Java, and Bash installed on your system to use it.
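
A quick way to confirm the prerequisites are on your PATH is sketched below. The helper name `find_tools` and the executable names (`python3`, `java`, `bash`) are assumptions; adjust them if your system uses different names.

```python
import shutil


def find_tools(names=("python3", "java", "bash")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Any tool reported as `NOT FOUND` needs to be installed before you continue.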

Installation

Begin by cloning the repo, so you can get the required files:

git clone https://github.com/hcss-utils/unpaywall.git
cd unpaywall
git submodule update --init --recursive

In your terminal, you should now be located in your unpaywall folder.

Next, create and activate a virtual environment:

Linux/macOS:

python3 -m venv env
source env/bin/activate

Now let's install dependencies:

pip install -r requirements.txt
pip install -e .
pip install -r s2orc-doc2json/requirements.txt
pip install -e s2orc-doc2json

If these commands run without errors, move on to the next step: installing Java and the Grobid server.

Once you have Java installed (follow the installation instructions for your platform), run the following scripts:

bash s2orc-doc2json/scripts/setup_grobid.sh
bash s2orc-doc2json/scripts/run_grobid.sh # progress appears to stall at 87%; this is expected, and Grobid is already usable at that point

See s2orc-doc2json for more information.
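
Because the progress output stops short of 100%, it can help to check directly whether the server is answering. The sketch below polls Grobid's `/api/isalive` health endpoint. It assumes the server listens on `localhost:8070` (Grobid's default port), which may not match your setup; the helper names are hypothetical.

```python
import urllib.error
import urllib.request


def isalive_url(host="localhost", port=8070):
    """Build the health-check URL (host and port are assumptions)."""
    return f"http://{host}:{port}/api/isalive"


def grobid_is_alive(url=None, timeout=5.0):
    """Return True if the server answers the health check with 'true'."""
    url = url or isalive_url()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"true"
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not running.
        return False


if __name__ == "__main__":
    print("Grobid is up" if grobid_is_alive() else "Grobid is not reachable")
```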

Usage

Update lens-scopus-wos.csv with your input records. Then execute run.sh to parse the PDFs into JSON (make sure Grobid is still running in another terminal tab):

cd scripts
bash run.sh
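
Before launching a long run, you may want to sanity-check the DOIs in your input file. The sketch below is illustrative only: it assumes the CSV has a header row with a column named `DOI`, which is a guess about the layout of lens-scopus-wos.csv, and `read_dois` is a hypothetical helper, not part of this package.

```python
import csv
import io


def read_dois(lines, column="DOI"):
    """Return unique, non-empty DOIs from CSV lines, in first-seen order.

    `column` is an assumed header name; change it to match your file.
    """
    seen = set()
    dois = []
    for row in csv.DictReader(lines):
        doi = (row.get(column) or "").strip()
        # Strip common resolver prefixes so duplicates compare equal.
        for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
            if doi.lower().startswith(prefix):
                doi = doi[len(prefix):]
        key = doi.lower()  # DOIs are case-insensitive
        if doi and key not in seen:
            seen.add(key)
            dois.append(doi)
    return dois


# Small in-memory example; with a real file, pass an open file handle instead.
sample = io.StringIO(
    "Title,DOI\nA,10.1000/abc\nB,https://doi.org/10.1000/ABC\nC,\nD,10.1000/xyz\n"
)
print(read_dois(sample))  # ['10.1000/abc', '10.1000/xyz']
```

With a file on disk, open it with `newline=""` as the `csv` module recommends and pass the handle to `read_dois`.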
