crossref-works.json.xz is not tracked due to its large file size (7.4 GB), which exceeds the GitHub LFS maximum file size of 2 GB. Instead, the file is available on figshare.
If you use this file, please cite https://doi.org/10.6084/m9.figshare.4816720 (or, even better, the version-specific DOI).
crossref-works.json.xz is an xz-compressed file of exported works from MongoDB.
It was created using mongoexport and can be imported into MongoDB using mongoimport.
The file as a whole is not actually valid JSON. However, each line of the file is valid JSON and encodes a single work retrieved from the Crossref API.
Accordingly, you can read this file without mongoimport by splitting at newlines and parsing each line as JSON.
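The line-by-line approach described above can be sketched in Python using only the standard library. This is a minimal illustration rather than the method used to build the file: the function name iter_works is hypothetical, and the code assumes the file is UTF-8 encoded with one JSON document per line.

```python
import json
import lzma


def iter_works(path):
    """Yield one parsed work (a dict) per non-empty line of an xz-compressed file.

    Hypothetical helper: assumes UTF-8 text with one JSON document per line.
    """
    # Mode "rt" decompresses on the fly and yields text lines, so the
    # multi-gigabyte file is never held in memory all at once.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline at end of file
                yield json.loads(line)
```

Because iter_works is a generator, wrapping it in itertools.islice(iter_works("crossref-works.json.xz"), 3) would decompress only as much of the file as is needed to produce the first three works.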
From this directory, run the following commands to download the figshare datasets and check their integrity:
# Download crossref-works.json.xz from figshare
# File URL from https://api.figshare.com/v2/articles/4816720
wget --no-clobber \
--output-document=crossref-works.json.xz \
https://ndownloader.figshare.com/files/7985110
# Verify SHA-256 checksums
shasum --algorithm 256 --check checksums-sha256.txt
checksums-sha256.txt was created using the following command:
# Create SHA-256 Checksums (contributors only)
shasum --algorithm 256 *.xz > checksums-sha256.txt