Exported MongoDB JSON files

Files

License: CC0 1.0

crossref-works.json.xz is not tracked in this repository because its size (7.4 GB) exceeds the GitHub LFS maximum file size of 2 GB. Instead, the file is available on figshare. If you use this file, please cite https://doi.org/10.6084/m9.figshare.4816720 (or, even better, the version-specific DOI).

Format

crossref-works.json.xz is an xz-compressed file of exported works from MongoDB. It was created using mongoexport and can be imported into MongoDB using mongoimport. The file as a whole is not actually valid JSON. However, each line of the file is valid JSON and encodes a single work retrieved from the Crossref API. Accordingly, you can read this file without mongoimport by splitting at newlines and parsing each line as JSON.
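
The line-by-line approach described above can be sketched in Python as follows. This is a minimal illustration, not part of the original export tooling: the function name iter_works is hypothetical, and it assumes the file is UTF-8 text with one JSON document per line, as described above. It streams the archive through the standard-library lzma module, so the 7.4 GB file is never fully decompressed into memory.

```python
import json
import lzma


def iter_works(path):
    """Yield one parsed work (a dict) per non-empty line of an xz-compressed export.

    iter_works is a hypothetical helper name; the file layout (one JSON
    document per line, UTF-8 encoded) is assumed from the format description.
    """
    # Text mode decompresses lazily, so lines are read as a stream.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline at end of file
                yield json.loads(line)
```

For example, next(iter_works("crossref-works.json.xz")) would return the first work as a dictionary. Which keys each work contains depends on the Crossref API response and is not specified here.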

Downloading & Checksums

From this directory, run the following commands to download crossref-works.json.xz from figshare and check its integrity:

# Download crossref-works.json.xz from figshare
# File URL from https://api.figshare.com/v2/articles/4816720
wget --no-clobber \
  --output-document=crossref-works.json.xz \
  https://ndownloader.figshare.com/files/7985110

# Verify SHA-256 checksums
shasum --algorithm 256 --check checksums-sha256.txt

checksums-sha256.txt was created using the following command:

# Create SHA-256 Checksums (contributors only)
shasum --algorithm 256 *.xz > checksums-sha256.txt