Utilities for working with CERN OAIS artifacts, such as Submission Information Packages (SIPs).
The validate function checks the folder at the given path against the CERN SIP specification, following these steps:
- Verify directory structure
- Validate the manifest file against the desired SIP JSON schema. By default, sip-schema-d1.json is used, which is shipped with this package
- Validate the folder as a BagIt package
- Files are allowed to be missing if the manifest specifies that the SIP is "lightweight"
- Check that every content file mentioned in the manifest is actually present in the payload
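As a rough illustration of the directory-structure and BagIt steps listed above, here is a minimal standard-library sketch. It is not the code this package runs: check_bag and its allow_missing_payload parameter are hypothetical names, the JSON schema step is left out, and how the manifest marks a SIP as "lightweight" is not modelled (the caller simply passes a flag). The only layout it assumes is what BagIt itself requires: a bagit.txt declaration, a data/ payload directory, and a manifest-sha256.txt with one "checksum path" entry per line.

```python
import hashlib
from pathlib import Path


def check_bag(folder, allow_missing_payload=False):
    """Return a list of problems found in a BagIt-style folder (empty if none).

    Hypothetical helper for illustration only; not part of oais_utils.
    """
    root = Path(folder)
    problems = []

    # Directory structure: BagIt needs a bagit.txt declaration and a data/ directory.
    if not (root / "bagit.txt").is_file():
        problems.append("missing bagit.txt")
    if not (root / "data").is_dir():
        problems.append("missing data/ directory")

    # Payload manifest: each non-empty line is "<checksum> <relative path>".
    manifest = root / "manifest-sha256.txt"
    if not manifest.is_file():
        problems.append("missing manifest-sha256.txt")
        return problems

    for line in manifest.read_text(encoding="utf-8").splitlines():
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            problems.append(f"malformed manifest line: {line}")
            continue
        checksum, rel_path = parts[0], parts[1].strip()
        target = root / rel_path
        if not target.is_file():
            # A "lightweight" package may list files it does not carry.
            if not allow_missing_payload:
                problems.append(f"missing file: {rel_path}")
            continue
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        if digest != checksum.lower():
            problems.append(f"checksum mismatch: {rel_path}")
    return problems
```

Calling check_bag("name_of_the_sip_folder") returns an empty list when these checks pass. Passing allow_missing_payload=True mimics the "lightweight" case, in which listed files may be absent.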
Usage:
from oais_utils import validate
validate("name_of_the_sip_folder")
SIP manifest JSON schemas are also shipped and exposed with this package.
To get a Python dictionary that maps each schema short name to the corresponding parsed schema (as a Python object), run:
import oais_utils
schemas = oais_utils.schemas
schemas.keys()
# dict_keys(['draft1'])
schemas['draft1']
# [...]
# (Returns the SIP JSON schema "draft1" as a parsed Python object)
schemas['draft1']['$id']
# https://gitlab.cern.ch/digitalmemory/utils/-/raw/master/oais_utils/sip-schema-d1.json
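Since the schemas are exposed as a plain dictionary, ordinary dictionary operations work for inspecting them. The sketch below lists the "$id" of every shipped schema. schema_ids is a hypothetical helper, not part of this package, and the stand-in dictionary only mimics the shape described above so the snippet runs without the package installed.

```python
def schema_ids(schemas):
    """Map each schema short name to its "$id" (None if the schema has none).

    Hypothetical helper for illustration; not part of oais_utils.
    """
    return {name: schema.get("$id") for name, schema in schemas.items()}


# Stand-in with the same shape as oais_utils.schemas (assumed for this example).
example_schemas = {"draft1": {"$id": "https://example.org/sip-schema-d1.json"}}
print(schema_ids(example_schemas))
```

With the package installed, the same call would be schema_ids(oais_utils.schemas).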
Install from PyPI:
pip install oais-utils
For development, clone this repository and install it with the -e flag:
# Clone the repository
git clone https://gitlab.cern.ch/digitalmemory/oais-utils
cd oais-utils
pip install -e .
from oais_utils import validate
validate("../bagit-create/bagitexport::cds::2751237")