What is XOAI?
XOAI is the most powerful and flexible OAI-PMH Java Toolkit (initially developed by Lyncode, updated by DSpace). It contains common Java classes that make it easy to implement OAI-PMH data and service providers.
Compliance with the OAI-PMH standard is checked using an included Technology Compatibility Kit, relying on https://github.com/zimeon/oaipmh-validator. Compliance checks run on every pull request and on every push to the main branch.
Moving (again): XOAI has not been actively maintained by DSpace since 2019, so this fork by the Global Dataverse Community Consortium provides an updated version tailored to the needs of the open source repository Dataverse Software.
This library is available from Maven Central; simply add the artifacts you need as dependencies in your builds:
When building a data provider, you'll add xoai-data-provider:
<dependency>
  <groupId>io.gdcc</groupId>
  <artifactId>xoai-data-provider</artifactId>
  <version>${xoai.version}</version>
</dependency>
When building a service provider, you'll add xoai-service-provider:
<dependency>
  <groupId>io.gdcc</groupId>
  <artifactId>xoai-service-provider</artifactId>
  <version>${xoai.version}</version>
</dependency>
Some minimal usage documentation has been scraped from the DSpace Wiki, mostly explaining the concepts of this library, and put into docs/README.md. It also contains a brief explanation of this fork's special changes. Feel free to extend the documentation; pull requests are welcome.
This project uses the Spotless Maven Plugin, google-java-format and pre-commit to ensure a standardized and well-formatted codebase.
After cloning the repo, please make sure to install pre-commit, which will take care of everything for you:
pip install pre-commit
pre-commit install
If you want to run spotless directly, use Maven:
mvn spotless:apply
# - or to just check for consistency use -
mvn spotless:check
- (none)
- (none)
- Catch invalid Base64 encodings for resumption tokens (#272) - a community contribution by @bumann-sbb 💫
- Switch to GDCC Maven Parent POM
- (none)
- (none)
- Do not add empty namespace to XML elements (#240) - a community contribution by @bumann-sbb 💫
- Code coverage button links to a 404 (#214)
- (none)
- TCK now uses Spring 6 and Spring Boot 3
- Do not break UTF-8 multibyte characters in data provider when using CopyElement to copy and paste metadata (#188)
- Switching to Java 17 for compilation and testing, but keeping compatibility with Java 11 for JARs
- Switching to Jakarta EE 10 dependencies (For most scenarios, this is not a breaking change.)
- More updated dependencies, Maven plugins, etc
- (none)
- (none)
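The multibyte fix listed above (#188) addresses a general pitfall that can be sketched with plain JDK classes. The following is only an illustration under stated assumptions: the class and method names are hypothetical, and it does not reproduce how CopyElement is implemented. It shows that decoding fixed-size byte chunks independently can corrupt a UTF-8 character that straddles a chunk boundary, while a Reader keeps decoder state across reads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of XOAI.
class Utf8ChunkSketch {

    // Naive approach: every byte chunk is decoded on its own, so a multibyte
    // character split across two chunks is replaced by U+FFFD characters.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder out = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            out.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    // Safer approach: the Reader carries incomplete byte sequences over to the
    // next read, so characters are never split regardless of buffer size.
    static String decodeWithReader(byte[] data, int bufferSize) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[bufferSize];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "a" is one byte in UTF-8, "\u00e9" (e with acute accent) is two bytes.
        String text = "a\u00e9";
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(text.equals(decodePerChunk(data, 2)));
        System.out.println(text.equals(decodeWithReader(data, 2)));
    }
}
```

With a chunk size of 2, the two-byte character is split between the first and second chunk, which is why the per-chunk variant no longer round-trips the input.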
This is a breaking-change release with many new features, influenced by the use of XOAI within Dataverse and elsewhere.
- Compatible with Java 11+ only
- Uses java.time API instead of java.util.Date
- Data Provider:
  - Changes required to your ItemRepository, Item and ItemIdentifier implementations
  - Changes required to your SetRepository implementation
  - Changes required to your usage of DataProvider (much simplified!)
  - Renewed configuration mechanism for data provider requires adaption
- Service Provider: Changes required to your code using an OAIClient, as the default implementation changed
- Use the new CopyElement or Metadata.copyFromStream() to skip metadata XML processing, so pregenerated or cached data can be served from your ItemRepository implementation
implementation - Use native JDK HTTP client for OAI requests in service provider, extended with client builder and option to create unsafe SSL connections for testing
- New JDK HTTP client allows sending custom headers, useful for authentication etc.
- Add total number of results (inspired by GBIF #8)
- Larger rewrite of how data provider works:
  - Enable caching requests by exposing the resumption token to the application and making the pagination of results more explicit and comprehensible using a new type ResultsPage
  - Extended, simplified and more verbose parameter validation for requests
  - until timestamps are tweaked to enable more inclusive requests (avoid spilling milk with database timestamps etc)
  - Extensible reuse of RawRequest and Request classes to create non-servlet based endpoints with in-tree verification methods now possible via RequestBuilder!
  - Simplified filtering model for XOAI: easier to set up, default conditions provided
- Special XML handling for Dataverse JSON metadata to provide backward compatibility
- Sets now are properly compared, re-enabling SetRepositoryHelper to identify available sets
- Many new try-with-resources to mitigate memory leak risks
- The StAX XML components have been configured to avoid loading external entities, mitigating potential security risks
- from and until timestamps are now correctly verified in data provider, see #25
- Granularity "Lenient" introduced, as the OAI-PMH 2.0 spec allows requests with both precisions when "Second" granularity is supported. The former implementation did not allow this. Remember to configure it, as the default is to stick with the old behaviour!
- Configurable behaviour for requests where from is not after earliestDate. The default is to allow such requests, as the spec is non-prohibitive. The former implementation denied such requests!
- And more...
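The list above mentions that the StAX components are configured to avoid loading external entities. As a rough sketch of what such hardening can look like with the standard javax.xml.stream API, the example below disables DTD support and external entity resolution on an XMLInputFactory. The class and method names are hypothetical, and this is an assumption about the general technique rather than XOAI's actual configuration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; not part of XOAI.
class HardenedStaxSketch {

    // Creates a factory that neither processes DTDs nor resolves external
    // entities, using the standard StAX property names.
    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    // Returns the local name of the first element in the document,
    // or null if the document contains no element.
    static String firstElementName(String xml) {
        try {
            XMLStreamReader reader =
                    newHardenedFactory().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getLocalName();
                    }
                }
                return null;
            } finally {
                // XMLStreamReader is not AutoCloseable, so close it explicitly.
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstElementName("<record><header/></record>"));
    }
}
```

Setting both properties is deliberately redundant: some StAX implementations differ in how strictly they honour each one, so disabling DTD processing and entity resolution together is a common defensive default.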
See LICENSE or DSpace BSD License