Releases: Symplectic/vivo-harvester-v2
Symplectic Elements Harvester for Vivo v2.1.1
Note: Symplectic is no longer actively developing/maintaining this codebase.
This is the public release of the updated version of the Symplectic Elements Vivo Harvester that Symplectic has used internally for Vivo projects. It is based on our earlier open source version (https://github.com/Symplectic/vivo) but has been extensively reworked to address various performance and process complexity issues. Amongst other features, it provides:
- Support for the current Elements API endpoint specification (v5.5).
- Significantly improved overall performance:
  - Delta based pull of data from the Elements API.
  - Efficient population of intermediate TDB triple stores.
  - Differential updates of data in Vivo via the Sparql update API
    - ensuring re-inferencing and re-indexing occur automatically
  - Ability to compress (gzip) intermediate files in the harvester's internal caches.
- Other New Features:
  - Support for Elements "profile privacy" controls.
  - Ability to transfer Elements Group/Group membership information to Vivo.
    - including extensive control over which Elements groups are sent to Vivo
- Updated default crosswalks:
  - Updated Publication mapping:
    - Updated default configuration to allow use of "Web of Science" derived data
      - including relevant attribution within Vivo
    - Mapping of URLs for associated RT1 and RT2 repository items.
    - External persons are now mapped as "VCard" objects linked to context objects.
      - instead of creating random user profiles within Vivo
    - Type of context objects now based on translation context.
      - e.g. presentation -> core#PresenterRole not core#Authorship
  - Updated User mapping:
    - Ability to provide an "internal class" which will be attributed to internal users and groups in Vivo
      - can be set externally via XSLT parameter
    - Better mapping of "overview" text to preserve line breaks and HTML links.
    - Mapping of "Research Interests", "Teaching summary" and "Postgraduate Employment" profile fields.
      - additional mappings for "certifications" although these are incomplete and are disabled by default
    - Improved handling of "known as" and "preferred user names" where available.
    - New mappings for ORCID, Researcher ID and Scopus Author IDs when connected to an Elements API running the v5.5+ endpoint specification.
    - Support for "institutional email privacy" controls.
  - Updated Label mapping:
    - Improved attribution for "issn-inferred" labels (only assigned to journal object).
    - Ability to map Elements labels onto other Vivo properties.
      - e.g. research-areas, geographic focus (via lookup table), arbitrary "Concept" lists
  - Other mapping updates:
    - Addition of translations for groups & group memberships.
    - Additional mappings for more "User-Grant" link types.
    - Mapping added for "Fellowship" Professional Activity objects.
  - Behavioural improvements:
    - Ability to select target object type within Vivo based on "record field" data.
    - Ability to place "verified-manual" sources higher in precedence order than normal "manual" ones.
    - Improved handling of precedence ordering for "journal" objects.
  - Usability Improvements:
    - Centralised core configuration options to one file (elements-to-vivo-config.xml).
    - Ability to start items in the config file with $$base-uri$$ or $$local$$ and have the crosswalks inject the configured base-uri or local ontology uri.
    - Crosswalks restructured to include "override" files where templates or functions can be overridden without changing the originals.
    - Additional changes to provide "customAdditions" breakout points in various places.
  - Bugfixes and general improvements:
    - Better handling of "visible" and "invisible" relationships for different categories of data.
    - Fixes to the "Membership" and "Distinction" Professional Activity mappings to ensure they are processed as intended.
    - Fixes to ensure various user profile fields (Addresses/Emails/Phone Numbers/Web Addresses) are processed correctly.
    - General improvements to more accurately map into the Vivo ontology.
    - Various efforts to standardise and improve general crosswalk behaviour.
- Updated third party components:
  - Decoupled from the UFL Harvester codebase.
  - Uses newer Apache HTTP libraries
    - can now connect to source APIs using SNI with https
  - Uses a newer version of the Saxon XSLT engine.
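As an illustration of the `$$base-uri$$` / `$$local$$` placeholders listed under Usability Improvements, the substitution might look roughly like the sketch below. The real expansion happens inside the crosswalks; the function name, its parameters and the example URIs here are hypothetical and only show the idea of replacing a leading token with a configured URI.

```python
# Hypothetical helper, not part of the harvester: it only illustrates the
# placeholder behaviour described in the release notes.
BASE_URI_TOKEN = "$$base-uri$$"
LOCAL_TOKEN = "$$local$$"


def expand_config_value(value: str, base_uri: str, local_ontology_uri: str) -> str:
    """Replace a leading $$base-uri$$ or $$local$$ token with the configured URI."""
    if value.startswith(BASE_URI_TOKEN):
        return base_uri + value[len(BASE_URI_TOKEN):]
    if value.startswith(LOCAL_TOKEN):
        return local_ontology_uri + value[len(LOCAL_TOKEN):]
    # Values without a leading placeholder are passed through untouched.
    return value
```

For example, with an assumed base URI of `http://vivo.example.org/individual/`, a config value of `$$base-uri$$institution` would expand to `http://vivo.example.org/individual/institution`.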
These features combine to provide a robust platform capable of automatically feeding Vivo with regular updates from Elements (e.g. nightly), whilst minimising the impact on the source system's API.
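The differential update idea can be sketched as follows. This is a minimal illustration, not the harvester's actual implementation: it assumes the previous and current harvest outputs are available as N-Triples text, and the function names and graph URI are made up for the example. Only the triples that changed between runs are turned into a SPARQL Update request.

```python
from typing import Set


def load_triples(ntriples_text: str) -> Set[str]:
    """Return the non-empty, non-comment lines of an N-Triples document."""
    triples = set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            triples.add(line)
    return triples


def build_sparql_update(previous: str, current: str, graph_uri: str) -> str:
    """Build a SPARQL Update string covering only the changed triples.

    Assumes the data contains no blank nodes, since DELETE DATA does not
    allow them.
    """
    old, new = load_triples(previous), load_triples(current)
    removed = sorted(old - new)  # present last run, gone now
    added = sorted(new - old)    # new in this run
    operations = []
    if removed:
        operations.append(
            "DELETE DATA { GRAPH <%s> {\n%s\n} }" % (graph_uri, "\n".join(removed))
        )
    if added:
        operations.append(
            "INSERT DATA { GRAPH <%s> {\n%s\n} }" % (graph_uri, "\n".join(added))
        )
    # An empty string means nothing changed and no request is needed.
    return " ;\n".join(operations)
```

Because unchanged triples never appear in the request, a nightly run over a largely static dataset sends only a small update to Vivo.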
Compatibility
The Harvester is technically compatible with any version of Elements that supports the v5.5 API Endpoint Specification and any version of Vivo that supports the Sparql Update API.
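To check whether a given Vivo instance exposes the Sparql Update API, a request along the following lines can be prepared. This is a hedged sketch: the `/api/sparqlUpdate` path and the `email`, `password` and `update` form parameters are assumptions based on Vivo's documented API and should be verified against the Vivo version in use. The helper name is hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(vivo_base_url: str, email: str, password: str, update: str) -> Request:
    """Prepare (but do not send) a form-encoded POST to Vivo's update endpoint.

    The endpoint path and parameter names are assumptions; check them
    against the documentation for your Vivo version.
    """
    body = urlencode({"email": email, "password": password, "update": update})
    return Request(
        vivo_base_url.rstrip("/") + "/api/sparqlUpdate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The prepared request could then be passed to `urllib.request.urlopen`, using an account that has permission to use the update API.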
The included Default Crosswalks generate RDF that targets the VIVO-ISF ontology (post v1.6 reorganisation).
They were primarily developed against Vivo systems running various point releases of v1.8 and v1.9.
Documentation
Symplectic Elements Harvester for VIVO _ Description & Overview .pdf
Symplectic Elements Harvester for VIVO _ Installation Guide.pdf
Symplectic Elements Harvester for VIVO _ Crosswalk Development Guide.pdf