UCO support for Protégé #449
Comments
AJN: This is a *partial* cherry-pick of the commit by @DrSnowbird. The `index.html` file has been removed from this commit, and the rename of the "root" `uco.ttl` file has been reverted, in order to save on Git noise. A follow-on patch will address the two new files. References: * #449 (cherry picked from commit 0747a62) Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This patch modifies paths in the catalog file, but does not attempt an import of the UCO co and owl ontologies. For a yet-undiagnosed reason, adding those reference resolutions causes Protégé to fail the load. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
PR 450 has been filed to start the implementation for this proposal. Thanks again, @DrSnowbird, for contributing the start of this branch. Unfortunately, a curious issue arose with trying to load the Collections Ontology shape file. There's a chance it will be resolvable with an extra Git submodule based interaction - I'm out of time to test tonight. My current feeling is that the testing infrastructure complexity for this might be too high for integration with the UCO 1.0.0 release, but it is a backwards-compatible change that could be integrated with any 1.x.0.
This patch adds a Makefile and configuration file to sketch the call pattern for the catalog creation script. More work would need to follow to align the Makefile with the current descent order, but this patch provides enough for the current pass of development and testing. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This implementation pattern should enable every directory with a Turtle file to be able to call the script with the same command pattern, whether in UCO or a downstream ontology. Hard-coded logic moves out of the script, and into maintaining a tab-separated-values file and Make calls to specify what ontology file to inspect. This way, any individual ontology file can be loaded into Protege if desired. This patch modifies the demonstrated Makefile call pattern. After demonstration by regenerating the catalog XML file for the root `uco.ttl` graph, future patches will generate other catalog files. This patch also removes some erroneously copy-pasted script text from `/ontology/uco/master/Makefile`, and retires the first draft by @DrSnowbird. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
With this patch, I can also confirm that opening the `uco.ttl` file in the affected directory worked for me with Protege. The steps to regenerate this file are not yet captured in CI:
    cd ontology/uco/master
    make
A future patch will add catalog regeneration to CI. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
It has been an oversight that prov-o was not included in the transitive closure construction, given its usage with `case_prov`. The oversight became a blocking issue in implementation of UCO Issue 449, which has a solution in draft that requires the transitive closure be present, and DCAT imports PROV-O as a dependency. A follow-on patch will generate PROV-O per this recipe. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
The Protege catalog XML construction design involves review of the transitive `owl:imports` closure of UCO's graph files. To do this, RDFLib needs to be available in a Python virtual environment before descending into the `/ontology` directory. Before this patch, a virtual environment with RDFLib was constructed in `/tests` for testing, but not for other ontology-related maintenance. This patch moves the virtual environment construction to the UCO repository root directory, and sets the dependency order of `all` and `check` to include `/venv` being built before descent into `/ontology` and `/tests`. All paths referencing `/tests/venv` under `/tests` have been updated. As one bit of code upgrading, the `PYTHON3` selector snippet, originally written before Python 3.10's release, now looks only for the default Python 3 if not supplied when calling Make (e.g. with `make PYTHON3=python3.11 check`). This patch isolates its effects to moving the virtual environment. A follow-on patch will integrate catalog construction using the new placement. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
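As a rough illustration of the transitive `owl:imports` closure mentioned in this commit message, the following sketch walks a graph of direct imports breadth-first. It is not the script from the patch: the real script reads imports from graph files with RDFLib, whereas here a plain dictionary stands in for those direct imports, and the function name and ontology IRIs are hypothetical.

```python
from collections import deque


def imports_closure(start_iris, direct_imports):
    """Return every ontology IRI reachable from start_iris via imports.

    direct_imports is a stand-in (assumption) for what would be read from
    owl:imports statements: it maps an ontology IRI to the IRIs it imports.
    """
    seen = set()
    queue = deque(start_iris)
    while queue:
        iri = queue.popleft()
        for imported in direct_imports.get(iri, ()):
            # Track visited IRIs so import cycles do not loop forever.
            if imported not in seen:
                seen.add(imported)
                queue.append(imported)
    return seen


# Hypothetical IRIs, for illustration only.
example_imports = {
    "https://example.org/uco": [
        "https://example.org/core",
        "https://example.org/observable",
    ],
    "https://example.org/observable": ["https://example.org/core"],
    "https://example.org/core": [],
}
closure = imports_closure(["https://example.org/uco"], example_imports)
```

Every IRI in the resulting set would need a local file mapping for Protege to avoid network retrieval.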
The construction script now handles multiple input ontology files, but with the requirement that they be in the same directory. Interfaces have also been added to handle imported, possibly non-CDO ontology references in two ways:
* With a TSV file mapping ontology IRIs or version IRIs to files.
* With optional references to (effectively imported) `catalog-v001.xml` files.
Another behavior change is implemented: the focus ontologies are now also added to the `catalog-v001.xml` file, in part to support when multiple graph files are in one directory, and in part to support re-consumption of `catalog-v001.xml` by the `catalog-v001.xml` generating script.
The rationales for how to handle ontology--file mappings outside the scope of UCO (in both upstream and downstream directions) include:
* Symbolic links could have been used to pool all file references into the `/dependencies` directory. Windows users that run `git clone` without symbolic links enabled for their system would encounter significantly counter-intuitive errors.
  - This also would not iterate well with consumers of the catalog script outside of UCO (e.g. CASE).
* A Makefile could have been made to normalize the dependent ontology files into the same Turtle style (or even away from RDF-XML, which the Collections Ontology currently uses as sole format). However, this would again be a point of difficulty for Windows users, as they would have to run `make` to create the files referenced in the catalog XML.
* Copying files into a Git repository introduces code-drift issues that are difficult to manage. When the copied files were themselves tracked in Git, this is counter to the purpose of Git submodules.
This patch goes on the assumption that Git submodules and recursive cloning are a reasonable minimal requirement to access full local-file ontology interaction.
The catalog generating script in this patch state has been tested (offline) with CASE and CASE-Corpora as users, via a submodule chain starting with CASE-Corpora. The `CONTRIBUTE.md` file has also been updated to add usage documentation, and to fix a copy-paste error from some time ago. A follow-on patch will regenerate Make-managed files. References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
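The TSV interface described in this commit message (ontology IRIs or version IRIs mapped to files) might be read along the following lines. This is a minimal sketch, not the script's actual parser: the two-column layout (IRI, then path), the `#` comment convention, the function name, and the example IRIs and paths are all assumptions.

```python
import csv
import io


def load_iri_mapping(tsv_text):
    """Parse tab-separated 'IRI<TAB>path' rows into a dictionary.

    Assumed layout: column 1 is an ontology IRI or version IRI, column 2 is
    a file path. Blank, short, and '#'-prefixed rows are skipped.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue
        mapping[row[0].strip()] = row[1].strip()
    return mapping


# Hypothetical content, for illustration only.
example_tsv = (
    "# IRI\tpath\n"
    "https://example.org/upstream\tdependencies/upstream/upstream.ttl\n"
    "\n"
    "https://example.org/upstream/1.0.0\tdependencies/upstream/upstream.ttl\n"
)
iri_to_file = load_iri_mapping(example_tsv)
```

Mapping both the ontology IRI and the version IRI to the same file, as in the example, reflects the "ontology IRIs or version IRIs" wording above.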
References: * #449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This commit matches the rationale of UCO commit `66d0c38`, in support of UCO Issue 449. The UCO submodule pointer is also bumped due to its own motion of virtual environment resources in the commit noted above. References: * ucoProject/UCO#449 * ucoProject/UCO@66d0c38
A follow-on patch will regenerate Make-managed files. References: * ucoProject/UCO#449
References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
No effects were observed on Make-managed files. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
A follow-on patch will regenerate Make-managed files. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
PR 450 now has an implemented solution for this Issue. The summary of effects is that with the associated PRs merged, UCO works in Protégé wholly with locally-stored (/-versioned) ontology files, and the generation mechanism is confirmed to work for downstream ontologies. I have tested this with CASE and CASE-Corpora (test links are in PR 450). CASE-Corpora is now able to generate
Some unexpected developments occurred:
The overall increase in risk is for projects that track UCO and CASE as Git submodules for the sake of re-using their virtual environment and/or monolithic ontology build. The virtual environment motion means some tracking projects will need to update paths to scripts. I'm aware that this will impact the CASE Python Utilities' monolithic build tracking and the documentation engines (CASE's, UCO's) most, but I suggest that overall this is logistically acceptable, as paths only need to be updated once per tracking Git project.
No effects were observed on Make-managed files. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This patch, normally a brief edit of `version_info.py`, also makes path updates to the virtual environments that were enacted as part of UCO Issue 449. Test updates made for UCO Issue 508 are also forward-ported. A follow-on patch will regenerate Make-managed files. References: * ucoProject/UCO#449 * ucoProject/UCO#508 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This catches a resource move made for UCO Issue 449. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
CDO Shape repositories are not necessarily being designed to require UCO as a Git submodule. At least one, which will take UCO's OWL-review shapes into their own repository, will avoid tracking UCO as a submodule in order to prevent circular Git submodule dependencies. The catalog construction script created for UCO Issue 449 is still potentially useful for inspecting shapes for ontologies with one or more `owl:imports` statements, even if UCO is not available via a Git submodule. Hence, this patch copies the catalog construction script from UCO (at version 1.2.0), as part of moving the script out of UCO. The inlined NIST license also receives an update in this patch, and the format review adjusts some syntax style. So, the version of the script receives a bump. No effects were observed on Make-managed files. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
CDO Shape repositories are not necessarily being designed to require UCO as a Git submodule. At least one, which will take UCO's OWL-review shapes into their own repository, will avoid tracking UCO as a submodule in order to prevent circular Git submodule dependencies. The catalog construction script created for UCO Issue 449 is still potentially useful for inspecting shapes for ontologies with one or more `owl:imports` statements, even if UCO is not available via a Git submodule. Hence, this patch copies the catalog construction script from UCO (at version 1.2.0), as part of moving the script out of UCO. The inlined NIST license also receives an update in this patch, the format review adjusts some syntax style, and an RDFLib type reference is updated. For these patch-level changes, the version of the script receives a bump. No effects were observed on Make-managed files. References: * ucoProject/UCO#449 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
Disclaimer
Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.
Background
Protégé is a tool for editing ontologies.
https://protege.stanford.edu/
It is able to open and interact with ontologies stored as local files in a user's desktop environment. It is also capable of resolving all of the ontology imports (encoded as `owl:imports` statements), recursively retrieving all ontologies referenced from any loaded ontology.
By default, Protégé will do an over-the-wire retrieval on encountering an `owl:imports` statement: The referenced IRI will be downloaded in whatever RDF serialization is offered (seemingly preferring `application/rdf+xml`). It is possible to include an "Override" XML file that can be interpreted as: "Whenever Protégé encounters this IRI, instead of loading a file from a network retrieval, load a file from this hard-coded relative or absolute path."
There is a slight technical matter with the XML file: It must reside in the same directory as the ontology file one would open with Protégé.
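To make the "Override" file more concrete, here is a minimal sketch that serializes IRI-to-path overrides in the OASIS XML Catalog vocabulary, which is, to my understanding, what a `catalog-v001.xml` file uses. The function name, the example IRI, and the relative path are hypothetical, and attributes that Protégé itself may write (such as `id`) are omitted.

```python
import xml.etree.ElementTree as ET

# OASIS XML Catalog namespace.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(iri_to_path):
    """Serialize IRI-to-local-path overrides as an XML catalog string."""
    ET.register_namespace("", CATALOG_NS)
    catalog = ET.Element(f"{{{CATALOG_NS}}}catalog", {"prefer": "public"})
    for iri, path in sorted(iri_to_path.items()):
        # Each <uri> entry reads as: when this IRI ("name") is imported,
        # load the file at this relative or absolute path ("uri") instead.
        ET.SubElement(catalog, f"{{{CATALOG_NS}}}uri", {"name": iri, "uri": path})
    return ET.tostring(catalog, encoding="unicode")


# Hypothetical IRI and path, for illustration only.
catalog_xml = build_catalog(
    {"https://example.org/ontology/core": "../core/core.ttl"}
)
print(catalog_xml)
```

Relative paths in such a file are only meaningful relative to the catalog's own location, which fits the constraint that the file sits next to the ontology being opened.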
Requirements
Requirement 1
UCO should store a Protégé `catalog-v001.xml` file in `ontology/uco/master/`, enumerating all UCO ontology files.
Requirement 2
The `catalog-v001.xml` file's hard-coded enumeration must be tested to be in sync with UCO's imports.
Requirement 3
CASE must provide the same Protégé support as UCO, maintaining its own `catalog-v001.xml` file in `ontology/master/`, enumerating all CASE and UCO ontology files. While this might seem out of scope of UCO's purview, this requirement is also to ensure UCO can enable any downstream ontology to provide the same support for Protégé that UCO does.
Risk / Benefit analysis
Benefits
Risks
* The `catalog-v001.xml` file is necessarily a hard-coded enumeration of resources. Hence, any additions of new `.ttl` files would need to be kept in sync to maintain referential integrity within the Protégé application.
* … `catalog-v001.xml` in order to locally resolve its imports. Else, Protégé is only compatible with UCO when loaded in its entirety. This would be additional work to maintain.
* … `owl:versionIRI`. If so, these catalog XML files would need to be updated with every release that changes the `owl:versionIRI` statement.
* … `CONTRIBUTE.md`---that Protégé local resolution would only work if `git submodule update --init` has been run at least once.
Competencies demonstrated
Competency 1
A user is interested in using Protégé to load all of UCO's current state in the `develop` branch, which has some changes implemented since the last UCO release.
Competency Question 1.1
How does the user see the current version of `observable:File` in the `develop` branch?
Result 1.1
If there is only one `catalog-v001.xml` in UCO, `uco.ttl` would need to be opened with Protégé. Then, `observable:File`'s current state would be viewable through the class navigator.
If instead each ontology directory gets a `catalog-v001.xml`, `observable.ttl` would need to be opened. The rest is as above.
Solution suggestion
* … `catalog-v001.xml` alongside `uco.ttl`, in the directory `${top_srcdir}/ontology/uco/master/`.
* … `catalog-v001.xml`.
Coordination
* … `develop` for the next release
* … `develop` state with backwards-compatible implementation merged into `develop-2.0.0`
* … `develop-2.0.0` (N/A)
* … `develop` for the next release