Reduce redundant case_validate analysis #72

Merged
merged 3 commits into from
Aug 24, 2023
20 changes: 20 additions & 0 deletions case_utils/case_validate/__init__.py
@@ -50,6 +50,7 @@
    ValidationResult,
)
from case_utils.case_validate.validate_utils import (
    disable_tbox_review,
    get_invalid_cdo_concepts,
    get_ontology_graph,
)
@@ -65,6 +66,7 @@ def validate(
    input_file: Union[List[str], str],
    *args: Any,
    case_version: Optional[str] = None,
    review_tbox: bool = False,
    supplemental_graphs: Optional[List[str]] = None,
    **kwargs: Any,
) -> ValidationResult:
@@ -74,6 +76,7 @@ def validate(
    :param *args: The positional arguments to pass to the underlying pyshacl.validate function.
    :param input_file: The path to the file containing the data graph to validate. This can also be a list of paths to files containing data graphs to pool together.
    :param case_version: The version of the CASE ontology to use (e.g. 1.2.0). If None, the most recent version will be used.
    :param review_tbox: If True, SHACL shapes that review OWL Classes, OWL Properties, and SHACL shapes that constrain those classes and properties will be used in the review. Otherwise, those shapes will be deactivated before running validation. Be aware that these shapes are known to significantly increase the validation run time.
    :param supplemental_graphs: File paths to supplemental graphs to use. If None, no supplemental graphs will be used.
    :param allow_warnings: In addition to affecting the conformance of SHACL validation, this will affect conformance based on unrecognized CDO concepts (likely, misspelled or miscapitalized) in the data graph. If allow_warnings is not True, any unrecognized concept using a CDO IRI prefix will cause conformance to be False.
    :param inference: The type of inference to use. If "none" (type str), no inference will be used. If None (type NoneType), pyshacl defaults will be used. Note that at the time of this writing (pySHACL 0.23.0), pyshacl defaults are no inferencing for the data graph, and RDFS inferencing for the SHACL graph, which for case_utils.validate includes the SHACL and OWL graphs.
@@ -94,6 +97,17 @@ def validate(
    # Get the ontology graph from the case_version and supplemental_graphs arguments
    ontology_graph: Graph = get_ontology_graph(case_version, supplemental_graphs)

    if not review_tbox:
        # This is done because, at the time of pyshacl 0.20.0, the
        # entirety of the ontology graph is mixed into the data graph.
        # UCO 1.0.0 includes some mechanisms to cross-check SHACL
        # PropertyShapes versus OWL property definitions. Because of
        # the mix-in, all of the ontology graph (.validate ont_graph
        # kwarg) is reviewed by the SHACL graph (.validate shacl_graph
        # kwarg), so for UCO 1.0.0 that adds around 30 seconds to each
        # case_validate call, redundantly reviewing UCO.
        disable_tbox_review(ontology_graph)

    # Get the undefined CDO concepts.
    undefined_cdo_concepts = get_invalid_cdo_concepts(data_graph, ontology_graph)

@@ -225,6 +239,11 @@ def main() -> None:
        help='(ALMOST as with pyshacl CLI) Send output to a file. If absent, output will be written to stdout. Difference: If specified, file is expected not to exist. Clarification: Does NOT influence --format flag\'s default value of "human". (I.e., any machine-readable serialization format must be specified with --format.)',
        default=sys.stdout,
    )
    parser.add_argument(
        "--review-tbox",
        action="store_true",
        help='Enable rules for reviewing OWL Classes, Properties, and SHACL shapes that constrain them (i.e. the "TBox", or "Terminological box", of the data graph and ontology graph; in contrast, the "ABox", or "Assertional box", contains the declarations of members of those classes, and users of those properties). This should be used when adding extension classes or properties not adopted by UCO or its downstream ontologies, e.g. when using a drafting namespace. Be aware that these rules are known to significantly increase the validation run time.',
    )

    parser.add_argument("in_graph", nargs="+")

@@ -250,6 +269,7 @@ def main() -> None:
        do_owl_imports=True if args.imports else False,
        inference=args.inference,
        meta_shacl=args.metashacl,
        review_tbox=True if args.review_tbox else False,
        supplemental_graphs=args.ontology_graph,
        **validator_kwargs,
    )
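
The flag wiring in this file can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's actual `main()`: `build_parser` and `run` are hypothetical names, and `run` only returns the keyword arguments that would be forwarded to `validate` rather than calling it.

```python
import argparse
from typing import Any, Dict, List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser assembled in main()."""
    parser = argparse.ArgumentParser()
    # store_true makes the flag default to False, so TBox review is opt-in.
    parser.add_argument("--review-tbox", action="store_true")
    parser.add_argument("in_graph", nargs="+")
    return parser


def run(argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Hypothetical helper: collect the kwargs that would reach validate()."""
    args = build_parser().parse_args(argv)
    return {"input_file": args.in_graph, "review_tbox": bool(args.review_tbox)}


print(run(["example.json"]))
print(run(["--review-tbox", "example.json"]))
```

Because the option uses `store_true`, existing invocations that omit `--review-tbox` take the faster path in which the TBox shapes are deactivated.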
19 changes: 19 additions & 0 deletions case_utils/case_validate/validate_utils.py
@@ -182,3 +182,22 @@ def get_ontology_graph(
ontology_graph.parse(arg_ontology_graph)

return ontology_graph


def disable_tbox_review(graph: rdflib.Graph) -> None:
    l_true = rdflib.Literal(True)
    ns_uco_owl = rdflib.Namespace("https://ontology.unifiedcyberontology.org/owl/")

    for tbox_shape_basename in {
        "DataOneOf-shape",
        "DatatypeProperty-shacl-constraints-shape",
        "Disjointedness-AP-DP-shape",
        "Disjointedness-AP-OP-shape",
        "Disjointedness-C-DT-shape",
        "Disjointedness-DP-OP-shape",
        "ObjectProperty-shacl-constraints-shape",
        "ontologyIRI-versionIRI-prerequisite-shape",
        "versionIRI-nodeKind-shape",
    }:
        n_tbox_shape = ns_uco_owl[tbox_shape_basename]
        graph.add((n_tbox_shape, NS_SH.deactivated, l_true))
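
The effect of `disable_tbox_review` can be illustrated without rdflib. The sketch below is a toy model and not the project's code: it assumes a graph can be approximated by a Python set of string triples, and `deactivate_shapes` and `is_deactivated` are hypothetical names. The two IRIs are the SHACL `sh:deactivated` property and the UCO OWL namespace from the diff above.

```python
from typing import Set, Tuple

# Assumption: a set of (subject, predicate, object) strings stands in for rdflib.Graph.
Triple = Tuple[str, str, str]

SH_DEACTIVATED = "http://www.w3.org/ns/shacl#deactivated"
NS_UCO_OWL = "https://ontology.unifiedcyberontology.org/owl/"


def deactivate_shapes(graph: Set[Triple], basenames: Set[str]) -> None:
    """Toy analogue of disable_tbox_review: flag each named shape as deactivated."""
    for basename in basenames:
        graph.add((NS_UCO_OWL + basename, SH_DEACTIVATED, "true"))


def is_deactivated(graph: Set[Triple], shape_iri: str) -> bool:
    """Hypothetical check a validator could make before applying a shape."""
    return (shape_iri, SH_DEACTIVATED, "true") in graph


graph: Set[Triple] = set()
deactivate_shapes(graph, {"DataOneOf-shape", "versionIRI-nodeKind-shape"})
# Re-adding an existing triple is a no-op, so repeated calls are harmless.
deactivate_shapes(graph, {"DataOneOf-shape"})
print(len(graph))
```

The function only adds triples and never removes shapes, so the ontology graph keeps its TBox shapes while a SHACL validator ignores the ones marked as deactivated.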
31 changes: 31 additions & 0 deletions tests/case_utils/case_validate/uco_test_examples/Makefile
@@ -99,6 +99,37 @@ all: \
	rm __$@
	mv _$@ $@

# NOTE - this more-specific recipe enables "tbox" review, but otherwise
# matches the wildcarded recipe.
owl_properties_XFAIL_validation.ttl: \
  $(examples_srcdir)/owl_properties_XFAIL.json \
  $(tests_srcdir)/.venv.done.log \
  $(top_srcdir)/.ontology.done.log \
  $(top_srcdir)/case_utils/case_validate/__init__.py \
  $(top_srcdir)/case_utils/case_validate/validate_types.py \
  $(top_srcdir)/case_utils/case_validate/validate_utils.py \
  $(top_srcdir)/case_utils/ontology/__init__.py
	source $(tests_srcdir)/venv/bin/activate \
	  && case_validate \
	    --allow-warnings \
	    --debug \
	    --format turtle \
	    --review-tbox \
	    $< \
	    > __$@ \
	    ; rc=$$? ; test 0 -eq $$rc -o 1 -eq $$rc
	@#Fail if output is empty.
	@test -s __$@ \
	  || exit 1
	java -jar $(RDF_TOOLKIT_JAR) \
	  --inline-blank-nodes \
	  --source __$@ \
	  --source-format turtle \
	  --target _$@ \
	  --target-format turtle
	rm __$@
	mv _$@ $@

check: \
  $(validation_ttls)
	source $(tests_srcdir)/venv/bin/activate \