Sort Turtle output #1978
This patch adds a test to start specifying what sorting Turtle output would look like. This is intended to start discussion about expectations of blank node sorting, and to set an initial interface for triggering sorted output with a propagated keyword argument in `Graph.serialize()`.

This patch will fail CI, but should not fail for code-style reasons. The new test script was reviewed with black, flake8, isort, and mypy (--strict).

References:
* RDFLib#1890

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
Something that may be of interest to you:

class OrderedMemory(Memory):
    def __init__(self, configuration=None, identifier=None):
        super().__init__(configuration, identifier)
        self.__spo = OrderedDict()
        self.__pos = OrderedDict()
        self.__osp = OrderedDict()
        self.__namespace = OrderedDict()
        self.__prefix = OrderedDict()
        self.__context_obj_map = OrderedDict()

I wanted stable output for test reports, so it is easier to diff them and so they are not just noisy spam when added to git, so I made that. Seems to work, but it depends on order that things are added, and I'm naming blank nodes to make it work. Doing something similar with SortedDict may also work.
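The caveat in the comment above (stable output only as long as triples are added in the same order) can be illustrated with a small standalone sketch. It does not use rdflib; `insertion_ordered_index` and the `ex:` terms are hypothetical names chosen for illustration, and the index is only a rough stand-in for a subject-keyed store.

```python
from collections import OrderedDict


def insertion_ordered_index(triples):
    """Index (predicate, object) pairs by subject, keeping insertion order."""
    index = OrderedDict()
    for s, p, o in triples:
        index.setdefault(s, []).append((p, o))
    return index


# Hypothetical triples, written as already-abbreviated strings.
triples = [
    ("ex:b", "ex:p", "ex:o1"),
    ("ex:a", "ex:p", "ex:o2"),
]

# Same triples, added in two different orders.
forward = list(insertion_ordered_index(triples))
backward = list(insertion_ordered_index(reversed(triples)))

print(forward)   # ['ex:b', 'ex:a']
print(backward)  # ['ex:a', 'ex:b']

# Sorting the keys removes the dependence on insertion order.
print(sorted(forward) == sorted(backward))  # True
```

An insertion-ordered container gives repeatable output only when the producer adds triples in a repeatable order; a sorted container (or sorting at serialization time) would not need that guarantee.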
@aucampia - thanks for the strategy suggestion. I just gave After reviewing the code, there's a chance I think one end-result of this analysis is that making consistently-sorting Does this way of thinking sound like it's on the right track? Unrelatedly, I was using |
@ajnelson-nist do you want to pick this up again, now that we have PRs being merged again? Also, is this in line with recent W3C work on canonical hashing of RDF graphs:
+1 would love to have this in the lib, lmk how i can help |
Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This patch aligns the type signatures on `Serializer` subclasses, including renaming the arbitrary-keywords dictionary to always be `**kwargs`. This is in part to prepare for the possibility of adding `*args` as a positional-argument delimiter.

References:
* RDFLib#1890 (comment)

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
I've made some updates to align type signatures down through the Backwards-incompatibility:

Meanwhile, I have not had time to look deeply at @nicholascar 's link, and I'm likely to be distracted from it before 2025.

@sneakers-the-rat , if you would like to pick this PR up, please either file PRs against the branch in my fork, or file a superseding PR here. I think either's fine, as we can merge each other's branches if there's need to sync up.

FWIW, I'm still not well exercised on Poetry, so to keep down on commits flailing against CI I've been using this Makefile. In a fresh clone of this repository, checking out my

#!/usr/bin/make -f
# Portions of this file contributed by NIST are governed by the
# following statement:
#
# This software was developed at the National Institute of Standards
# and Technology by employees of the Federal Government in the course
# of their official duties. Pursuant to Title 17 Section 105 of the
# United States Code, this software is not subject to copyright
# protection within the United States. NIST assumes no responsibility
# whatsoever for its use by other parties, and makes no guarantees,
# expressed or implied, about its quality, reliability, or any other
# characteristic.
#
# We would appreciate acknowledgement if the software is used.

SHELL := /bin/bash

all: check

.venv.done.log: \
  devtools/requirements-poetry.in
	test ! -r venv/bin/pre-commit \
	  || ( \
	    source venv/bin/activate \
	      && pre-commit uninstall \
	  )
	rm -rf venv
	python3 -m venv venv
	source venv/bin/activate && pip install --upgrade pip
	source venv/bin/activate && pip install pre-commit
	source venv/bin/activate && pre-commit install
	source venv/bin/activate && pip install --requirement devtools/requirements-poetry.in
	source venv/bin/activate && poetry install --with dev
	touch $@

check: \
  .venv.done.log
	source venv/bin/activate \
	  && mypy rdflib test
	source venv/bin/activate \
	  && pytest -vv test/test_turtle_sort_issue1890.py
Summary of changes
This patch series starts an API for sorting graph serializations, beginning with Turtle. The main objective is to produce consistent Turtle output, no matter the order of RDF triples being added to a `Graph`.

This PR will close Issue 1890.
The initial pair of patches only starts the PR. Some discussion will be needed to design the remaining patches.
The PR's total effects are expected to be additive and preserve backward compatibility.
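As a rough illustration of the stated objective (identical output regardless of the order in which triples were added), here is a minimal standalone sketch. It is not the PR's implementation and does not use rdflib: `serialize_sorted` and the `ex:` terms are hypothetical, terms are plain pre-abbreviated strings, and blank nodes are deliberately left out because their ordering is the open design question in this PR.

```python
from itertools import groupby


def serialize_sorted(triples):
    """Render triples as Turtle-like text, sorted by subject, predicate, object."""
    blocks = []
    # De-duplicate, sort lexicographically, then group consecutive triples by subject.
    for subject, group in groupby(sorted(set(triples)), key=lambda t: t[0]):
        pairs = [f"    {p} {o}" for _, p, o in group]
        blocks.append(subject + "\n" + " ;\n".join(pairs) + " .")
    return "\n\n".join(blocks) + "\n"


# Hypothetical triples; the second list is the first in reverse order.
added_one_way = [
    ("ex:b", "ex:p", "ex:o"),
    ("ex:a", "ex:q", "ex:o"),
    ("ex:a", "ex:p", "ex:o"),
]
added_other_way = list(reversed(added_one_way))

# Both insertion orders yield byte-identical text.
print(serialize_sorted(added_one_way) == serialize_sorted(added_other_way))  # True
print(serialize_sorted(added_one_way))
```

Plain lexicographic sorting is enough here only because every term has a stable name. Blank nodes with generated labels would break that assumption, which is why their treatment needs the discussion this PR asks for.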
Checklist
`./examples` for new features. (Not yet; unknown what to update. Hard-coded sorted graph in new unit test can be copied to desired location.)
`CHANGELOG.md`).