-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sorting Turtle output? #1890
Comments
Would be happy with such an option, and it would be useful with other formats also. My only concern is that our current approach for adding options (i.e. using **kwargs) is a bit adhoc - it would be nice to consider an option that provides validation and type safety somehow. |
Assuming you mean validation of URIs, informed opinion on stackoverflow is that URI validation is best left to a dedicated library so that would suggest the necessity of acquiring a dependency but I guess that can be handled via the pip |
No I mean that provides a way to validate the options independently, maybe something like: plugins.get("json-ld", Serializer).make_opts(...) -> JSONLDOptions and then make it possible to pass serializer options into serialize not entirely sure it is that critical though. |
Ah, so. Yes, that would address the needs of those wishing to exert more control over the JSONLD serializer. |
Interesting. I had thought that typing for keyword arguments would be compatible with a The following program runs fine through Python, print #!/usr/bin/env python3
# This software was developed at the National Institute of Standards
# and Technology by employees of the Federal Government in the course
# of their official duties. Pursuant to title 17 Section 105 of the
# United States Code this software is not subject to copyright
# protection and is in the public domain. NIST assumes no
# responsibility whatsoever for its use by other parties, and makes
# no guarantees, expressed or implied, about its quality,
# reliability, or any other characteristic.
#
# We would appreciate acknowledgement if the software is used.
from typing import Any
def g(x: int, *args: Any, y: str = "", **kwargs: Any) -> None:
if y == "a":
print(x)
def f(x: int, *args: Any, **kwargs: Any) -> None:
g(x, *args, **kwargs)
f(1)
f(2, True)
f(3, True, y="a")
f(4, True, y=True)
f(5, True, y=77)
f(6, True, y=b"c") If another keyword parameter is added to def f(x: int, *args: Any, z: str = "", **kwargs: Any) -> None:
g(x, *args, **kwargs) and adding this last call for f(7, True, y=b"c", z=4444) Summarizing for my original request: if the "front"-most Or, I'm totally missing something else with how the serializer plugin framework handles things. Please do note if that's so, I haven't reviewed that code in depth. |
This patch adds a test to start specifying what sorting Turtle output would look like. This is intended to start discussion about expectations of blank node sorting, and to set an initial interface for triggering sorted output with a propagated keyword argument in `Graph.serialize()`. This patch will fail CI, but should not fail for code-style reasons. The new test script was reviewed with black, flake8, isort, and mypy (--strict). References: * RDFLib#1890 Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
Some work I'm doing would benefit from seeing this implemented. So, to resume this conversation, I've filed PR 1978. I have a general question on the API for I've personally found it difficult to understand when arguments in functions are specified as positional with defaults, versus being keyword arguments, when there isn't a In case the motivation of this question is unclear, Patch 2 in PR 1978 tries adding The PR also leaves open some question of what would be expected of some sort-order matters. Blank node sorting seems to be a tricky topic that may involve recursion to solve. Prefixes that sort differently from expanded IRIs isn't in the unit test, but will also be worth clarifying with the hard-coded graph serialization. Feedback is welcome. |
FYI you might be interested in https://github.com/tgbugs/pyontutils/tree/master/ttlser and https://github.com/tgbugs/pyontutils/blob/master/ttlser/docs/ttlser.md. |
@tgbugs Thank you for the reference! I see it is in a many-utilities repository. Is your vision that that code be something you continue to maintain in your code base, or instead to be something that integrates into, or overrides, the associated PR 1978? I'm fine with not reliving the whole "Hack, hack, oh we need a spec." experience, and I knew at least blank node handling was going to need some degree of style spec. |
@ajnelson-nist The spec is stable, and there are a bunch of tests that I have written to ensure sane behavior. I intend to upstream as much of the code in ttlser as possible. The implementation that I have is almost completely configurable using I can also say that there are many hidden gotchas that any new implementation of such functionality is likely to re-encounter. |
This patch aligns the type signatures on `Serializer` subclasses, including renaming the arbitrary-keywords dictionary to always be `**kwargs`. This is in part to prepare for the possibility of adding `*args` as a positional-argument delimiter. References: * RDFLib#1890 (comment) Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
This patch aligns the type signatures on `Serializer` subclasses, including renaming the arbitrary-keywords dictionary to always be `**kwargs`. This is in part to prepare for the possibility of adding `*args` as a positional-argument delimiter. References: * #1890 (comment) Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
Hi all, probably esp. @nicholascar,
PR 1425 added the
ttl2
serializer, which since becamelongturtle
. One of the goals was to reduce Git difference noise.I unfortunately find that most of the noise in Git diffs comes from Turtle output not being sorted. This is an effect both on nodes with IRIs and with blank nodes that (if I'm guessing right) may be sorting based on their skolemized IDs.
Could an option to either of the Turtle serializers be added to sort the output, for those willing to pay the
.serialize
time cost? I personally like Git-tracking demonstration results, so in many of my uses I'm willing to pay the time cost of sorting graphs.The text was updated successfully, but these errors were encountered: