Make serialize() on a CONSTRUCT result act like normal g.serialize() #1338

nicholascar · 2021-06-11T00:39:14Z

Currently (RDFlib 6.0.0a0 AKA master), g.serialize() defaults to turtle format and is decoded, however calling serialize() on a SPARQL CONSTRUCT result (i.e. on a Result object) returns the RDFlib 5.0.0 form of an RDF/XML, UTF-8-encoded result.

The old style serialize() in Result should be updated to match the main Graph() serialize().

The text was updated successfully, but these errors were encountered:

aucampia · 2021-09-17T21:10:12Z

One slight conundrum here is that the default format was "xml", and this was valid for both tabular and graph results. I'm working on this now and I think the best solution is to default format to None instead, and then if None, and result is graph, use turtle, otherwise use txt.

Feedback on the idea will be welcome, but hopefully I will have a patch soon.

nicholascar · 2021-09-17T22:13:28Z

the best solution is to default format to None instead, and then if None, and result is graph, use turtle, otherwise use txt.

Sounds about right!

And I hope to introduce optional outputs to PANDAS Dataframes across all serialisation options soon as people often want to export results to PANDAS. However such an option will require PANDAS to be imported so has to be one of those optional, additional imports that I don’t quite know how to do but that I’ve seen elsewhere: import rdflib[pandas] or similar.

aucampia · 2021-09-21T22:13:12Z

I have been working on this, but I'm at somewhat of an impasse here, because the serializers are actually quite inconsistent.

Currently the rdf serializers treat stream destinations as IO[str] (jsonld, n3, nquads, nt, rdfxml) while result Result serializers seems like they will work for both IO[str] and IO[bytes], except for txtresults, which will only work for IO[str]. I think the best solution is to make ResultSerializers work for both IO[str] and IO[bytes]. I will use type hints to prohibit the use of encoding when IO[str] is supplied, and mandate encoding when IO[bytes] is supplied.

Any feedback on the matter will be appreciated.

nicholascar · 2021-09-22T00:15:04Z

I think we are seeing the result of a poor transition from Py2 to Py3 here. Presumably I and others missed serialization options for things other than the RDF serializers, i.e. ResultSerializers, and there may be others.

If any consistency can be achieved here, that would be great. I guess we'll then just have to look for other serializers too, perhaps the extra CVS ones, Trig, TriX etc.

aucampia · 2021-09-26T20:19:16Z

I have done some digging to make sense of the IO types in python, and thought a bit how to actually deal with this situation.

Some related inquiries and issues, mostly related to typing:

I think if there was a clean slate, the best option would have been to only accept BinaryIO like buffers (i.e. io.RawIOBase or io.BufferedIOBase), with optional encoding, and if no encoding is supplied, and if the serializer supports multiple encodings, default to system preferred encoding (similar to what TextIOWrapper does).

However, given that some ResultSerializers work with BinaryIO, and some with TextIO, this will break compatibility, so probably the best compromise to maintain interface is for ResultSerializer.serialize() to accept both BinaryIO with optional encoding, and TextIO without any encoding. And then when ResultSerializer.serialize() defers to Graph.serialize(), just use TextIO.buffer and TextIO.encoding and pass that to Graph.serialize().

Further I will also try and ensure the default encoding is utf-8 throughout all encoders.

For formats that only allow one encoding (turtle) I think we should reconsider the behaviour if unsupported encodings are requested. Either we should always support arbitrary encodings, or always raise an exception if an unsupported encoding is supported. The current approach is to warn if an unsupported encoding is requested, and this may result in unexpected behaviour. I will however defer changing this for now, as it can be dealt with in another PR.

nicholascar added the 6.0.x label Jul 21, 2021

aucampia mentioned this issue Sep 21, 2021

Make Result.serialize work more like Graph.serialize #1418

Closed

ghost locked and limited conversation to collaborators Dec 27, 2021

ghost converted this issue into discussion #1612 Dec 27, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

This issue was moved to a discussion.

Make serialize() on a CONSTRUCT result act like normal g.serialize() #1338

Make serialize() on a CONSTRUCT result act like normal g.serialize() #1338

nicholascar commented Jun 11, 2021

aucampia commented Sep 17, 2021

nicholascar commented Sep 17, 2021 •

edited

Loading

aucampia commented Sep 21, 2021 •

edited

Loading

nicholascar commented Sep 22, 2021

aucampia commented Sep 26, 2021 •

edited

Loading

This issue was moved to a discussion.

This issue was moved to a discussion.

Make serialize() on a CONSTRUCT result act like normal g.serialize() #1338

Make serialize() on a CONSTRUCT result act like normal g.serialize() #1338

Comments

nicholascar commented Jun 11, 2021

aucampia commented Sep 17, 2021

nicholascar commented Sep 17, 2021 • edited Loading

aucampia commented Sep 21, 2021 • edited Loading

nicholascar commented Sep 22, 2021

aucampia commented Sep 26, 2021 • edited Loading

This issue was moved to a discussion.

nicholascar commented Sep 17, 2021 •

edited

Loading

aucampia commented Sep 21, 2021 •

edited

Loading

aucampia commented Sep 26, 2021 •

edited

Loading