This issue was moved to a discussion.

N-triples parser not in line with N-triples specification #1276

Closed
csae8092 opened this issue Mar 9, 2021 · 1 comment

csae8092 commented Mar 9, 2021

While trying to parse an N-Triples document with rdflib version 4.2.2

rdflib.Graph().parse(data='<https://arche-curation.acdh-dev.oeaw.ac.at/api/8458> <https://vocabs.acdh.oeaw.ac.at/schema#hasIdentifier> <make\\u0020me> .', format='nt')

an error is thrown:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 140, in parse
    self.parseline()
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 195, in parseline
    object = self.object()
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 228, in object
    objt = self.uriref() or self.nodeid() or self.literal()
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 235, in uriref
    uri = self.eat(r_uriref).group(1)
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 210, in eat
    raise ParseError("Failed to eat %s at %s" % (pattern.pattern, self.line))
rdflib.plugins.parsers.ntriples.ParseError: Failed to eat <([^:]+:[^\s"<>]+)> at <make\u0020me> .

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/rdflib/graph.py", line 1043, in parse
    parser.parse(source, self, **args)
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/nt.py", line 26, in parse
    parser.parse(f)
  File "/usr/lib/python3/dist-packages/rdflib/plugins/parsers/ntriples.py", line 142, in parse
    raise ParseError("Invalid line: %r" % self.line)
rdflib.plugins.parsers.ntriples.ParseError: Invalid line: '<make\\u0020me> .'

The traceback suggests that the object IRI has to match the regex <([^:]+:[^\s"<>]+)>, which is not in line with the N-Triples specification (production [8], IRIREF, in https://www.w3.org/TR/n-triples/#n-triples-grammar): the specification does not require an IRIREF to contain a colon, and it allows the IRIREF to contain Unicode escape sequences.
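To make the mismatch concrete, here is a minimal sketch that runs the failing line against both patterns. `legacy_uriref` is copied from the traceback; `iriref_11` is my own hand-written approximation of the IRIREF and UCHAR productions from the N-Triples 1.1 grammar, not code taken from rdflib, and both variable names are only illustrative.

```python
import re

# Pattern quoted in the rdflib 4.2.2 traceback: it requires a ':' inside <...>.
legacy_uriref = re.compile(r'<([^:]+:[^\s"<>]+)>')

# Assumption: a hand-written approximation of the N-Triples 1.1 grammar,
#   IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
#   UCHAR  ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
uchar = r'\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}'
iriref_11 = re.compile(r'<((?:[^\x00-\x20<>"{}|^`\\]|' + uchar + r')*)>')

# The object term as the parser sees it (a single backslash before u0020).
line = r'<make\u0020me> .'

legacy_match = legacy_uriref.match(line)   # no ':' in the term, so no match
v11_match = iriref_11.match(line)          # matches; the escape is kept verbatim

print("legacy pattern matched:", legacy_match is not None)
print("1.1-style pattern matched:", v11_match is not None)
```

The legacy pattern rejects the term only because it lacks a colon; the `\u0020` escape on its own would pass its second character class. The sketch checks syntax only and does not decode the escape or validate the resulting IRI.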


ashleysommer commented Mar 10, 2021

This is a duplicate of #1245.

In 2004, the first example of an N-Triples grammar was published in the RDF Test Cases recommendation: https://www.w3.org/TR/2004/REC-rdf-testcases-20040210

A Python reference implementation of that grammar was written by the W3C and was later incorporated into RDFLib. That grammar became widespread and went on to become what we now call N-Triples 1.0.

In 2014, the N-Triples 1.1 specification was published (the one you linked to: https://www.w3.org/TR/n-triples/), but nobody has yet written a 1.1-compliant N-Triples parser for RDFLib.

ghost locked and limited conversation to collaborators Dec 25, 2021
ghost converted this issue into discussion #1557 Dec 25, 2021
