Retain relative URIs at the RDF model level #62
Comments
@dbooth-boston is it really something to be done on the model? I realize the difficulty of minting all types of URIs, but isn’t a serialization a better place? Turtle already allows relative URIs, and it may be enough to define some processing rules for Turtle (and for other languages like JSON-LD) that let the ‘base’ be defined at processing time (or something like that...)
@iherman , If RDF processing were limited to Turtle-only tools that would always retain relative URIs, then that approach could work. But if we want to allow the full range of RDF tools to be used -- including SPARQL stores -- then those relative URIs need to be retained uniformly across all tools, which to my mind probably means those relative URIs should be respected at the RDF model level. The problem right now is that relative URIs disappear when the RDF is processed: they are forever transformed into absolute URIs.
@dbooth-boston this is by design, as one of RDF's main strengths is global identifiers.
@namedgraph , yes I am well aware of that, and it is important. However, there is a significant downside to requiring absolute URIs, as issue #12 explains. IMO the way things are right now, rigid adherence to global identifiers is causing more harm than good. If we had an easy way for people to allocate mnemonic permanent absolute URIs (instead of relative URIs) then that might be a better solution. But so far I have not seen any approach that is easy enough, and many have been proposed. Any approach that requires domain name ownership is too much of a barrier, and approaches like UUIDs are not mnemonic and not guaranteed unique. (In theory UUIDs should be unique enough for most use cases when generated from truly random sources. But in practice most generators are pseudo-random, with unknown entropy, which means you do not really know how unique they are, and that causes FUD.) For debugging, it is important to support URIs that are mnemonic and/or based on natural keys.
@namedgraph says
I'm right with this.
If someone had told me many years ago that I could use relative URLs and did not need to put money down for a DNS name, I would have started on my project then rather than years and years later when I realized I could just use "http://example.com" as my base URL and triplestores would not care. I assumed that I had to shell out money for a DNS name before I could even get started. If someone had basically said: "You do not need to have all this infrastructure (web server, DNS, SPARQL endpoint, setup, maintenance, etc.) and you can just play in your own little pool until you feel comfortable enough to open things up.", I would have been able to move forward more quickly. Or maybe I am just slow? Either way, I feel relative URIs would make playing with this stuff far easier for people who just want to get started and not have to deal with everything else.
In my world there is no reason that would justify not having absolute URIs at the model (not syntax) level, so to me this discussion is moot.
@namedgraph: I doubt anyone would seriously contest the need for absolute URIs at the model level, and the interpretation of relative URIs relative to the URI of a named graph is an easy way to enforce this. But even if the model spec is done without reference to any particular serialization, it would be a good place to articulate the recommendation that serializations of the RDF data model should support relative URIs as syntactic sugar that automatically resolves into absolute URIs, as this would improve the acceptability of RDF to newbies.
Out of curiosity, in what way is this not the case currently? N-Triples doesn't; Turtle and JSON-LD do. That's not to say work isn't needed - the implications of relative URIs are not easy as documents get moved or cached, and the resolution mechanism (RFC 3986 sec 5.1) could be explained in ways more specific to use with RDF documents.
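To make the moved-document concern concrete, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which resolves a relative reference against a base URI. The base URIs and the relative reference `jane` are made up for illustration; nothing here is specific to any RDF toolkit.

```python
from urllib.parse import urljoin

# Hypothetical relative reference, as it might appear in a Turtle file.
relative = "jane"

# Hypothetical base URIs: the document's original location and a later copy.
original_base = "http://example.com/data/people.ttl"
moved_base = "http://example.org/archive/people.ttl"

# The same relative reference resolves to two different absolute URIs,
# so a parser that resolves eagerly yields two different identifiers.
resolved_original = urljoin(original_base, relative)
resolved_moved = urljoin(moved_base, relative)

print(resolved_original)  # http://example.com/data/jane
print(resolved_moved)     # http://example.org/archive/jane
```

Once a parser has performed this resolution, the original string `jane` is no longer recoverable from the absolute URI alone, which is the loss the issue describes.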
We have seen a lot of interest in document and base vocabulary relative IRIs in the JSON-LD space. Which is not to say that I personally am looking for this functionality, but to endorse that it has proponents in practice.
Some options that are available now:
At the RDF model level, relative URIs currently do not exist: all URIs are absolute. Even though relative URIs are permitted in some RDF serializations, they are converted to absolute URIs during processing, and therefore lost.
URI allocation is a problem in RDF, as explained in issue #12, because allocating permanent absolute URIs is considerably more difficult in practice than in theory. One way to reduce the difficulty of URI allocation would be to allow relative URIs at the RDF model level, so that they are retained during processing. Relative URIs would allow the author to allocate mnemonic URIs based on natural keys, without incurring the up-front burden of assigning permanent absolute URIs. If desired, those relative URIs could be changed later to permanent absolute URIs.
Since relative URIs are only unique within a particular scope of use -- such as a file -- when combining data from different scopes or sources, those relative URIs should be renamed prior to merging RDF data, in a way that ensures continued uniqueness in the merged result. Two possibilities:
Permanent absolute URIs could be assigned prior to merging.
New relative URIs could be assigned prior to merging, by prepending source tags to the old relative URIs. For example, if you are merging data from two sources x and y, then you could prepend "x." or "y." to all relative URIs in those sources (respectively) before merging. The relative URI <jane> from source x would become <x.jane>, while the relative URI <jane> from source y would become <y.jane>. This would guarantee continued uniqueness in the merged result.
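The second possibility could be sketched roughly as follows. This is only an illustration under simplifying assumptions: triples are plain tuples of URI strings (no literals or blank nodes), a URI reference counts as relative when it has no scheme, and the helper names `is_relative` and `tag_relative` are hypothetical rather than part of any existing RDF library.

```python
from urllib.parse import urlsplit

KNOWS = "http://xmlns.com/foaf/0.1/knows"

def is_relative(uri):
    """Assumption: a URI reference with no scheme is treated as relative."""
    return urlsplit(uri).scheme == ""

def tag_relative(triples, tag):
    """Return a copy of `triples` with `tag.` prepended to every relative URI."""
    def rename(term):
        return f"{tag}.{term}" if is_relative(term) else term
    return {tuple(rename(term) for term in triple) for triple in triples}

# Two hypothetical sources that both use the relative URI <jane>.
source_x = {("jane", KNOWS, "bob")}
source_y = {("jane", KNOWS, "alice")}

# Rename first, then merge: the two <jane> resources stay distinct.
merged = tag_relative(source_x, "x") | tag_relative(source_y, "y")
for triple in sorted(merged):
    print(triple)
```

Absolute URIs such as the predicate pass through unchanged, so only the locally scoped names are affected by the renaming.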
This approach would require a small change to the RDF standards. Note that this change was previously proposed for RDF 1.1, but not adopted. (The RDF 1.1 charter was tightly constrained for backward compatibility with RDF 1.0.)
Tools should also be updated to:
retain relative URIs while processing -- details TBD;
(optionally) rename relative URIs when merging data, by prepending a source tag to each relative URI; and
(optionally) warn when merging data containing potentially conflicting relative URIs.
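As a rough idea of what the optional warning might look like, the sketch below reports relative URIs that occur in both inputs before a merge. It assumes triples are tuples of URI strings and that "relative" means "has no scheme"; the function names `relative_uris` and `conflicting_relative_uris` are hypothetical, and a real tool would need to decide how retained relative URIs are represented in the first place (details TBD, as noted above).

```python
import warnings
from urllib.parse import urlsplit

KNOWS = "http://xmlns.com/foaf/0.1/knows"

def relative_uris(triples):
    """Collect every term that has no URI scheme (assumed to be relative)."""
    return {term for triple in triples for term in triple
            if urlsplit(term).scheme == ""}

def conflicting_relative_uris(first, second):
    """Relative URIs used by both sources, which may denote different things."""
    return relative_uris(first) & relative_uris(second)

# Hypothetical inputs: both sources mention <jane>.
source_x = {("jane", KNOWS, "bob")}
source_y = {("jane", KNOWS, "alice")}

conflicts = conflicting_relative_uris(source_x, source_y)
if conflicts:
    warnings.warn(f"relative URIs present in both sources: {sorted(conflicts)}")
```

A shared relative URI is only a potential conflict: the two sources may or may not mean the same resource, which is why this is a warning and not an error.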
See also TimBL's Design Issues note on relative URIs