Property Graphs #45
Comments
I also think support for property graphs is very important. However, my strong hope is that we adopt a mechanism for n-ary relations that subsumes property graphs as a special case, so that we do not need a separate mechanism. So far I have not seen any big barriers to such an approach. My 2 cents on some of your questions:
Agreed. And I find myself recoiling in horror at the mere mention of reification. In my view, RDF reification should be deprecated, since named graphs are generally much better, though not needed for property graphs.
Then a URI should be used, consistent with existing RDF practice.
Although that could be done in existing TriG (for example) I do not think it should be supported in a new higher-level RDF language. I think an RDF molecule that represents an n-ary relation should exist entirely in each graph where it is used, and should be considered malformed if one tries to put part of it in one graph and part in another. The reason is that the user, by creating it as an n-ary relation, intended it to be treated as a single unit. However, there would be nothing wrong with asserting some new triples or a new n-ary relation that makes use of some of the constituents of another n-ary relation.
My gut feeling is that that should be done by attaching additional metadata triples to the graph URI, such as provenance.
Yes. My assumption is that by coming up with a standard way to define n-ary relations, this ability will fall out as a natural consequence: a particular group of triples will be automatically identifiable as an n-ary relation comprised of those properties.
Interesting idea! I wonder how the scope could be known, so that the interpretation would be stable in the face of changing data. It would be bad if x.foo were to select one property against one set of data, but a different property if more data were added. Anyone have thoughts on how this could be done? |
I think that is tied to the cardinality of the property, i.e. whether "foo" is constrained to a singular value or can have multiple values (via multiple links with the same subject and predicate). Following the given path may thus return a set of nodes containing zero, one or multiple nodes. When we look at how to model n-ary chunks, we should also look at associated metadata including cardinality constraints, composite keys and so forth. What metadata would make data and rules easier to use by the vast majority of developers? Path following is related to regular expressions and RDF shapes, as well as to XPath for XML. I've explored it in some experiments inspired by ATNs, see https://www.w3.org/WoT/demos/shrl/test.html p.s. I am using the term chunk as it is popular in Cognitive Science and features prominently in cognitive architectures like CMU's ACT-R. |
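To make the cardinality point concrete, here is a minimal sketch of path following over plain triples. The function name `follow`, the tuple encoding and the sample data are assumptions made for illustration only; they are not taken from the linked demo. The point is only that a path always yields a set, which may hold zero, one or several nodes.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property name, object)


def follow(triples: Iterable[Triple], start: str, path: str) -> Set[str]:
    """Follow a dotted path of property names from `start`.

    Returns the set of nodes reached, which may be empty, a single
    node, or several nodes, depending on the data.
    """
    triples = list(triples)
    current = {start}
    for name in path.split("."):
        # One step: every object linked from any current node via `name`.
        current = {o for (s, p, o) in triples if s in current and p == name}
    return current


# Hypothetical data: "foo" has two values on x, "bar" has one.
triples = [
    ("x", "foo", "a"),
    ("x", "foo", "b"),
    ("x", "bar", "c"),
    ("a", "baz", "d"),
]

multi = follow(triples, "x", "foo")
single = follow(triples, "x", "bar")
chained = follow(triples, "x", "foo.baz")
empty = follow(triples, "x", "missing")
```

Because the result is always a set, adding more data can enlarge the result but does not change which property a name selects.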
If someone wrote x.foo as a path, using short names, then I assume that each corresponding long name would be comprised of a namespace plus the short name. How would the system know which namespace to prepend to the short name? For example, if the current namespaces included both http://example/a# and http://example/b#, how would the system know whether foo should be expanded to http://example/a#foo or http://example/b#foo? Or do you envision this working some other way? |
No, that isn't the case. This is just a graph of objects where the object properties act as links to other objects, and each object property has a name that is scoped to that object. In RDF terms, the subject node + the property name provides a map to a predicate, and uniquely identifies a set of triples with that subject and predicate. A restriction on this would be to constrain property names to uniquely identify predicates in this graph. This is tantamount to saying that the property name uniquely identifies the meaning of a property, rather than this being something specific to each object. That is an overly strong constraint as in the real world, words are often used for different meanings depending on the context. However, there is nothing to prevent implementations from optimising how they handle this internally. |
I would like to pursue the possibility of encoding property graphs in standard RDF. Have others already done this? If so, what RDF patterns were used, and what limitations did they have? |
Apart from reification, one approach that has been mentioned is to use a named graph that contains just the triple you want to annotate. This generalises to annotations on multiple triples, but I am unsure how you indicate that a given triple is in multiple named graphs. Another challenge is how you identify a graph when there isn't an explicit name for it, e.g. when using curly braces in Turtle* around the triples you want to annotate, this would imply an implicit blank node for the associated graph. This makes me think about how to deal with graphs from an implementation perspective. One idea is to express the relationship between a triple and a graph as a property of the triple, where the property can have multiple values. Another idea is to allow for relationships between graphs, e.g. for one graph to be subsumed as part of another graph. A database could create its internal identifiers, and associate them with external identifiers when those are defined. I wonder how this is dealt with by existing property graph database solutions? |
You make several quads having the same <s,p,o>. |
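As a rough illustration of that answer, the sketch below stores quads as plain tuples and indexes them by triple, so that one triple can belong to several named graphs. The `ex:` names, the `graphs_by_triple` helper and the metadata quad are all hypothetical.

```python
from collections import defaultdict

# Quads are (subject, predicate, object, graph). The same <s,p,o>
# appears twice, once in each named graph that contains it.
quads = [
    ("ex:alice", "ex:knows", "ex:bob", "ex:g1"),
    ("ex:alice", "ex:knows", "ex:bob", "ex:g2"),
    ("ex:bob", "ex:knows", "ex:carol", "ex:g1"),
    # Annotation attached to the graph name itself (assumed vocabulary).
    ("ex:g1", "ex:source", "survey-2019", "ex:meta"),
]


def graphs_by_triple(quads):
    """Map each (s, p, o) to the set of graph names that contain it."""
    index = defaultdict(set)
    for s, p, o, g in quads:
        index[(s, p, o)].add(g)
    return index


index = graphs_by_triple(quads)
alice_knows_bob = index[("ex:alice", "ex:knows", "ex:bob")]
```

Seen this way, graph membership behaves like a multi-valued property of the triple, which is close to the implementation idea raised in the previous comment.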
The part in bold is not true. Node and Link (respectively Vertex and Edges) properties are plain old hashmap, JSObject or
Yes.
What is reification? I looked around and I still don't understand.
That is exactly what I meant by "it is advanced use" in this comment.
I think we should come up with a representation of a property graph before trying to generalise to recursive or hierarchical graphs or a "meta-graph". True story: as part of a foolish attempt to replace the atomspace, I was thinking about how to implement this kind of thing. Basically a single entity called the
What are "n-ary relations", please?
That is what Gremlin (from TinkerPop) mostly does. It is written like graph.vertices.filter(lambda x: x.type == 'actor').outgoing.filter(lambda x: x.genre == 'science-fiction') |
Some time ago I wrote an article on how to build a graph database on top of EAV. You can find it at https://hyper.dev/blog/diy-graph-database-in-python.html. EAV is somewhat like a triplestore but you cannot have multiple triples with the same |
Hello, I'd like to pick up this topic and discuss a specific question: How can you distinguish a property from a relation ? In RDF that is not possible, because there is no such distinction. Example:
and
are completely equal in the sense that they are simple statements. But the meaning is very different because Alice and Bob are persons, they are entities i.e. they are things (resources) which have distinct existence.
Of course, you can say that a mailbox is also an entity, but whether or not something is an entity is a decision made by the domain model. I think this is the crucial question when you want to bring Property Graph and RDF together! I also want to point out that Property Graph is a technical way to do ER modelling. My vision is to create a unified graph model that embraces ER modelling and RDF at once. |
In RDF you distinguish between URI resources as objects or datatype resources as objects. Absolute terms like "not possible" do not help IMO because while it may seem so to you coming from a different background, there are very good reasons why RDF is like it is, and formal theory behind them. RDF was designed for data interchange. Are you familiar with RDF-star? |
Sure, I know all that, and I am familiar with RDF*. Of course you can write
We can agree that a literal is property value. But it can be more difficult than that:
What about |
Entity is not an RDF term. If we're talking ontological modeling, a related term would be class. Sure you can call the price entity, and the euro as well. Why is that a problem? |
After re-reading some of this thread, I notice that I missed a couple of questions from @amirouche a couple years ago. Sorry!
See this brief explanation and this answer on stackoverflow.
See Defining N-ary Relations on the Semantic Web. And addressing newer comments from @mhedenus :
Can you please first explain what distinction you are trying to make between a "property" and a "relation"? AFAIK we do not have widely accepted standard definitions of those terms that clearly distinguish between them. If you could explain what distinction you are trying to make, it would be helpful. Also, please explain what you mean by "entity", and why you think some things should be considered entities and some should not. When you wrote "they are entities i.e. they are things (resources) which have distinct existence" it sounds like you are using the term "entity" to mean what RDF calls a "resource". But then when you suggest that some things should be considered entities and some should not, that sounds different than the RDF notion of "resource", so I am confused. Can you explain what you mean by "entity" and how it is different from what RDF calls a "resource"? |
Maybe I should clarify what it is all about. I have been an advocate of RDF since I learned about it 20 years ago (I have been programming since 1988 and doing Java development since 2000). I worked very hard to establish RDF as a technology in my company in the automotive industry. Currently, we use RDF primarily for data integration. But can you use RDF for modelling, e.g. using RDFS or OWL? I think not. The reality is: modelling is hard, especially because domain experts are normally not software developers. When you start talking about URIs, resources and stuff they only understand blah blah blah. What people understand (even mechanical engineers) is ER modelling. They understand that there are things (entities or objects) which have properties (or attributes) and they have relations to other things. Let's make an (over-)simplification here: there are two main graph modelling worlds:
Can these worlds be brought together? Yes. We have developed a graph model that is a Property Graph compatible with RDF. |
RDF is talking about resources. Everything that can be identified with a URI is a resource.
So far so good. Now let's express the fact that "Alice has an email address":
That is legal in RDF and it makes complete sense in RDF. But these statements have different meanings which are only obvious to human readers. In the first statement the mailto URI is an identifier for something we call Alice, in the second statement the same URI is a value that belongs to a property owned by Alice. Do you agree? |
You have used the same URI. But what does this have to do with implementing property graphs in RDF? I don't understand where you're going with this example. |
Yes, this URI collision is not nice, and should be avoided. This example should demonstrate what I think is the stumbling block when you try to implement property graphs in RDF. When you try to map RDF to a property graph you have to know whether the statement's object is another node (and therefore the predicate a relation) or a property value (and therefore the predicate a property type or key). To say all URIs are mapped to nodes in the property graph and ONLY statements with literals are properties would be an artificial restriction. To solve this, some additional information is required that tells you which predicates are considered to be relations and which predicates are considered to be properties. |
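A minimal sketch of such a mapping, assuming the additional information is supplied as an explicit set of relation predicates. The function name, the tuple encoding and the choice of which FOAF predicate counts as a relation are illustrative assumptions, not a proposal from this thread.

```python
def rdf_to_property_graph(triples, relation_predicates):
    """Split triples into property-graph nodes and edges.

    `relation_predicates` is the extra information discussed above:
    predicates in this set become edges, all others become node
    properties, regardless of whether the object looks like a URI.
    """
    nodes = {}  # node id -> {property key: [values]}
    edges = []  # (source id, label, target id)
    for s, p, o in triples:
        nodes.setdefault(s, {})
        if p in relation_predicates:
            nodes.setdefault(o, {})  # the object is another node
            edges.append((s, p, o))
        else:
            nodes[s].setdefault(p, []).append(o)  # the object is a value
    return nodes, edges


triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    # A URI used as a property value, not as a node.
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
]

nodes, edges = rdf_to_property_graph(triples, relation_predicates={"foaf:knows"})
```

With a different `relation_predicates` set the same triples yield a different property graph, which is exactly why the mapping cannot be derived from the triples alone.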
I still don't get why developers' unfamiliarity with a technology is being framed as a deficiency of the technology, and not of the developer. This seems to be a constant theme for EasierRDF. Many more developers know Javascript than C++. Does that make C++ academic, and by that somehow deficient? Should we have EasierC++? If developers are familiar with ER or UML or whatever, then provide mappings/converters to OWL/RDF(S). But don't use that as an opportunity to knock RDF. |
@namedgraph I think I disagree with you fairly fundamentally about this. I think lack of uptake can be an important indicator that a technology is too hard to use. It certainly is not an absolute determinant though. If you look at market shares, RDF databases are getting clobbered by property graph databases. You can claim that RDF does more than what Property Graphs can do -- and I agree -- but it isn't a huge difference, and apparently it isn't a difference that matters to many common use cases. I want to improve RDF, not knock it. And that means being honest about its strengths and weaknesses. IMO its biggest weakness is its difficulty of use. If we can make it as easy to use as Property Graphs -- at least for use cases that do not need functionality beyond Property Graphs -- then I think that would be very beneficial for RDF. But as I said before "my strong hope is that we adopt a mechanism for n-ary relations that subsumes property graphs as a special case, so that we do not need a separate mechanism". |
@dbooth-boston we've been over this... I'd like you to try the C++ analogy though. StackOverflow is full of questions "why is C++ so hard?" and yet some of the most critical software is written in it. How is this different from RDF? |
This is a bit off topic, but I'll indulge your C++ analogy and try to answer. I think you are suggesting that, even though RDF is hard, it is still the right tool for the job sometimes, just as C++ is still the right tool for the job sometimes, even though it is hard. I definitely agree that RDF is sometimes the right tool for the job. (I would not have been involved with RDF for so many years if I didn't!) But here is where I think the analogy breaks down. When C++ is chosen, almost invariably the overriding reason is for performance. I don't believe anybody would choose C++ over Python (for example), if performance were not a key consideration. And the reason C++ is hard is because it is both a low-level C-compatible programming language and a high-level object-oriented programming language. When performance is critical, there is no getting around the need for a low-level language like C. One could of course use C instead of C++, but the higher-level features of C++ allow for more programmer productivity while still giving access to the low-level features of C. In other words, programmers put up with C++'s difficulty because they NEED the low-level features that it provides. In contrast, I do not believe that RDF is chosen because developers really NEED the low-level features that it provides. I believe we can produce a higher-level successor to RDF, that retains the power that we need, while making it easier to use. As a case in point, I do not believe that we really NEED explicit blank nodes in RDF, i.e., blank nodes like _:b42 that cannot be represented by square brackets [] in Turtle. We could solve the same use cases if RDF did not have them, even though we might have to create a few Skolem URIs instead sometimes. Yet that one little feature -- the ability to write an explicit blank node -- places a disproportionate complexity burden on RDF users. 
Not only does that feature cause endless confusion to new RDF users (because blank node labels are not stable identifiers), but it is precisely the reason why, after over 20 years, we still do not have a standard way to canonicalize RDF! In short, the low-level features of C++ are essential to its users, but the low-level features of RDF are not essential. They only continue to exist because we have not yet developed a higher-level, easier-to-use successor. Unless we succeed in making RDF considerably easier to use, I think RDF will eventually get squeezed out of the picture entirely, in favor of other graph approaches that are easier to use, even though those other graph approaches are not quite as powerful. |
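For readers unfamiliar with Skolem URIs, here is a small sketch of replacing explicit blank node labels with freshly minted URIs. It assumes a toy string encoding in which any term starting with `_:` is a blank node (and no literal does), and it uses the `/.well-known/genid/` path that RDF 1.1 suggests for Skolem IRIs; the base `http://example.org` and the `ex:` names are placeholders.

```python
import uuid


def skolemize(triples, base="http://example.org"):
    """Replace blank node labels such as _:b42 with minted Skolem URIs.

    Each distinct label maps to exactly one URI, so the graph structure
    is preserved while the identifiers become stable and shareable.
    """
    mapping = {}

    def term(t):
        # Assumption of this sketch: only blank nodes start with "_:".
        if t.startswith("_:"):
            if t not in mapping:
                mapping[t] = f"{base}/.well-known/genid/{uuid.uuid4().hex}"
            return mapping[t]
        return t

    return [(term(s), p, term(o)) for s, p, o in triples], mapping


triples = [
    ("_:b42", "ex:street", "Main St"),
    ("ex:alice", "ex:address", "_:b42"),
]

skolemized, mapping = skolemize(triples)
```

After this step both triples refer to the same minted URI, and that URI can be used in later documents, which a blank node label cannot.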
But we are mixing up different things now... |
@mhedenus Entity–relationship model - Limitations:
|
@namedgraph : this is another, more philosophical discussion. The inventor of the ER model, Chen, regarded it as the fundamental model of everything (and I agree). We use RDF for integrating data from very different datasources. To do so each datasource must provide its data as RDF. Here is the crucial point, because that is not working well for very different reasons. |
@mhedenus wrote:
Yes. RDF URIs are akin to words in English and other natural languages, in that they are important for semantic interoperability between communicating agents. Internally, agents need to be able to create IDs for chunks generated on the fly. In the human brain, chunks correspond to semantic pointers in noisy high dimensional spaces (the concurrent firing patterns across cortical columns). These are unique to each person. We are able to communicate because we have a shared understanding of concepts and their interrelationships, and are able to map semantic pointers to words. RDF's blank nodes are equivalent to internal identifiers, which are clearly needed, but whose meaning is implicit in the graph structure. |
Another direction is "shapes", which describe the structure of the data. These are descriptions: they can be used for validation or they can be taken as definitions. Both SHACL Compact Syntax and ShEx Compact Syntax provide a modelling view where relationships and attributes are more clearly identified. Another aspect is that there is a role for tools here in providing the view; it is not purely a data format issue. |
ref: #45 (comment) What I wrote about is whether one can do everything that a GraphDB does with RDF: the answer is yes, and RDF can even do more. I did not write how. Let us consider GraphDB and Property Graph as two different things: a GraphDB is software, a Property Graph is a concept. With a GraphDB you cannot query by key-value pairs, such as: give me THE vertex with uid=42. With my implementation of property graphs on top of RDF, you can.
In my system there is no difference at the RDF level between the items of SPO; they can all take the same types of objects, and it is up to the user to choose the schema even at that level.
I do not understand the last point, what is a sub-property? Here is my approach:
It is a misunderstanding of RDF to say "there is a clean ER/class model representation [in my upper layer on top of RDF]". RDF is built out of relations; its basic nature is a network between entities where a link is directed and has a label. There are many ways to add properties to Bob or Aziz, or even to add a relation to "Bob knows Aziz", unlike in a GraphDB, where you cannot relate to an edge, reifying the edge into a vertex and edges, hence introducing the metatype of hyperedge.
In your approach, yes, there may be only one edge between two vertices, unlike in my approach.
See above for an alternative. |
@amirouche thank you for the reply. I haven't completely understood everything you've written but I have a feeling that we are getting closer. First, a clarification of what I meant: Consider an RDF graph. Now you also have a (legacy) application that wants to import the data. Let's assume you have a Java application with a class domain model. Then you must do a model-to-model transformation. This mapping process includes a specific interpretation of the RDF data: some predicates are interpreted to be members of the objects, some predicates are interpreted to be associations between objects (by the way: languages like Java and C++ also do not distinguish between association and membership, but this is another story). I have been working on how to standardize this mapping process so that every (legacy) application can import/export data from/to RDF. This is a very important practical thing. I call RDF "academic" because it seems to me that the RDF/Semantic Web world somehow ignores the reality that RDF must interact with existing applications (and please don't give me the RDFa story!) in a way that the common developer can use. Again a picture. Objects or entity instances are identified by URI nodes. Some predicates therefore become associations/relations and others become properties/members. For properties you can draw an analogy to XML Schema. There are simple properties == properties that are coded as simple strings. I think we all agree that these are just RDF literals. There are also complex properties == structures like lists or maps. They are considered to be sub-graphs "starting" with a blank node. A restriction here is that loops in the complex-property sub-graphs are forbidden and they must be trees. I call literals or other blank nodes attached to blank nodes "sub-properties". But the whole "blank-literal-sub-graph" is considered to be a member of the object. |
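The tree restriction on complex properties can be checked mechanically. The sketch below is one possible reading of it, under stated assumptions: terms are plain strings, blank nodes are the terms starting with `_:`, and only blank node objects belong to the complex property. The function name and the `ex:` vocabulary are hypothetical.

```python
def is_tree_property(triples, root):
    """Check that the blank node sub-graph starting at `root` is a tree.

    Returns False if any blank node is reached more than once, which
    covers both loops and blank nodes shared between two branches.
    """
    children = {}
    for s, p, o in triples:
        children.setdefault(s, []).append(o)

    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            return False  # reached twice: a loop or a shared node
        seen.add(node)
        for o in children.get(node, []):
            if o.startswith("_:"):  # descend into blank nodes only
                stack.append(o)
    return True


# A complex "address" property hanging off ex:alice.
address = [
    ("ex:alice", "ex:address", "_:addr"),
    ("_:addr", "ex:city", "Regensburg"),
    ("_:addr", "ex:geo", "_:geo"),
    ("_:geo", "ex:lat", "49.01"),
]
# The same data with a back-link, which creates a loop.
looped = address + [("_:geo", "ex:partOf", "_:addr")]

tree_ok = is_tree_property(address, "_:addr")
tree_bad = is_tree_property(looped, "_:addr")
```

Literals and URI objects are simply not followed here, so a URI object inside the sub-graph would have to be classified separately as a relation or a value.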
The scheme you describe here seems to be a higher level of modelling, i.e. is this meant to be a graph "meta-model" ? |
To narrow it down (sorry for bothering you, but it is basically a very simple question), here is another example. Here is a domain model, the (naive) implementation and RDF graph data. Yes, you can say: why not annotate the Java code, similar to JAXB (e.g. using RDFBeans)? Because this implies existing knowledge about the data and it does not answer the question: how do you see in the RDF graph what is supposed to be a property and what a relation? |
You may think of name, firstName and lastName as properties, but you could equally think of them as predicates. It doesn't make any real difference, and there's nothing to stop you classifying predicates as properties or links in an ontology. |
Yes.
Thanks for the feedback. I understand better the problem with the following:
And the following:
Sort of an Object-Relational-Mapper (ORM) where instead of an SQL database, there is an RDF database. In other words, map RDF concepts to Java concepts. In an ORM such as Hibernate, as of 2010, a Java class will describe a table where columns are described with annotations (IIRC), then a row of the table will be represented as an object instance of that class, with getters and setters to access column values. IIRC, SQL is also built with Java code using method chaining. FWIW, most of my experience is with Python ORMs, and I also built an Object-Graph-Mapper.
I am an outsider to RDF and the W3C. I came to RDF from TinkerPop / Neo4j. Part of the reason I came to RDF is the academic thing, which I prefer to describe as a lot of experience gathered in the same place with a lot of energy, an open process, a system that is well studied along various aspects, with several independent industrial implementations. RDF can probably be perfected. Also, be warned that my system does aim to be 100% compliant with RDF! I cherry picked ideas (e.g. my system supports SPARQL queries and TinkerPop's Gremlin queries, and they can be mixed and matched)
FWIW, I do not think I match that description (e.g. I prefer to avoid ORMs), so take what I write with a grain of salt. Is your goal to standardize read and write access to an RDF database, as ORMs do with SQL databases, in other words, to build a framework to interoperate between RDF databases and Java, re-using Java concepts? If that is the case, I am not sure how it relates to this issue. Also quoting the other issue:
Who are the users? As far as I am concerned, catching up on 80% of what I know about RDF can be summarized with SPARQL, and this tutorial: https://docs.data.world/tutorials/sparql/. My recommendation is to create a new issue with a specific question, e.g.: How to map RDF concepts to Java concepts? |
@mhedenus ER models come from the RDBMSs which came along in the 70s or so. The web on the other hand appeared in the 90s, and then RDF was designed for data interchange on the web. That's why it has URIs, the Open world assumption (OWA) etc. So there is an inherent mismatch between those models, and trying to shoehorn one into the other will leave you with the worst of both. To take full advantage of RDF you have to go fully in. Design your software around RDF, not the other way around. Throw out the ORMs and pretty much all of the object-oriented layer. Accept that there are only triples (or quads), and they do not distinguish between properties or relations. |
You can create an ontology using OWL and define classes like Man and Woman. And you can make assertions about OWL-Properties like |
The line of argumentation is very odd and completely unrealistic. Saying that RDF is younger does not make the other things worthless. Also the remark on OWA and CWA is out of scope, this is something completely different. By the way: SHACL showed up because the reality is that you cannot live without validation and CWA. That's why Stardog introduced ICV!
The opposite: the best of both!
I want to live in your world, it seems like paradise! ;D Do you drive a VW or Audi? Then it is very likely that the software in your car's engine has been developed here in Regensburg. Please come and tell these engineers to forget what they learned about UML and that they shall restart with OWL |
@mhedenus you haven't disputed that there's an inherent mismatch between the models. There's an impedance mismatch even between the relational and object-oriented models, that's why the ORMs have all kinds of edge cases. I'm not telling anyone how to work, my concern is pushing the web to its full potential and making it data-driven by using RDF and declarative technologies. I am just sharing my experiences. We have explained them in more detail in our blog. If you don't have time for that, at least take a look at a specification which enables generic REST APIs and makes web applications data-driven, or more specifically ontology-driven: https://atomgraph.github.io/Linked-Data-Templates/ |
@amirouche Developing a Java-graph mapping would be the final result of what I want to discuss here. Before that conceptual questions must be answered. I used this thread because I consider my obviously weird question about relation/properties as the key issue. If you can unify PG and RDF conceptually then a Java-graph mapper can be the realization! |
@namedgraph Looking at the links you provided that all looks great! One thing should not be forgotten: We all want to advocate RDF! |
Amen! Do you know this book series BTW? Software Wasteland and The Data-Centric Revolution. If you want to see our approach working in practice, drop me a line :) martynas [at] atomgraph.com |
@namedgraph Thank you very much for the hints. The books look very interesting. |
I was going to reply something similar. I came to the realization that object mappers such as ORM / ODM / OGM are a pipe dream before diving into Scheme and RDF. It may have some use to describe a schema with a set of Java classes with annotations in cases where there is no other way to do it.
I made that journey when I was younger: I started with Java, UML, and SQL, and I still do my daily chores with an ORM. The physical barrier is a bigger and stronger obstacle than those you mentioned (see also the software crisis). Check out Apache Jena, I do not think there will be a better answer elsewhere. |
Thank you all for having this conversation. One outcome for me was that I have to present my position more precisely. |
I would find it helpful if you could precisely itemize the differences between what you are calling a "property" versus a "relation", so that I can understand the distinction you are trying to make. What is true of a "property" that is not true of a "relation", and vice versa? What can I do with a "property" that I cannot do with a "relation", and vice versa? What characteristics do "properties" have that "relations" do not have, and vice versa? How are "properties" written or depicted, in contrast with "relations", and vice versa? If you could provide a concise list of the differences, it would help. |
I've been following this discussion, and it hasn't been clear to me what you are saying.
Your note seems to help.
Here is my take/echo to you.
You are casting this as a difference in the ability to model systems.
However, in fact, you use both RDF & PG to model things, without showing any difference in expressibility, which is where my difficulty is/was, because you seemed to be saying that there are things in one that you can't model in the other.
What you are describing, to me, is more a difference in discrimination.
PGs have two ways of modelling the situations you discuss, whereas RDF has a single way (in your characterisation)
And it may be that having that difference is useful, although as you seem to say they are interchangeable, and the modeller has a choice, any difference must simply be down to the culture of what people do when using PGs.
(Many would argue that having two ways of modelling the same thing is actually a Really Bad Thing.)
You seem to be seeing a very deep difference, whereas it feels like you are describing a pretty shallow difference, that is almost at a representational or even syntactic level.
You conclude that you can go from PG to RDF without any challenges (2 goes to 1), but going from RDF to PG means that you need to make a decision about which of the two choices you make (1 goes to which of 2).
I think I can see why you see it as a modelling problem, but that seems a weird way of casting it to me.
The world you want can effectively be modelled by either system.
It is just that you can't move between them so easily.
Well, in fact I am guessing you could, if you simply decided not to use the property stuff of the PG!
I am pleased to see you start off with "Naming Things Unambiguously" as an issue.
This seems to me probably the biggest challenge about interchanging between RDF & PGs.
Cheers
On 7 May 2021, at 10:26, Michael Hedenus wrote:
Thank you all for having this conversation. One outcome for me was that I have to present my position more precisely.
I have written an essay that I want to bring to your attention: https://github.com/mhedenus/on_graphs_and_models
Any comment is appreciated.
Hugh Glaser
|
@mhedenus you're still thinking in ER terms and as long as you do that you will be seeing some mismatch in RDF. But in practical terms, what prevents you from defining
after which your example becomes
and you have distinction between "properties" and "relations" and it still makes sense semantically (unless I messed up the subclassing). |
Thank you all for reading my note and making these excellent remarks! @HughGlaser You summarised my thoughts very nicely. Maybe the term "expressiveness" is misleading. I did not mean that either graph style is more powerful. As you said, they both can model the world but they do it differently. I will try to respond to your objections. They all come down to the questions: is the property/relation question a real deep issue or is it superfluous pettifoggery over a shallow difference? If there are differences, can they be listed (are there sufficient arguments for being a property or a relation)? The answer may be a bit surprising. I used the term "graph style" for a reason. There is the concept of Thought Style. It is an important concept in the field of history of sciences; it is the basis of the concept Paradigm. To simplify it: you have a style of thinking that is shaped by your context. I make a hypothesis: there are two thought styles here, the context of RDF/Linked Data and the context of Property Graph/Applied Mathematics. A member of the first group says: "What are you talking about? There is no difference!" A member of the other group might say: "Why don't you see it?" I do not want to convince you to adopt anything, but I do want you to accept that the property/relation distinction is made by others. You cannot deny that ER or (UML) class models exist and that they distinguish between membership and association. So, I rephrase the question of @HughGlaser and @dbooth-boston: IF you accept the difference between relation/property THEN how do you distinguish them? Well, this is a very good question and the discussion can be extremely deep if not confusing (for example: intrinsic versus extrinsic properties). This is not the right place for such a discussion. Putting all philosophical questions aside, I think that in the end it is a choice that is made by the people who create the model. (Whether or not it is a bad thing to have this choice is again another question!) 
I would list following general rules:
@namedgraph : I am sorry, but RDF is not what you are showing. RDF is a labeled directed graph plus three nodes types: IRI, Blank and Literal. There is a restriction that limits Literals to be leave nodes. This additional feature changes the graph completly. If you define a |
Can you use named graphs as "sub-graphs"? And have you looked at other RDF -> PG mapping approaches? For example: |
Named graphs as property value? I didn't think about it, but this is a very interesting idea. Why not? The link you provided is a good example of what I mean: you must specify a mapping. If you are on the PG side as the active consumer, the problem is only a technical one, because you know how to map. But there is no general solution without any additional information. |
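To make the point about needing extra information concrete, here is a minimal sketch of one possible RDF to PG mapping. It assumes the consumer supplies the set of predicates that should become node properties; everything else becomes a relation. The function name `rdf_to_pg`, the `ex:` names, and the use of plain strings in place of real IRIs and literals are illustrative assumptions, not part of any existing mapping proposal.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for IRIs and literals.
Triple = Tuple[str, str, str]


def rdf_to_pg(triples: Iterable[Triple], property_predicates: Set[str]):
    """Split triples into node properties and relations.

    A triple whose predicate is listed in property_predicates becomes a
    property on the subject node; every other triple becomes a relation.
    The set of property predicates is the extra information the mapping needs.
    """
    nodes: Dict[str, Dict[str, str]] = {}
    relations: List[Triple] = []
    for s, p, o in triples:
        nodes.setdefault(s, {})
        if p in property_predicates:
            # Simplification: a repeated property overwrites the earlier value.
            nodes[s][p] = o
        else:
            nodes.setdefault(o, {})
            relations.append((s, p, o))
    return nodes, relations


triples = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:name", "Bob"),
]
nodes, relations = rdf_to_pg(triples, property_predicates={"ex:name"})
```

With a different `property_predicates` set the same triples yield a different property graph, which is why the mapping cannot be derived from the RDF alone.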
I say the best way is to create a new vocabulary. Here is a sketch. I use prefixes to make clear what I mean:
Example:
Now imagine a PG visualization application that parses this RDF. There should not be any problem. I would be happy if the W3C would adopt this initiative and develop a Property Graph Ontology. |
I say an even better way is to reuse one that already exists :-) +1 to what you wrote about 'Thought styles'. Each paradigm has some features "baked-in" (property-relationship distinction for PG, unique identifiers for RDF...) because they were considered essential by the community in which they appeared. Other features can always be added through extra layers (a conventional 'iri' property on each node for PG, a meta-ontology in RDF such as the one you proposed above). In the end, as @HughGlaser points out, the expressiveness is roughly the same, but the trade-offs differ. NB: another place where the property/relationship distinction can be made is in the visualization layer. Great example at https://vitalis-wiens.github.io/donatello-pipelines/ |
That paper looks very interesting! It seems their focus is on transforming PG to RDF. For me the focus would be on the other direction: RDF to PG! |
This deserves an issue to itself given the growing popularity of property graph databases and the opportunity for using RDF as an interchange framework between different databases. See also #20 Standardized n-ary relations (and property graphs) and #22 Language-tagged strings.
Property Graphs are a kind of graph consisting of nodes and links between them, where nodes and links may be associated with a set of property-value pairs, and where the values may themselves be sets of property-value pairs and so forth recursively. The link predicate or label can itself be treated as a kind of property.
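As a rough illustration of this definition, the following Python sketch models nodes and links that each carry a set of property-value pairs, with values allowed to nest. The class names, field names, and sample data are assumptions made for the example and do not come from any particular property graph database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

# A property value is either a scalar or a nested set of property-value pairs.
Value = Union[str, int, float, bool, Dict[str, "Value"]]


@dataclass
class Node:
    id: str
    properties: Dict[str, Value] = field(default_factory=dict)


@dataclass
class Link:
    source: str  # id of the node the link starts at
    target: str  # id of the node the link ends at
    properties: Dict[str, Value] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_node(self, node_id: str, **properties: Value) -> Node:
        self.nodes[node_id] = Node(node_id, dict(properties))
        return self.nodes[node_id]

    def add_link(self, source: str, target: str, **properties: Value) -> Link:
        link = Link(source, target, dict(properties))
        self.links.append(link)
        return link


graph = PropertyGraph()
# Property values can nest: "address" holds its own property-value pairs.
graph.add_node("alice", name="Alice", address={"city": "Paris", "geo": {"lat": 48.86}})
graph.add_node("bob", name="Bob")
# The link label is stored as an ordinary property named "label".
knows = graph.add_link("alice", "bob", label="knows", since=2015)
```

Storing the label in the same dictionary as `since` reflects the remark that the link predicate can be treated as just another property.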
It is possible to represent property graphs with reification, but that adds considerable complexity. We can easily annotate a node using a link to another node. However, we also need a way to link from a link or to a link. One approach is for each link to expose an identifier enabling the link to be treated as equivalent to an RDF blank node. Such identifiers are okay for links within the same graph and can be implicit in serialisations like Turtle* where a pair of curly braces implies a new identifier.
What if you want to make a link something that can be referenced stably from other graphs? That suggests the need for a means to associate the link with a named anchor that is unique within the graph. What if the link itself starts in one graph and ends in another - where would you situate the anchor for that link? The answer would seem to be the graph that the link was defined in.
Another challenge concerns the case where a node stands for another graph, e.g. the node has a URI that can be dereferenced to obtain the graph the node stands for. This allows you to make statements about a graph as a whole rather than one of its nodes or links. It would be desirable to quickly determine that a node indeed stands for a graph so as to avoid having to find this out by trying to dereference the node.
Yet another challenge is where you want to distinguish properties from other kinds of links. This would allow for visualisations where you can hide and reveal properties with a tabular presentation of property-value sets. See #37 Lack of RDF Visualisation Software.
It would be desirable to have short names for links so that paths through a graph can be expressed simply via a dotted path string, analogous to properties in object-oriented programming languages. Such short names could be scoped to the node that acts as the subject for a link, or the root for an n-ary chunk.
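A minimal sketch of how such dotted paths might be resolved, assuming each node keeps its own table of short link names. The `LINKS` table, the `follow` function, and the node ids are hypothetical and only serve to illustrate the idea of names scoped to the subject node.

```python
from typing import Dict, Optional

# Hypothetical toy data: each node maps a short link name to the id of the target node.
# Names are scoped to the subject node, so "home" means different things on different nodes.
LINKS: Dict[str, Dict[str, str]] = {
    "alice": {"employer": "acme", "home": "paris"},
    "acme": {"home": "berlin"},
    "berlin": {"country": "germany"},
}


def follow(start: str, path: str) -> Optional[str]:
    """Follow a dotted path such as 'employer.home' from the start node.

    Returns the id of the node reached, or None if a short name is not
    defined on the node where it is looked up.
    """
    current: Optional[str] = start
    for name in path.split("."):
        current = LINKS.get(current, {}).get(name)
        if current is None:
            return None
    return current
```

For example, `follow("alice", "employer.home.country")` walks from `alice` to `acme` to `berlin` to `germany`, while `follow("alice", "home")` stops at `paris` because the name `home` is looked up on `alice` rather than on `acme`.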