Explore parent/child self referential support #11432

Closed
martijnvg opened this issue May 30, 2015 · 27 comments
Labels
>enhancement high hanging fruit :Search/Search Search-related issues that do not fall into other categories

Comments

@martijnvg
Member

Now that PR #6511 has been merged, self referential parent/child support is explicitly disabled. Self referential parent/child was never explicitly supported. It wasn't documented and no tests existed for this parent/child use case. A long time ago it used to work, but since 0.90 self referential parent/child has not been working correctly. See #7357.

Right now we need to know what is a parent document and what is a child document. We do this via the _type field. The issue with self referential parent/child is that we use the same type for both parent and child documents and therefore we can't distinguish between them.

I think we can get around this issue by identifying a document by whether it has a _parent field, instead of by the _type field. I think this works okay for single level relationships (a parent type and a child type), but for multi level relationships (a parent type, child type and grandchild type), as requested in #8100, I don't think this works out.
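
To make the idea above concrete, here is a minimal sketch in plain Python. It uses dicts as stand-ins for documents and does not reflect how Elasticsearch stores parent/child internally. The helper `is_child` and the sample documents are hypothetical. It also shows one plausible reason (an assumption, not something stated above) why the multi-level case is harder: a middle document is a child and a parent at the same time, so the presence of `_parent` alone no longer assigns it a single role.

```python
# Hypothetical documents that all share one type ("post").
docs = [
    {"_id": "a", "_type": "post"},                  # top level, no _parent
    {"_id": "b", "_type": "post", "_parent": "a"},  # child of a
    {"_id": "c", "_type": "post", "_parent": "b"},  # child of b (grandchild of a)
]


def is_child(doc):
    """Identify a child by the presence of a _parent field, not by _type."""
    return doc.get("_parent") is not None


# Ids that some other document points to as its parent.
referenced_as_parent = {d["_parent"] for d in docs if is_child(d)}

# Documents that are children AND parents at once (only possible multi-level).
both_roles = [d["_id"] for d in docs if is_child(d) and d["_id"] in referenced_as_parent]
```

With only two levels, `both_roles` stays empty and the `_parent` check is unambiguous.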

@clintongormley
Contributor

We discussed this in FixItFriday. This issue has only been brought up twice in two years, so it is not a widely requested feature. The added complexity of supporting self-referential parent-child seems to outweigh the benefit.

For now, let's leave this model unsupported. We can always revisit this later on if needed.

@NickCraver

For what it's worth - we want to use this at Stack Exchange. In our primary database Questions and Answers are the same type: Post, where the PostTypeId column differentiates.

Having self-referential types lets us store and search Posts the same way we do in the database. It dramatically simplifies the data model as well as allows us to do much more atomic indexing of the original content. For example when an answer changes we want to re-index only the answer, not a large document - okay cool, parent/child in itself solves this case.

However, when executing every search we'll have to search all fields like Post.Body twice, eating much more CPU on the ElasticSearch side. It's more of an efficiency problem for our 99% search use case than anything else here. We're of course very open to any solution that works, but as-is the disabling of self-ref nets us an approximate doubling of CPU usage on our primary user-facing ElasticSearch cluster.

@clintongormley
Contributor

OK - I'll reopen this for future consideration. Today this is difficult to do, hence the reluctance, so I can't promise any quick answer :)

@martijnvg
Member Author

> We're of course very open to any solution that works, but as-is the disabling of self-ref nets us an approximate doubling of CPU usage on our primary user-facing ElasticSearch cluster.

Just out of curiosity: is this based on a 1.x release or did you try out a parent/child snapshot from master? The query execution should be improved significantly in master, as has_child/has_parent is going to perform joins only on documents likely to be a match and isn't going to try to join all children back to parents (or vice versa).

@deinspanjer

I just wanted to toss another potential use case in here.
I'm working with some reddit submission and comment data right now. Like many comment systems, comments can be replies to other comments. While it is open for discussion whether you want the top level item to be of the same type as the comments, even if you are only dealing with the case of replies to other comments, this kind of self referential parent/child relationship is useful. If you model it as a nested object instead, your queries are now all bound to the single top level object, and I'm not sure how many use cases might be restricted by that.

@stephencelis

Any chance we can at the very least get an error when this is attempted? Spent more time than I would have liked trying to set up an arbitrary tree data structure, confused when I kept getting empty results for queries that should have worked.

@clintongormley
Contributor

@stephencelis In 2.0 you now get this:

The [_parent.type] option can't point to the same type

@lbornov2

lbornov2 commented Dec 5, 2015

This is a pretty common use-case in relational databases (self-references - FK's that reference the same table).

Curious as to how ElasticSearch recommends handling that type of mapping..

@chtombre

chtombre commented Jan 8, 2016

+1 for alternative solution... We have huge structural data (folders in folders in folders). Theoretically infinite...
How do you filter out results where a folder in its path has status hidden?

Now we have to do a separate SQL query to our database for this, and it's killing our performance.

@jconlon

jconlon commented Apr 1, 2016

A self referential parent/child would be the solution for a use-case I have for a location service where locations are parents or children of other locations. Note: I would rather use geolocations for this, but in many indoor cases this is not available.

@dularion

dularion commented Jul 5, 2016

+1, a use case for our project would be a tree structure of companies, having sub-companies in an endless structure. We would then search for an attribute recursively throughout children, and output the entire parent structure for the found element, so as to not lose the hierarchy structure even if the hit was nested deep.

@gerson721

+1, our project has a tree structure of products. Think of a finished product which is made of components/ingredients which are products themselves. We want to also recursively search products (finished products) of a certain type which have some sort of a tick of approval from a governing body.

@blfrantz

+1, our project also has object hierarchy as a core concept, with the possibility for self-referential relationships. We designed our model on what we knew worked in SQL (our main DB), and are trying to copy our data into Elastic to leverage its awesome search capabilities. It'd be nice to properly reflect this in Elasticsearch without having to resort to lots of nesting. Mongo seems to be coming around to the fact that relational models do in fact make sense in a lot of cases (despite what they used to claim) and are adding more support for this kind of thing. It'd be nice to see more attention to this in Elasticsearch as well.

@k-oliver

k-oliver commented Aug 4, 2017

+1, our project contains persons (identities) with relations to each other: father, mother, child, ... we want to search for persons with matching related persons. example: all persons with children called 'Oliver'.
we only need one level in these relations, e.g. we don't want to search for children of children. so inlining the relations is an option but makes it harder to update the documents and produces a lot of duplicated data.

@AnneMottram

AnneMottram commented Aug 30, 2017

+1, our project is also based on object hierarchy - I've had to create a parent-id attribute which I can then feed into a second query as a workaround to this
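
A minimal sketch of that two-query workaround, assuming every document carries a plain `parent_id` attribute. The field name, the sample data and the `search` helper are all hypothetical, and `search` is only an in-memory stand-in for a real search request:

```python
# Hypothetical documents of a single type, linked by a plain "parent_id" attribute.
DOCS = [
    {"id": "1", "name": "Root", "parent_id": None},
    {"id": "2", "name": "Oliver", "parent_id": "1"},
    {"id": "3", "name": "Anna", "parent_id": "1"},
    {"id": "4", "name": "Oliver", "parent_id": "3"},
]


def search(docs, predicate):
    """In-memory stand-in for one search request against the index."""
    return [d for d in docs if predicate(d)]


def parents_with_child_named(docs, name):
    # Query 1: find the matching children and collect their parent ids.
    children = search(docs, lambda d: d["name"] == name and d["parent_id"] is not None)
    parent_ids = {c["parent_id"] for c in children}
    # Query 2: feed those ids into a second lookup to fetch the parents.
    return search(docs, lambda d: d["id"] in parent_ids)


result = parents_with_child_named(DOCS, "Oliver")
```

The cost is the extra round trip, and the list of ids passed to the second query grows with the number of matching children.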

@postmodem

+1, our project uses a lot of self-referential relations. Would love if this was supported.

@koundinyagoparaju

+1, We too are facing a similar issue due to self-referential relations. Hoping to see this feature soon!!

@clintongormley
Contributor

I'll leave this open but, to set expectations: it is highly unlikely that we will be able to support self-referential types, as it would require coming up with a whole new way of storing parent-child relationships.

@valpackett

+1, working on an email index, want to show mail threads. So just traverse a tree and collect everything.

I don't need special relationship storage, my trees aren't that large, amazing performance is not required, but it would be nice to avoid all the extra roundtrips caused by doing recursion on the client.
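
For illustration, client-side recursion of this kind could look roughly like the sketch below. It assumes each mail stores the id of the mail it replies to in a hypothetical `parent_id` field. `fetch_children` is an in-memory stand-in for one search request, so every call to it represents one round trip:

```python
# Hypothetical mail documents; "parent_id" points at the mail being replied to.
MAILS = [
    {"id": "m1", "parent_id": None},
    {"id": "m2", "parent_id": "m1"},
    {"id": "m3", "parent_id": "m1"},
    {"id": "m4", "parent_id": "m2"},
]


def fetch_children(parent_ids):
    """Stand-in for one search request: all mails replying to any of parent_ids."""
    wanted = set(parent_ids)
    return [m for m in MAILS if m["parent_id"] in wanted]


def collect_thread(root_id):
    """Walk the reply tree level by level, one request per level."""
    thread, frontier, round_trips = [], [root_id], 0
    while frontier:
        children = fetch_children(frontier)
        round_trips += 1
        thread.extend(children)
        frontier = [c["id"] for c in children]
    return thread, round_trips


thread, round_trips = collect_thread("m1")
```

Batching a whole level into one request keeps the number of round trips at the depth of the tree plus one (the final request that comes back empty), rather than one request per mail.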

@sqlboy

sqlboy commented Feb 12, 2018

Could really use this

@clintongormley added the :Search/Search label (Search-related issues that do not fall into other categories) and removed the :Parent/Child label Feb 14, 2018
@maricn

maricn commented Mar 3, 2018

+1 would be good fit for modeling a graph in a project we're working on..

@javanna
Member

javanna commented Mar 16, 2018

@elastic/es-search-aggs

@mangoer-ys

+1, our project has a similar requirement. Folders and files are the same type in our model, which is based on a file system, and we are building a search system on top of it. I hope to see this feature as soon as possible.

@jimczi
Contributor

jimczi commented Apr 4, 2019

We don't have plans to work on this feature in the near future, and considering that this issue has been open for 4 years without progress, I am going to close it. As Clinton said, we'd need a completely new way to store parent/child relations, and some of the examples outlined here can be solved with nested fields or field collapsing. We can always revisit this issue if we feel that there is a compelling reason to add support for this.

@jimczi closed this as completed Apr 4, 2019
@MatheusR42

+1 here

@ducanh2110

I can create a mapping with a self-referential relation but cannot index into it. Any suggestions? Thanks.

@liorluker

@jimczi @martijnvg hi, I saw the issue was closed 2.5 years ago. Is supporting it planned on the ES team's roadmap?
