Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent deep recursion in doc values #60318

Closed

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Jul 28, 2020

This prevents loops and deep recursion in doc values. This is important
because the doc values for runtime fields can depend on one another. We
expect that mostly we can catch deep recursion and loops when we prepare
the doc values, but we're not going to rely on that. This adds the test
when looking up the doc values. It adds it to SearchLookup which means
that it is automatically provided to all plugins that implement runtime
fields. This is important because we mean for it to be a "last resort"
to prevent stack overflows.

This prevents loops and deep recursion in doc values. This is important
because the doc values for runtime fields can depend on one another. We
expect that mostly we can catch deep recursion and loops when we prepare
the doc values, but we're not going to rely on that. This adds the test
when looking up the doc values. It adds it to `SearchLookup` which means
that it is automatically provided to all plugins that implement runtime
fields. This is important because we mean for it to be a "last resort"
to prevent stack overflows.
@nik9000 nik9000 added the :Search/Search Search-related issues that do not fall into other categories label Jul 28, 2020
@nik9000 nik9000 requested a review from javanna July 28, 2020 16:46
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 28, 2020
@javanna javanna mentioned this pull request Jul 28, 2020
30 tasks
@nik9000
Copy link
Member Author

nik9000 commented Jul 28, 2020

run elasticsearch-ci/1

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, thanks for working on this!

this.fieldChain = fieldChain;
this.mapperService = mapperService;;
this.indexFieldDataService = indexFieldDataService;
this.docMap = new DocLookup(mapperService, ft -> this.indexFieldDataService.apply(ft, () -> forField(ft.name())));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the check is performed on DocLookup, would it make sense to move the tracking to DocLookup?

if (newFieldChain.size() > MAX_FIELD_CHAIN) {
throw new IllegalArgumentException("field definition too deep: " + newFieldChain);
}
return new SearchLookup(unmodifiableList(newFieldChain), mapperService, indexFieldDataService, sourceLookup, fieldsLookup);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could avoide rebuilding DocLookup every time?

* that field looks up the values of a field named {@code bar} then
* {@code bar}'s lookup will contain {@code ["foo", "bar"]}.
*/
private final List<String> fieldChain;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a list only for faster lookup right? would it make sense to use a Set?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about using a set but I figured we only check "contains" very few times so it isn't worth it. I also wanted the ordered output. I know I could get it with a LinkedHashSet or something, but that just seems like overkill. I guess we could check it a bunch of times if you a very big number of sub-fields.....

*
* @throws IllegalArgumentException if it detects a loop in the fields required to build doc values
*/
public SearchLookup forField(String name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes along my other comments: I wonder if it is necessary to create a new SearchLookup for every field. Maybe we could add references tracking at a lower level so that callers don't need to be adapted, especially given that it's applied only to DocLookup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a look at pushing the tracking into doc lookup. I'm not sure there is a way to avoid adapting the callers though. I'll check!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at pushing it down and I don't really like how it turned out, mostly because I need a reference to the SearchLookup in the DocLookup then which feels pretty circular.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I see it? curious how it turns out.

* Builds a new {@linkplain LeafSearchLookup} for the caller to handle
* a {@linkplain LeafReaderContext}. This lookup shares the {@code _source}
* lookup but doesn't share doc values or stored fields.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I don't understand what you meant here?

/**
* Test a loop that only happens on <strong>some</strong> documents.
*/
public void testDocValuesSometimesLoop() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this test really needed? could you think of reasons why you may have cycles only for some docs that could not be detected?

@javanna
Copy link
Member

javanna commented Aug 26, 2020

This is now part of #61430.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants