-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent deep recursion in doc values #60318
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,22 +24,88 @@ | |
import org.elasticsearch.index.mapper.MappedFieldType; | ||
import org.elasticsearch.index.mapper.MapperService; | ||
|
||
import java.util.function.Function; | ||
import java.util.ArrayList; | ||
import java.util.List; | ||
import java.util.function.BiFunction; | ||
import java.util.function.Supplier; | ||
|
||
import static java.util.Collections.emptyList; | ||
import static java.util.Collections.unmodifiableList; | ||
|
||
public class SearchLookup { | ||
/** | ||
* The maximum "depth" of field dependencies. When a field's doc values | ||
* depends on another field's doc values and depends on another field's | ||
* doc values that depends on another field's doc values, etc, etc, | ||
* it can make a very deep stack. At some point we're concerned about | ||
* {@link StackOverflowError}. This limits that "depth". | ||
*/ | ||
static final int MAX_FIELD_CHAIN = 30; | ||
|
||
final DocLookup docMap; | ||
/** | ||
* The chain of fields for which this lookup was created used for detecting | ||
* loops in doc values calculations. It is empty for the "top level" lookup | ||
* created for the entire search. When a lookup is created for a field we | ||
* make sure its name isn't in the list then add it to the end. So the | ||
* lookup for the a field named {@code foo} will be {@code ["foo"]}. If | ||
* that field looks up the values of a field named {@code bar} then | ||
* {@code bar}'s lookup will contain {@code ["foo", "bar"]}. | ||
*/ | ||
private final List<String> fieldChain; | ||
private final MapperService mapperService; | ||
private final BiFunction<MappedFieldType, Supplier<SearchLookup>, IndexFieldData<?>> indexFieldDataService; | ||
private final DocLookup docMap; | ||
private final SourceLookup sourceLookup; | ||
private final FieldsLookup fieldsLookup; | ||
|
||
final SourceLookup sourceLookup; | ||
/** | ||
* Create the "top level" field lookup for a search request. | ||
*/ | ||
public SearchLookup( | ||
MapperService mapperService, | ||
BiFunction<MappedFieldType, Supplier<SearchLookup>, IndexFieldData<?>> indexFieldDataService | ||
) { | ||
this(emptyList(), mapperService, indexFieldDataService, new SourceLookup(), new FieldsLookup(mapperService)); | ||
} | ||
|
||
final FieldsLookup fieldsLookup; | ||
private SearchLookup( | ||
List<String> fieldChain, | ||
MapperService mapperService, | ||
BiFunction<MappedFieldType, Supplier<SearchLookup>, IndexFieldData<?>> indexFieldDataService, | ||
SourceLookup sourceLookup, | ||
FieldsLookup fieldsLookup | ||
) { | ||
this.fieldChain = fieldChain; | ||
this.mapperService = mapperService;; | ||
this.indexFieldDataService = indexFieldDataService; | ||
this.docMap = new DocLookup(mapperService, ft -> this.indexFieldDataService.apply(ft, () -> forField(ft.name()))); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that the check is performed on DocLookup, would it make sense to move the tracking to DocLookup? |
||
this.sourceLookup = sourceLookup; | ||
this.fieldsLookup = fieldsLookup; | ||
} | ||
|
||
public SearchLookup(MapperService mapperService, Function<MappedFieldType, IndexFieldData<?>> fieldDataLookup) { | ||
docMap = new DocLookup(mapperService, fieldDataLookup); | ||
sourceLookup = new SourceLookup(); | ||
fieldsLookup = new FieldsLookup(mapperService); | ||
/** | ||
* Build a new lookup for a field that needs the lookup to create doc values. | ||
* | ||
* @throws IllegalArgumentException if it detects a loop in the fields required to build doc values | ||
*/ | ||
public SearchLookup forField(String name) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This goes along my other comments: I wonder if it is necessary to create a new SearchLookup for every field. Maybe we could add references tracking at a lower level so that callers don't need to be adapted, especially given that it's applied only to DocLookup? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll have a look at pushing the tracking into doc lookup. I'm not sure there is a way to avoid adapting the callers though. I'll check! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I looked at pushing it down and I don't really like how it turned out, mostly because I need a reference to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can I see it? curious how it turns out. |
||
List<String> newFieldChain = new ArrayList<>(fieldChain.size() + 1); | ||
newFieldChain.addAll(fieldChain); | ||
newFieldChain.add(name); | ||
if (fieldChain.contains(name)) { | ||
throw new IllegalArgumentException("loop in field definitions: " + newFieldChain); | ||
} | ||
if (newFieldChain.size() > MAX_FIELD_CHAIN) { | ||
throw new IllegalArgumentException("field definition too deep: " + newFieldChain); | ||
} | ||
return new SearchLookup(unmodifiableList(newFieldChain), mapperService, indexFieldDataService, sourceLookup, fieldsLookup); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we could avoide rebuilding DocLookup every time? |
||
} | ||
|
||
/** | ||
* Builds a new {@linkplain LeafSearchLookup} for the caller to handle | ||
* a {@linkplain LeafReaderContext}. This lookup shares the {@code _source} | ||
* lookup but doesn't share doc values or stored fields. | ||
*/ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I don't understand what you meant here? |
||
public LeafSearchLookup getLeafSearchLookup(LeafReaderContext context) { | ||
return new LeafSearchLookup(context, | ||
docMap.getLeafDocLookup(context), | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a list only for faster lookup right? would it make sense to use a Set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about using a set but I figured we only check "contains" very few times so it isn't worth it. I also wanted the ordered output. I know I could get it with a
LinkedHashSet
or something, but that just seems like overkill. I guess we could check it a bunch of times if you a very big number of sub-fields.....