Skip to content

Commit

Permalink
Avoid reusing source providers for script based block loaders.
Browse files Browse the repository at this point in the history
This change attempts to solve the following issue:

ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthetize source then this can cause the same `SourceProvider` evaluation the same range of docs ids multiple times. This can then result in unexpected io errors at the codec level. This is because the same doc value instances are used by `SourceProvider`.

Re-evaluating the same docids is in violation of the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which documents that unexpected behaviour can occur if target docid is lower than current docid position.

Note that this is only an issue for synthetic source loader and not for stored source loader. And not when executing in row stride fashion which sometimes happen in compute engine and always happen in _search api.

Closes elastic#117644
  • Loading branch information
martijnvg committed Nov 30, 2024
1 parent 64dfed4 commit 619c441
Show file tree
Hide file tree
Showing 3 changed files with 31 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,7 @@
import java.util.function.Function;
import java.util.function.LongSupplier;
import java.util.function.Predicate;
import java.util.function.Supplier;

import static org.elasticsearch.index.IndexService.parseRuntimeMappings;

Expand Down Expand Up @@ -493,14 +494,18 @@ public boolean containsBrokenAnalysis(String field) {
*/
public SearchLookup lookup() {
if (this.lookup == null) {
SourceProvider sourceProvider = isSourceSynthetic()
? SourceProvider.fromSyntheticSource(mappingLookup.getMapping(), mapperMetrics.sourceFieldMetrics())
: SourceProvider.fromStoredFields();
var sourceProvider = createSourceProvider();
setLookupProviders(sourceProvider, LeafFieldLookupProvider.fromStoredFields());
}
return this.lookup;
}

public SourceProvider createSourceProvider() {
return isSourceSynthetic()
? SourceProvider.fromSyntheticSource(mappingLookup.getMapping(), mapperMetrics.sourceFieldMetrics())
: SourceProvider.fromStoredFields();
}

/**
* Replace the standard source provider and field lookup provider on the SearchLookup
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,14 @@ private SearchLookup(SearchLookup searchLookup, Set<String> fieldChain) {
this.fieldLookupProvider = searchLookup.fieldLookupProvider;
}

private SearchLookup(SearchLookup searchLookup, SourceProvider sourceProvider, Set<String> fieldChain) {
this.fieldChain = Collections.unmodifiableSet(fieldChain);
this.sourceProvider = sourceProvider;
this.fieldTypeLookup = searchLookup.fieldTypeLookup;
this.fieldDataLookup = searchLookup.fieldDataLookup;
this.fieldLookupProvider = searchLookup.fieldLookupProvider;
}

/**
* Creates a copy of the current {@link SearchLookup} that looks fields up in the same way, but also tracks field references
* in order to detect cycles and prevent resolving fields that depend on more than {@link #MAX_FIELD_CHAIN_DEPTH} other fields.
Expand Down Expand Up @@ -144,4 +152,8 @@ public IndexFieldData<?> getForField(MappedFieldType fieldType, MappedFieldType.
public Source getSource(LeafReaderContext ctx, int doc) throws IOException {
return sourceProvider.getSource(ctx, doc);
}

public SearchLookup swapSourceProvider(SourceProvider sourceProvider) {
return new SearchLookup(this, sourceProvider, fieldChain);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@
import org.elasticsearch.index.mapper.FieldNamesFieldMapper;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.mapper.NestedLookup;
import org.elasticsearch.index.mapper.SourceFieldMapper;
import org.elasticsearch.index.mapper.SourceLoader;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
Expand Down Expand Up @@ -340,7 +341,16 @@ public MappedFieldType.FieldExtractPreference fieldExtractPreference() {

@Override
public SearchLookup lookup() {
return ctx.lookup();
boolean syntheticSource = SourceFieldMapper.isSynthetic(indexSettings());
var searchLookup = ctx.lookup();
if (syntheticSource) {
// in the context of scripts and when synthetic source is used the search lookup can't always be reused between
// users of SearchLookup. This is only an issue when scripts fallback to _source, but since we can't always
// accurately determine whether a script uses _source, we should do this for script usage. This lookup() method
// is only invoked for scripts / runtime fields, so it is ok to do here.
searchLookup = searchLookup.swapSourceProvider(ctx.createSourceProvider());
}
return searchLookup;
}

@Override
Expand Down

0 comments on commit 619c441

Please sign in to comment.