Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust ExpressionAggregationScript to support inter-segment concurrency #99667

Merged
merged 3 commits into from
Sep 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -502,7 +502,6 @@ public void testSpecialValueVariable() throws Exception {
assertThat(stats.getAvg(), equalTo(3.0));
}

@AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/99156")
public void testStringSpecialValueVariable() throws Exception {
// i.e. expression script for term aggregations, which is not allowed
assertAcked(indicesAdmin().prepareCreate("test").setMapping("text", "type=keyword").get());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ public void setNextAggregationValue(Object value) {
// _value isn't used in script if specialValue == null
if (specialValue != null) {
if (value instanceof Number) {
specialValue.setValue(((Number) value).doubleValue());
specialValue.setValue(leaf, ((Number) value).doubleValue());
} else {
throw new GeneralScriptException("Cannot use expression with text variable using " + exprScript);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,20 +15,21 @@
import org.apache.lucene.search.IndexSearcher;

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
* A {@link DoubleValuesSource} which has a stub {@link DoubleValues} that holds a dynamically replaceable constant double.
*/
final class ReplaceableConstDoubleValueSource extends DoubleValuesSource {
final ReplaceableConstDoubleValues fv;

ReplaceableConstDoubleValueSource() {
fv = new ReplaceableConstDoubleValues();
}
private final Map<LeafReaderContext, ReplaceableConstDoubleValues> specialValues = new ConcurrentHashMap<>();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely happy with this solution, but given that we need a global data structure exposed in the bindings, which will retrieve values per segment, I think we don't have many other options than tracking this special value per segment. This needs to be a concurrent hash map because getValues and setValue are called concurrently from different threads (one per slice)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also uneasy, and I think the longer-term solution here is that we need to build the aggregations per-slice rather than globally. But that's a much bigger change, so I think I'm happy with this as a quick fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this will significantly hurt performance on expression score scripts which also use this class.

Copy link
Member

@rjernst rjernst Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will, because it would only try to access the map if using the value source, which will only occur if _value appears in the score script, which is (I think) bogus.

Copy link
Contributor

@jdconrad jdconrad Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! What about aggregation scripts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_value is specifically part of aggregation scripts. But I do not believe explain would ever be called on an it in that context.


@Override
public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws IOException {
return fv;
ReplaceableConstDoubleValues replaceableConstDoubleValues = new ReplaceableConstDoubleValues();
specialValues.put(ctx, replaceableConstDoubleValues);
return replaceableConstDoubleValues;
}

@Override
Expand All @@ -38,8 +39,12 @@ public boolean needsScores() {

@Override
public Explanation explain(LeafReaderContext ctx, int docId, Explanation scoreExplanation) throws IOException {
if (fv.advanceExact(docId)) return Explanation.match((float) fv.doubleValue(), "ReplaceableConstDoubleValues");
else return Explanation.noMatch("ReplaceableConstDoubleValues");
// TODO where is this explain called? I bet it's never tested, and probably never called.
ReplaceableConstDoubleValues fv = specialValues.get(ctx);
if (fv.advanceExact(docId)) {
return Explanation.match((float) fv.doubleValue(), "ReplaceableConstDoubleValues");
}
return Explanation.noMatch("ReplaceableConstDoubleValues");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do aggregation scripts get explained? I am thinking this may never get called. If anyone knows how this gets called, that would help me figure out how to test this, otherwise I am thinking of throwing UnsupportedOperationException in this method, just to surface that it is never called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class is also used for score scripts, so it does need to implement explain I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you show me where? I can't find that usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I am perplexed, because the score usage does not pass the replaceable around, which makes it not replaceable as setValue will never be called. Effectively, will the value not always be 0? Maybe @jdconrad can help us here :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried replacing the replaceable usage in scoring with a constant and I got no failures: does it mean that it's not needed or that we have no test coverage for it? Also, if it is a constant (0) do we even need to add it to the bindings. All in all, I am thinking that we could clean things up and get rid of the explain here, as I don't find a way to test my changes to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it should never be called for score scripts, since it is specifically for _value, which is not a thing there. However, I vaguely recall some issue when migrating to score script context, which caused me to add _value to score scripts.

Could you try throwing UnsupportedOperationException and see if anything fails?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this will never be called by anything. From what I can tell explain will only work on a script score query using Painless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot for looking into this @jdconrad & @rjernst . I will try to clean this up further as a follow-up, possibly removing the explain impl if never called.

}

@Override
Expand All @@ -52,7 +57,9 @@ public int hashCode() {
return System.identityHashCode(this);
}

public void setValue(double v) {
public void setValue(LeafReaderContext ctx, double v) {
ReplaceableConstDoubleValues fv = specialValues.get(ctx);
assert fv != null : "getValues must be called before setValue for any given leaf reader context";
fv.setValue(v);
}

Expand Down