Percentile/Ranks should return null instead of NaN when empty #30460

Merged
merged 10 commits into from
Jun 18, 2018
6 changes: 6 additions & 0 deletions docs/reference/release-notes/7.0.0-alpha1.asciidoc
@@ -16,3 +16,9 @@ Cross-Cluster-Search::
 
 Rest API::
 * The Clear Cache API only supports `POST` as HTTP method
+
+Aggregations::
+* The Percentiles and PercentileRanks aggregations now return `null` in the REST response,
+  instead of `NaN`. This makes it consistent with the rest of the aggregations. Note:
+  this only applies to the REST response; the Java objects continue to return `NaN` (also
+  consistent with other aggregations)
@@ -79,7 +79,12 @@ protected MultiValue(StreamInput in) throws IOException {
         public abstract double value(String name);
 
         public String valueAsString(String name) {
-            return format.format(value(name)).toString();
+            // Explicitly check for NaN, since it formats to "�" or "NaN" depending on JDK version
Member

Is this only the case for certain locales? I would be surprised if some JDKs returned a weird UTF-8 character in all cases. To make this comment more readable, it would probably also make sense to give the bad UTF-8 value as an octal or hex code point and to clarify under which circumstances this happens.

Contributor Author

I'm not actually sure how this behaves across locales, but I don't think it matters for us. We seem to always initialize the Decimal DocValueFormat with Locale.ROOT, which I believe uses the JRE's default symbol table.

So for JDK 8, the root locale will use JRELocaleProviderAdapter to get the symbols, which loads sun.text.resources.FormatData, and you can see the NaN symbol is \uFFFD.

For JDK 9+, the root locale will use CLDRLocaleProviderAdapter, which loads sun.text.resources.cldr.FormatData. And in that resource file you can see the NaN symbol is "NaN" (Can't find a link to the code, but you can see it in your IDE).

++ to making the comment more descriptive. I'll try to distill this thread into a sane comment, and probably leave a reference to the comments here in case anyone wants to see more info.
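
The behavior described in this thread can be checked locally. The following is a minimal sketch and not code from this PR: `NaNSymbolCheck` and its method names are made up for illustration, and the pattern `#.##` is an arbitrary choice. It reports whatever NaN symbol the running JDK supplies for the root locale, so the result depends on the JDK version and its locale provider settings.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of Elasticsearch.
class NaNSymbolCheck {

    // The NaN symbol taken from the root locale's symbol table.
    static String rootNaNSymbol() {
        return DecimalFormatSymbols.getInstance(Locale.ROOT).getNaN();
    }

    // What a DecimalFormat built on those symbols produces for NaN.
    static String formatNaN() {
        DecimalFormat format = new DecimalFormat("#.##", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(Double.NaN);
    }

    public static void main(String[] args) {
        String symbol = rootNaNSymbol();
        StringBuilder codePoints = new StringBuilder();
        // Print code points rather than the raw symbol, so an unprintable character is still visible.
        symbol.codePoints().forEach(cp -> codePoints.append(String.format("U+%04X ", cp)));
        System.out.println("Root-locale NaN symbol: " + codePoints.toString().trim());
        System.out.println("DecimalFormat output equals symbol: " + formatNaN().equals(symbol));
        System.out.println("String.valueOf(Double.NaN): " + String.valueOf(Double.NaN));
    }
}
```

`String.valueOf(Double.NaN)` does not go through the locale symbol table, which is why the PR uses it as the version-independent fallback.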

Contributor Author

As an aside, I really wonder why Oracle thought � would be a good default representation of "NaN"... :(

Contributor Author

Hm, I don't like how this was implemented, looking at it. Going to move it over to the DocValueFormat itself, so that it only applies to the Decimal formatter when looking at doubles... otherwise it'll be checked against all formatters (geo, IP, etc). Harmless I think, but no need.

+            Double value = value(name);
+            if (value.isNaN()) {
+                return String.valueOf(Double.NaN);
+            }
+            return format.format(value).toString();
         }
 
         @Override
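
The last review comment above proposes moving the NaN check out of `valueAsString` and into the decimal formatter, so that other formatters (geo, IP, etc.) are not affected. A rough sketch of that idea follows. The `ValueFormat` interface and the `DecimalValueFormat` and `RawValueFormat` classes are hypothetical stand-ins invented for this example, not the real `DocValueFormat` API.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical sketch; none of these types exist in Elasticsearch under these names.
class FormatterSketch {

    interface ValueFormat {
        String format(double value);
    }

    // Decimal formatter that owns the NaN check, so callers do not need one.
    static final class DecimalValueFormat implements ValueFormat {
        private final DecimalFormat format;

        DecimalValueFormat(String pattern) {
            this.format = new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.ROOT));
        }

        @Override
        public String format(double value) {
            // Bypass DecimalFormat for NaN, because its NaN symbol differs between JDK versions.
            if (Double.isNaN(value)) {
                return String.valueOf(Double.NaN);
            }
            return format.format(value);
        }
    }

    // A formatter that never used DecimalFormat needs no special case.
    static final class RawValueFormat implements ValueFormat {
        @Override
        public String format(double value) {
            return Double.toString(value);
        }
    }

    public static void main(String[] args) {
        ValueFormat decimal = new DecimalValueFormat("#.##");
        System.out.println(decimal.format(Double.NaN));
        System.out.println(decimal.format(1.234));
        System.out.println(new RawValueFormat().format(2.0));
    }
}
```

Placing the guard in the one formatter that delegates to `DecimalFormat` keeps the check next to the behavior it works around.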
@@ -92,9 +92,9 @@ protected XContentBuilder doXContentBody(XContentBuilder builder, Params params)
             builder.startObject(CommonFields.VALUES.getPreferredName());
             for (Map.Entry<Double, Double> percentile : percentiles.entrySet()) {
                 Double key = percentile.getKey();
-                builder.field(String.valueOf(key), percentile.getValue());
-
-                if (valuesAsString) {
+                Double value = percentile.getValue();
+                builder.field(String.valueOf(key), value.isNaN() ? null : value);
+                if (valuesAsString && value.isNaN() == false) {
                     builder.field(key + "_as_string", getPercentileAsString(key));
                 }
             }
@@ -106,8 +106,9 @@ protected XContentBuilder doXContentBody(XContentBuilder builder, Params params)
                 builder.startObject();
                 {
                     builder.field(CommonFields.KEY.getPreferredName(), key);
-                    builder.field(CommonFields.VALUE.getPreferredName(), percentile.getValue());
-                    if (valuesAsString) {
+                    Double value = percentile.getValue();
+                    builder.field(CommonFields.VALUE.getPreferredName(), value.isNaN() ? null : value);
+                    if (valuesAsString && value.isNaN() == false) {
                         builder.field(CommonFields.VALUE_AS_STRING.getPreferredName(), getPercentileAsString(key));
                     }
                 }
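
To make the effect of the two hunks above concrete, here is a minimal JDK-only sketch of the keyed rendering rule: a NaN value is written as `null` and its `_as_string` companion is skipped. `PercentileJsonSketch` and `renderKeyed` are illustrative names, the hand-rolled string building merely stands in for `XContentBuilder`, and the sketch assumes the string form is simply the value's `toString()`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the real code writes through XContentBuilder.
class PercentileJsonSketch {

    static String renderKeyed(Map<Double, Double> percentiles, boolean valuesAsString) {
        StringBuilder json = new StringBuilder("{\"values\":{");
        boolean first = true;
        for (Map.Entry<Double, Double> percentile : percentiles.entrySet()) {
            Double key = percentile.getKey();
            Double value = percentile.getValue();
            if (first == false) {
                json.append(',');
            }
            first = false;
            // NaN has no JSON representation, so it is emitted as null.
            json.append('"').append(key).append("\":").append(value.isNaN() ? "null" : value.toString());
            // The string form is only emitted when there is a real value to format.
            if (valuesAsString && value.isNaN() == false) {
                json.append(",\"").append(key).append("_as_string\":\"").append(value).append('"');
            }
        }
        return json.append("}}").toString();
    }

    public static void main(String[] args) {
        Map<Double, Double> percentiles = new LinkedHashMap<>();
        percentiles.put(50.0, 12.5);
        percentiles.put(99.0, Double.NaN);
        System.out.println(renderKeyed(percentiles, true));
    }
}
```

With the sample map in `main`, this prints `{"values":{"50.0":12.5,"50.0_as_string":"12.5","99.0":null}}`.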
@@ -123,9 +123,9 @@ public XContentBuilder doXContentBody(XContentBuilder builder, Params params) th
             for(int i = 0; i < keys.length; ++i) {
                 String key = String.valueOf(keys[i]);
                 double value = value(keys[i]);
-                builder.field(key, value);
-                if (format != DocValueFormat.RAW) {
-                    builder.field(key + "_as_string", format.format(value));
+                builder.field(key, state.getTotalCount() == 0 ? null : value);
+                if (format != DocValueFormat.RAW && state.getTotalCount() > 0) {
+                    builder.field(key + "_as_string", format.format(value).toString());
                 }
             }
             builder.endObject();
@@ -135,8 +135,8 @@ public XContentBuilder doXContentBody(XContentBuilder builder, Params params) th
                 double value = value(keys[i]);
                 builder.startObject();
                 builder.field(CommonFields.KEY.getPreferredName(), keys[i]);
-                builder.field(CommonFields.VALUE.getPreferredName(), value);
-                if (format != DocValueFormat.RAW) {
+                builder.field(CommonFields.VALUE.getPreferredName(), state.getTotalCount() == 0 ? null : value);
+                if (format != DocValueFormat.RAW && state.getTotalCount() > 0) {
                     builder.field(CommonFields.VALUE_AS_STRING.getPreferredName(), format.format(value).toString());
                 }
                 builder.endObject();
@@ -106,9 +106,9 @@ public XContentBuilder doXContentBody(XContentBuilder builder, Params params) th
             for(int i = 0; i < keys.length; ++i) {
                 String key = String.valueOf(keys[i]);
                 double value = value(keys[i]);
-                builder.field(key, value);
-                if (format != DocValueFormat.RAW) {
-                    builder.field(key + "_as_string", format.format(value));
+                builder.field(key, state.size() == 0 ? null : value);
+                if (format != DocValueFormat.RAW && state.size() > 0) {
+                    builder.field(key + "_as_string", format.format(value).toString());
                 }
             }
             builder.endObject();
@@ -118,8 +118,8 @@ public XContentBuilder doXContentBody(XContentBuilder builder, Params params) th
                 double value = value(keys[i]);
                 builder.startObject();
                 builder.field(CommonFields.KEY.getPreferredName(), keys[i]);
-                builder.field(CommonFields.VALUE.getPreferredName(), value);
-                if (format != DocValueFormat.RAW) {
+                builder.field(CommonFields.VALUE.getPreferredName(), state.size() == 0 ? null : value);
+                if (format != DocValueFormat.RAW && state.size() > 0) {
                     builder.field(CommonFields.VALUE_AS_STRING.getPreferredName(), format.format(value).toString());
                 }
                 builder.endObject();
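
The two internal implementations above decide on `null` by checking whether the underlying state is empty (`getTotalCount() == 0` or `size() == 0`) rather than by testing the computed value. The sketch below illustrates how that split keeps the Java-facing value as `NaN` while the REST-facing value becomes `null`. `SampleState` is a hypothetical stand-in for the real histogram or digest state, and its nearest-rank percentile is a simplification chosen for brevity, not the algorithm either implementation uses.

```java
import java.util.Arrays;

// Hypothetical sketch; SampleState is not an Elasticsearch class.
class EmptyStateSketch {

    static final class SampleState {
        private final double[] sorted;

        SampleState(double... values) {
            sorted = values.clone();
            Arrays.sort(sorted);
        }

        long size() {
            return sorted.length;
        }

        // Nearest-rank percentile for percent in [0, 100]; NaN when there is no data.
        double percentile(double percent) {
            if (sorted.length == 0) {
                return Double.NaN;
            }
            int rank = (int) Math.ceil(percent / 100.0 * sorted.length);
            return sorted[Math.min(Math.max(rank, 1), sorted.length) - 1];
        }
    }

    // Java-facing value: stays NaN for an empty state.
    static double javaValue(SampleState state, double percent) {
        return state.percentile(percent);
    }

    // REST-facing value: null for an empty state, decided by the element count.
    static Double restValue(SampleState state, double percent) {
        return state.size() == 0 ? null : Double.valueOf(state.percentile(percent));
    }

    public static void main(String[] args) {
        SampleState empty = new SampleState();
        SampleState filled = new SampleState(4, 1, 3, 2);
        System.out.println(javaValue(empty, 50) + " / " + restValue(empty, 50));
        System.out.println(javaValue(filled, 50) + " / " + restValue(filled, 50));
    }
}
```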
@@ -19,6 +19,10 @@
 
 package org.elasticsearch.search.aggregations.metrics.percentiles;
 
+import org.elasticsearch.common.Strings;
+import org.elasticsearch.common.xcontent.ToXContent;
+import org.elasticsearch.common.xcontent.XContentBuilder;
+import org.elasticsearch.common.xcontent.json.JsonXContent;
 import org.elasticsearch.search.DocValueFormat;
 import org.elasticsearch.search.aggregations.Aggregation.CommonFields;
 import org.elasticsearch.search.aggregations.InternalAggregation;
@@ -27,11 +31,14 @@
 
 import java.io.IOException;
 import java.util.Arrays;
+import java.util.Collections;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
 import java.util.function.Predicate;
 
+import static org.hamcrest.Matchers.equalTo;
+
 public abstract class AbstractPercentilesTestCase<T extends InternalAggregation & Iterable<Percentile>>
         extends InternalAggregationTestCase<T> {
 
@@ -49,7 +56,7 @@ public void setUp() throws Exception {
 
     @Override
     protected T createTestInstance(String name, List<PipelineAggregator> pipelineAggregators, Map<String, Object> metaData) {
-        int numValues = randomInt(100);
+        int numValues = frequently() ? randomInt(100) : 0;
         double[] values = new double[numValues];
         for (int i = 0; i < numValues; ++i) {
             values[i] = randomDouble();
@@ -89,4 +96,53 @@ public static double[] randomPercents(boolean sorted) {
     protected Predicate<String> excludePathsFromXContentInsertion() {
         return path -> path.endsWith(CommonFields.VALUES.getPreferredName());
     }
+
+    protected abstract void assertPercentile(T agg, Double value);
+
+    public void testEmptyRanksXContent() throws IOException {
+        double[] percents = new double[]{1,2,3};
+        boolean keyed = randomBoolean();
+        DocValueFormat docValueFormat = randomNumericDocValueFormat();
+
+        T agg = createTestInstance("test", Collections.emptyList(), Collections.emptyMap(), keyed, docValueFormat, percents, new double[0]);
+
+        for (Percentile percentile : agg) {
+            Double value = percentile.getValue();
+            assertPercentile(agg, value);
+        }
+
+        XContentBuilder builder = JsonXContent.contentBuilder().prettyPrint();
+        builder.startObject();
+        agg.doXContentBody(builder, ToXContent.EMPTY_PARAMS);
+        builder.endObject();
+        String expected;
+        if (keyed) {
+            expected = "{\n" +
+                "  \"values\" : {\n" +
+                "    \"1.0\" : null,\n" +
+                "    \"2.0\" : null,\n" +
+                "    \"3.0\" : null\n" +
+                "  }\n" +
+                "}";
+        } else {
+            expected = "{\n" +
+                "  \"values\" : [\n" +
+                "    {\n" +
+                "      \"key\" : 1.0,\n" +
+                "      \"value\" : null\n" +
+                "    },\n" +
+                "    {\n" +
+                "      \"key\" : 2.0,\n" +
+                "      \"value\" : null\n" +
+                "    },\n" +
+                "    {\n" +
+                "      \"key\" : 3.0,\n" +
+                "      \"value\" : null\n" +
+                "    }\n" +
+                "  ]\n" +
+                "}";
+        }
+
+        assertThat(Strings.toString(builder), equalTo(expected));
+    }
 }
@@ -22,6 +22,8 @@
 import org.elasticsearch.search.aggregations.InternalAggregation;
 import org.elasticsearch.search.aggregations.ParsedAggregation;
 
+import static org.hamcrest.Matchers.equalTo;
+
 public abstract class InternalPercentilesRanksTestCase<T extends InternalAggregation & PercentileRanks>
         extends AbstractPercentilesTestCase<T> {
 
@@ -39,4 +41,10 @@ protected final void assertFromXContent(T aggregation, ParsedAggregation parsedA
         Class<? extends ParsedPercentiles> parsedClass = implementationClass();
         assertTrue(parsedClass != null && parsedClass.isInstance(parsedAggregation));
     }
+
+    @Override
+    protected void assertPercentile(T agg, Double value) {
+        assertThat(agg.percent(value), equalTo(Double.NaN));
+        assertThat(agg.percentAsString(value), equalTo("NaN"));
+    }
 }
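
The assertions above compare against `Double.NaN` with `equalTo`. As far as I know, Hamcrest's `equalTo` delegates to `Object.equals`, and `Double.equals` treats two NaN values as equal, whereas the primitive `==` comparison does not. The JDK-only sketch below shows that difference. `NaNEqualitySketch` is an illustrative name, and no Hamcrest dependency is involved.

```java
// Hypothetical sketch showing why an equals-based matcher can match NaN.
class NaNEqualitySketch {

    // Primitive comparison follows IEEE 754: NaN is not equal to anything, including itself.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Boxed comparison goes through Double.equals, which compares bit patterns.
    static boolean boxedEquals(Double a, Double b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("NaN == NaN: " + primitiveEquals(Double.NaN, Double.NaN));
        System.out.println("NaN.equals(NaN): " + boxedEquals(Double.NaN, Double.NaN));
    }
}
```

This prints `false` for the primitive comparison and `true` for the boxed one, so an `==`-based check in the same test would not have matched.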
@@ -24,6 +24,8 @@
 
 import java.util.List;
 
+import static org.hamcrest.Matchers.equalTo;
+
 public abstract class InternalPercentilesTestCase<T extends InternalAggregation & Percentiles> extends AbstractPercentilesTestCase<T> {
 
     @Override
@@ -49,4 +51,10 @@ public static double[] randomPercents() {
         }
         return percents;
     }
+
+    @Override
+    protected void assertPercentile(T agg, Double value) {
+        assertThat(agg.percentile(value), equalTo(Double.NaN));
+        assertThat(agg.percentileAsString(value), equalTo("NaN"));
+    }
 }
@@ -31,6 +31,7 @@
 import java.util.List;
 import java.util.Map;
 
+
 public class InternalHDRPercentilesRanksTests extends InternalPercentilesRanksTestCase<InternalHDRPercentileRanks> {
 
     @Override