-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE][ML] auc_roc cannot be calculated when there are no inliers/… #42853
Conversation
Pinging @elastic/ml-core |
…outliers Also fixes a bug with the matching query for binary soft classification
b5e4c80
to
15bec23
Compare
@elasticmachine test this |
run tests |
double[] tpPercentiles = percentilesArray(classAgg.getAggregations().get(PERCENTILES), | ||
"[" + getMetricName() + "] requires at least one actual_field to have the value [" + classInfo.getName() + "]"); | ||
double[] fpPercentiles = percentilesArray(restAgg.getAggregations().get(PERCENTILES), | ||
"[" + getMetricName() + "] requires at least one actual_field not to have the value [" + classInfo.getName() + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"[" + getMetricName() + "] requires at least one actual_field not to have the value [" + classInfo.getName() + "]"); | |
"[" + getMetricName() + "] requires at least one actual_field NOT to have the value [" + classInfo.getName() + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually thinking it might be better to go with "... requires at least one actual_field to have a different value than [x]". I'll commit that.
List<AucRocPoint> aucRocCurve = buildAucRocCurve(tpPercentiles, fpPercentiles); | ||
double aucRocScore = calculateAucScore(aucRocCurve); | ||
return new Result(aucRocScore, includeCurve ? aucRocCurve : Collections.emptyList()); | ||
} | ||
|
||
private static double[] percentilesArray(Percentiles percentiles) { | ||
private static double[] percentilesArray(Percentiles percentiles, String errorIfUndefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[my random thought]
So if IIUC in the case of all the label ending up having the same value (inlier or outlier), the "evaluate" request will fail.
Have you considered an alternative approach in which the error is returned as part of EvaluationMetricResult? This way, the request could succeed and other metrics (like precision, recall, etc.) could be calculated and returned to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good one, I've thought about it too. However, it would take much more effort and I'm not convinced it's worth it as the workaround is simple and the chances for this edge case to happen are low. If we see users requesting a different behaviour here, we can make this change in the future.
…outliers
Also fixes a bug with the matching query for binary soft classification