XGBoost4j-spark multiclass with objective "multi:softmax" returns incorrect prediction column value #7643
Comments
softmax is a bit problematic in general, as it removes the probability by reducing it to the output label. We can add a check to remove the probability column if softmax is used.
The main issue is that the softmax implementation seems to capture the wrong prediction label: in the case above it assigns label 0 to all predictions, even though the rawPrediction values in the screenshot show that class label "1" should have been in the prediction column.
Thank you for raising the issue. @wbo4958 Could you please help take a look when you are available?
Will check this issue.
Yeah, just reproduced locally, same issue. We need to fix this.
Maybe related: #3506.
Running the following script in Apache Spark with XGBoost4j-spark version 1.5.1 produces a prediction column on transform that does not match the dominant class in the generated rawPrediction column:
The output example for this is below:
This differs from the objective "multi:softprob", which returns the correct prediction column values from the source rawPrediction column.

Environment:
Apache Spark 3.2.1
Scala 2.12
XGBoost4j-spark 1.5.1
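
For reference, the expected mapping can be sketched in plain Scala: the prediction for a row should be the index of the largest value in that row's rawPrediction. This is only a minimal illustration, not XGBoost4j-spark code. The object `ExpectedPrediction`, the helper `expectedLabel`, and the sample scores are hypothetical and are not taken from the screenshot.

```scala
// Hypothetical helper, not part of XGBoost4j-spark: it only illustrates
// the mapping from a row's raw scores to the expected predicted label.
object ExpectedPrediction {

  // Returns the index of the largest raw score (first index on ties).
  def expectedLabel(rawPrediction: Seq[Double]): Int = {
    require(rawPrediction.nonEmpty, "rawPrediction must contain at least one class score")
    rawPrediction.indices.maxBy(i => rawPrediction(i))
  }

  def main(args: Array[String]): Unit = {
    // Made-up scores for a three-class row; class 1 has the largest score.
    val raw = Seq(0.2, 1.7, -0.4)
    println(expectedLabel(raw)) // prints 1
  }
}
```

Comparing this value against the prediction column for a handful of transformed rows is one way to confirm the mismatch described above.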