Extract probability for a thresholded model #981
No, I think if you want probabilistic predictions for the original (atomic) model, then you need to train the atomic model on the data. You can't eat your cake and have it too. I don't know your use-case, but I suppose you could roll your own wrapper that preserves the probabilistic predictions:

```julia
using MLJ
import CategoricalDistributions

mutable struct Thresholder <: ProbabilisticComposite
    model
    threshold::Float64
end

# Convert probabilistic predictions into class labels by thresholding
# the probability of the positive (second) class:
function deterministic(y_probabilistic, threshold)
    classes = CategoricalDistributions.classes(y_probabilistic)
    map(pdf.(y_probabilistic, classes[2]) .> threshold) do above
        above ? classes[2] : classes[1]
    end
end

function MLJ.fit(thresholder::Thresholder, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach = machine(thresholder.model, Xs, ys)
    y_probabilistic = predict(mach, Xs)
    y_deterministic = node(y -> deterministic(y, thresholder.threshold), y_probabilistic)
    network_machine = machine(Probabilistic(), Xs, ys;
                              predict=y_probabilistic,
                              predict_mode=y_deterministic)
    return!(network_machine, thresholder, verbosity)
end
```
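The core thresholding rule inside `deterministic` can be illustrated in isolation, with plain vectors and no MLJ or CategoricalDistributions dependency. This is a minimal sketch; the function name `threshold_binary` and the string labels are illustrative, not part of any MLJ API:

```julia
# Minimal sketch of the thresholding rule: predict the positive label
# exactly when the positive-class probability exceeds the threshold.
function threshold_binary(p_positive::AbstractVector{<:Real}, threshold;
                          labels=("negative", "positive"))
    map(p -> p > threshold ? labels[2] : labels[1], p_positive)
end

threshold_binary([0.2, 0.65, 0.9], 0.7)  # → ["negative", "negative", "positive"]
```

Note that with a 0.7 threshold, a point with positive-class probability 0.65 is still classified as negative, which is exactly the behaviour that distinguishes the wrapper from plain `predict_mode` (which thresholds at 0.5 in the binary case).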
```julia
X, y = make_moons()
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
thresholder = Thresholder(DecisionTreeClassifier(), 0.7)
mach = machine(thresholder, X, y) |> fit!

predict(mach, X)       # probabilistic predictions
predict_mode(mach, X)  # thresholded predictions
```
```julia
julia> evaluate!(
           mach,
           operation=[predict_mode, predict],
           measure=[accuracy, log_loss],
           resampling=CV()
       )
Evaluating over 6 folds: 100%[=========================] Time: 0:00:00
PerformanceEvaluation object with these fields:
  measure, operation, measurement, per_fold,
  per_observation, fitted_params_per_fold,
  report_per_fold, train_test_rows
Extract:
┌────────────────────────────────┬──────────────┬─────────────┬─────────┬───────────────────
│ measure                        │ operation    │ measurement │ 1.96*SE │ per_fold         ⋯
├────────────────────────────────┼──────────────┼─────────────┼─────────┼───────────────────
│ Accuracy()                     │ predict_mode │ 0.967       │ 0.0345  │ [0.96, 0.92, 0.9 ⋯
│ LogLoss(                       │ predict      │ 1.2         │ 1.24    │ [1.44, 2.88, 2.8 ⋯
│   tol = 2.220446049250313e-16) │              │             │         │                  ⋯
└────────────────────────────────┴──────────────┴─────────────┴─────────┴───────────────────
                                                                                1 column omitted
```
(After JuliaAI/MLJBase.jl#853, the preferred way of exporting the learning network is to replace the `MLJ.fit` definition above with

```julia
function MLJ.prefit(thresholder::Thresholder, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach = machine(:model, Xs, ys)
    y_probabilistic = predict(mach, Xs)
    y_deterministic = node(y -> deterministic(y, thresholder.threshold), y_probabilistic)
    (predict=y_probabilistic, predict_mode=y_deterministic)
end
```

and change the subtyping accordingly, to the network-composite variant of `Probabilistic`.)
In the specific case I used to explore this, the atomic model was wrapped into a `BinaryThresholdPredictor`.
Yes, it is trained, but the public API doesn't give you a way to produce probabilistic predictions. It is possible to get them without retraining, but it's a hack (non-public API).

Got it, thanks!!

Closing as question resolved.
When calling `predict` on a `BinaryThresholdPredictor`, is there a specific incantation that can give the probability of the positive class? Is there a way to "unwrap" the underlying model?