Add wrapper to transform targets of (semi)supervised models #678

Merged
merged 5 commits into from
Nov 23, 2021

Conversation

ablaom
Member

@ablaom ablaom commented Nov 17, 2021

This PR adds a model wrapper, TransformedTargetModel, implementing this suggestion of @CameronBieganek. Together with the Pipeline model already implemented (on the target branch), this wrapper renders the @pipeline macro redundant and hence resolves this Pluto notebook issue.

The doc-string for the new model appears below.

This PR is not breaking but I am basing it on the 0.19 release branch as it was convenient to use some utilities introduced there for the new pipelines.

To do:

  • add deprecation warnings for @pipeline
  • review test coverage
TransformedTargetModel(model; target=nothing, inverse=nothing, cache=true)

Wrap the supervised or semi-supervised model in a transformation of
the target variable.

Here target is either:

  • The Unsupervised model that is to transform the training target.
    By default (inverse=nothing) the parameters learned by this
    transformer are also used to inverse-transform the predictions of
    model, which means target must implement the inverse_transform
    method. If this is not the case, specify inverse=identity to
    suppress inversion.

or

  • A callable object for transforming the target, such as y -> log.(y). In this case
    a callable inverse, such as z -> exp.(z), should be specified.
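The first case above (a learned transformer whose fitted parameters are reused for inversion) can be illustrated with a minimal plain-Julia sketch. It does not use MLJ or the wrapper itself: `TargetScaler`, `fit_scaler`, `transform`, and `inverse_transform` are hypothetical stand-ins defined here for illustration, and ordinary least squares stands in for the wrapped model.

```julia
using Statistics

# Hypothetical stand-in for a learned target transformer (not MLJ's Standardizer).
struct TargetScaler
    mu::Float64
    sigma::Float64
end

# "Fit" the transformer: learn the mean and standard deviation of the training target.
fit_scaler(y) = TargetScaler(mean(y), std(y))

# Forward and inverse transformations share the same learned parameters.
transform(s::TargetScaler, y) = (y .- s.mu) ./ s.sigma
inverse_transform(s::TargetScaler, z) = z .* s.sigma .+ s.mu

# Toy data: the target is an exact linear function of the input.
x = collect(1.0:10.0)
y = 3.0 .* x .+ 5.0

# 1. Learn the target transformation and apply it to the training target.
scaler = fit_scaler(y)
z = transform(scaler, y)

# 2. Train the stand-in model (least squares with intercept) on the transformed target.
A = hcat(ones(length(x)), x)
coefs = A \ z

# 3. Predict on the transformed scale, then invert using the learned parameters.
zhat = A * coefs
yhat = inverse_transform(scaler, zhat)
```

Because the inversion reuses `scaler`, the predictions `yhat` come back on the original scale of `y`. Skipping step 3's inversion would leave them on the standardized scale, which is the effect that `inverse=identity` is described as having.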

Specify cache=false to prioritize memory over speed, or to guarantee data
anonymity.

Specify inverse=identity if model is a probabilistic predictor, as
inverse-transforming sample spaces is not supported. Alternatively,
replace model with a deterministic model, such as Pipeline(model, y -> mode.(y)).

Examples

A model that normalizes the target before applying ridge regression,
with predictions returned on the original scale:

@load RidgeRegressor pkg=MLJLinearModels
model = RidgeRegressor()
tmodel = TransformedTargetModel(model, target=Standardizer())

A model that instead applies a static log transformation to the target, again
returning predictions on the original scale:

tmodel2 = TransformedTargetModel(model, target=y->log.(y), inverse=z->exp.(z))
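
As a rough illustration of the static case, the sketch below applies a callable transformation and its callable inverse by hand in plain Julia. It does not use MLJ; least squares stands in for the wrapped model, and the names `target`, `inverse`, and the toy data are assumptions made for this example. Note that the inverse must act on its own argument.

```julia
# Static (parameter-free) target transformation and its inverse.
target = y -> log.(y)
inverse = z -> exp.(z)

# Toy data: the target grows exponentially, so its logarithm is linear in x.
x = collect(1.0:8.0)
y = 2.0 .* exp.(0.5 .* x)

# Train the stand-in model (least squares with intercept) on the log-scale target.
z = target(y)
A = hcat(ones(length(x)), x)
coefs = A \ z

# Predict on the log scale, then map back to the original scale.
yhat = inverse(A * coefs)
```

Here `yhat` is on the same scale as `y`, since `inverse` undoes `target` elementwise.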

@ablaom
Member Author

ablaom commented Nov 17, 2021

@CameronBieganek Would be great if you could comment on the doc-string. If you also have time for a more detailed review over the next two weeks, let me know.

@codecov-commenter

codecov-commenter commented Nov 17, 2021

Codecov Report

Merging #678 (e49a339) into for-0-point-19-release (0f79fe9) will decrease coverage by 2.35%.
The diff coverage is 84.90%.


@@                    Coverage Diff                     @@
##           for-0-point-19-release     #678      +/-   ##
==========================================================
- Coverage                   85.84%   83.48%   -2.36%     
==========================================================
  Files                          40       41       +1     
  Lines                        3610     3173     -437     
==========================================================
- Hits                         3099     2649     -450     
- Misses                        511      524      +13     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/MLJBase.jl | 100.00% <ø> (ø) |
| src/composition/models/transformed_target_model.jl | 84.90% <84.90%> (ø) |
| src/sources.jl | 70.00% <0.00%> (-18.00%) ⬇️ |
| src/data/datasets.jl | 86.84% <0.00%> (-13.16%) ⬇️ |
| src/measures/continuous.jl | 87.80% <0.00%> (-8.03%) ⬇️ |
| src/show.jl | 29.92% <0.00%> (-6.72%) ⬇️ |
| src/measures/measures.jl | 68.29% <0.00%> (-5.40%) ⬇️ |
| src/measures/probabilistic.jl | 58.46% <0.00%> (-4.70%) ⬇️ |
| src/composition/models/inspection.jl | 95.83% <0.00%> (-4.17%) ⬇️ |
| src/measures/finite.jl | 93.99% <0.00%> (-4.14%) ⬇️ |
| ... and 27 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f79fe9...e49a339.

@ablaom ablaom changed the title Add wrapper to transform targets of (semi)supervised model Add wrapper to transform targets of (semi)supervised models Nov 17, 2021