Support fit_params in stacking #18028

Closed

a-wozniakowski opened this issue Jul 30, 2020 · 2 comments · Fixed by #28701

Comments

@a-wozniakowski

Currently, there is no support for **fit_params in the fit method of _BaseStacking:

def fit(self, X, y, sample_weight=None):

As introduced in issue #15953 for _MultiOutputEstimator, it seems natural to extend the same support to stacking. A proposed implementation in the base stacking class is as follows:

from ..utils.validation import _check_fit_params

def fit(self, X, y, sample_weight=None, **fit_params):
    # Right before ``predictions = Parallel(...)``:
    if fit_params:
        # Validate that array-like fit params are consistent with X.
        fit_params = _check_fit_params(X, fit_params)
    else:
        # Preserve the current behaviour when only sample_weight is given.
        fit_params = (dict(sample_weight=sample_weight)
                      if sample_weight is not None
                      else None)
    # Then pass fit_params through to the parallelized cross_val_predict.

Subsequently, the fit methods of StackingClassifier and StackingRegressor would be altered so that they also support **fit_params; a usage sketch follows. If this sounds favorable, I can write an implementation and open a pull request.
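For illustration, a call under the proposed signature could look like the sketch below. Note that **fit_params support in StackingRegressor.fit is the requested behaviour, not the released API; sample_weight is shown because it is the one fit parameter the current signature already accepts.

# Hypothetical usage under the proposed signature; **fit_params in
# StackingRegressor.fit is the requested behaviour, not the current API.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=100, n_features=4, random_state=0)
weights = np.linspace(0.1, 1.0, num=100)

stack = StackingRegressor(
    estimators=[("ridge", Ridge()), ("tree", DecisionTreeRegressor())],
    final_estimator=Ridge(),
)

# Today only sample_weight is accepted; under the proposal, arbitrary
# **fit_params would be validated by _check_fit_params and forwarded to
# the internal cross_val_predict calls and base-estimator fits.
stack.fit(X, y, sample_weight=weights)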

@jnothman
Member

jnothman commented Jul 30, 2020 via email

@a-wozniakowski
Author

@jnothman thanks for your reply. From an earlier issue, sample weights appear to be the underlying motivation for the design principle in FeatureUnion:

"I think fit_params is more or less just a more general way to implement sample_weights, with a slightly different API. I tried to implement sample_props once and it was mostly renaming fit_params (and sometimes moving it from __init__ to fit)"

Originally posted by @amueller in #7136 (comment)

With _MultiOutputEstimator and _BaseStacking, there is also the motivation of early stopping (which does not apply to transformers). As the estimators in stacking may be heterogeneous, name-based prefixing seems less error prone; a sketch of that routing follows.
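As a concrete illustration of that prefixing, a small routing helper along the following lines could split name-prefixed fit params among heterogeneous base estimators, in the spirit of Pipeline's step__param convention. This is a hypothetical sketch, not scikit-learn code, and split_fit_params is an invented name.

# Hypothetical helper (not scikit-learn code): route "name__param"
# entries to the matching base estimator, as Pipeline does for steps.
def split_fit_params(estimator_names, fit_params):
    routed = {name: {} for name in estimator_names}
    for key, value in fit_params.items():
        name, sep, param = key.partition("__")
        if not sep or name not in routed:
            raise ValueError(f"Cannot route fit parameter {key!r}")
        routed[name][param] = value
    return routed

# Only the gradient-boosting estimator receives its fit parameter:
split_fit_params(["ridge", "gb"], {"gb__sample_weight": [0.5, 1.0]})
# -> {'ridge': {}, 'gb': {'sample_weight': [0.5, 1.0]}}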
