Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable model selection for first stage models #808

Merged
merged 20 commits into from
Nov 11, 2023
Merged

Conversation

kbattocchi
Copy link
Collaborator

No description provided.

@kbattocchi kbattocchi force-pushed the kebatt/modelSelection branch from 3a67554 to a96601a Compare August 18, 2023 17:02
@kbattocchi kbattocchi force-pushed the kebatt/modelSelection branch from a96601a to 3a024ea Compare November 6, 2023 15:04
AnthonyCampbell208 and others added 3 commits November 6, 2023 10:40
Signed-off-by: AnthonyCampbell208 <78286293+AnthonyCampbell208@users.noreply.github.com>

Co-authored-by: ShrutiRM97 <98553136+ShrutiRM97@users.noreply.github.com>
Co-authored-by: CooperGibbs <coopergibbs@Coopers-MacBook-Pro.local>
Signed-off-by: AnthonyCampbell208 <78286293+AnthonyCampbell208@users.noreply.github.com>
…d notebook to showcase some of the functionality

Signed-off-by: AnthonyCampbell208 <78286293+AnthonyCampbell208@users.noreply.github.com>
@kbattocchi kbattocchi force-pushed the kebatt/modelSelection branch 4 times, most recently from 737dad1 to 356f36e Compare November 8, 2023 06:22
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
@kbattocchi kbattocchi force-pushed the kebatt/modelSelection branch from 356f36e to fe1c5e1 Compare November 8, 2023 07:00
@kbattocchi kbattocchi requested a review from fverac November 8, 2023 22:20
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
@kbattocchi kbattocchi force-pushed the kebatt/modelSelection branch from 27acdc0 to 6d41ada Compare November 9, 2023 15:13
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
@kbattocchi kbattocchi marked this pull request as ready for review November 9, 2023 15:51
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
fit_with_groups(self._model, self._combine(X, W, Target.shape[0]), Target, groups=groups)
return self
class _FirstStageWrapper:
def __init__(self, model, discrete_target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the refactoring. There is a consequence of not being able to make model-specific (Y vs T) warnings though (like the binary outcomes PR tries to do in this class).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's think about the cleanest way to do that as part of your PR; one thing to keep in mind is that we also use this wrapper for some Z models, so it's not just T vs. Y, which is what we assumed when we first implemented this logic when we only had DML but not the IV classes.



class _FirstStageWrapper:
def __init__(self, model, is_Y, featurizer, linear_first_stages, discrete_treatment):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is linear_first_stages used anymore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; linear_first_stages has always been an awkward hack and this gives us a good opportunity to get rid of it. Given

T = f(X, W) + error
Y = T <alpha phi(X)> + g(X,W) + error

if both f and g are linear, then because we multiply T by theta in the equation for Y, Y is not actually a linear function of X and W, we need to include interaction terms. So we introduced linear_first_stages because our default y and t models were linear and we wanted to make sure we could estimate the correct model if f and g were linear. Now, since we do model selection, the y and t models are not (necessarily) linear, so this makes less sense.

There's also a potentially big performance cost if users accidentally leave linear_first_stages set to True, because we can end up generating a huge number of interaction terms, and we see this happen fairly often in practice.

Additionally, linear models aren't the only time that specifying the right models for f and g as your t- and y-models wouldn't give the correct estimate - if f and g are both degree-n polynomials then you'd need to interact the degree-n interaction terms of [X;W] with the featurized X, for example, and we've never supported that; users can always supply a more complicated pipelined model that does something like that themselves if they'd like, but I'd argue it doesn't make sense to try to solve it on our end.

class ModelSelector(metaclass=abc.ABCMeta):
"""
This class enables a two-stage fitting process, where first a model is selected
by calling `train` with `is_selecting=True`, and then the selected model is fit (presumably
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering why we opt for a 'train' method over a 'fit' method

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I used fit, but this makes the error messages much easier to understand if a selector is being passed where an estimator is expected or vice-versa.

return None


def _fit_with_groups(model, X, y, *, groups, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marking for discussion

@@ -716,13 +741,19 @@ class LinearDML(StatsModelsCateEstimatorMixin, DML):

def __init__(self, *,
model_y='auto', model_t='auto',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably update our docstrings to say that users can pass something like 'forest' to model_y right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should probably create a whole new documentation section on how to specify model selectors, because there are a number of possibilities now. But I was planning on deferring that until after this PR.

@@ -202,13 +204,17 @@ def predict(self, X, y, W=None):

"""
model_list = []

kwargs = filter_none_kwargs(**kwargs)
model.train(True, *args, **kwargs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in some earlier conversations we were thinking about giving the users the option to do "dirty crossfitting" i.e. picking a good est from all data before cross fitting. Am I correct in my understanding that this PR just does "dirty crossfitting" by default?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and that's definitely something we could consider making easier for users.

It's possible, though not straightforward, to do non-dirty crossfitting now, by wrapping a CV estimator in a FixedModelSelector, which will always use the estimator as is for both selecting and fitting. However, there are some changes we could make to make this more efficient, since then the selecting step is unnecessary and so we could just skip it.

I'd propose tabling that for now and implementing that as one of several future enhancements to the model selection logic.



class _FirstStageWrapper:
def __init__(self, model, is_Y, featurizer, linear_first_stages, discrete_treatment):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also do we no longer featurizer our first stage models?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only ever used featurization in the first stage models to support linear_first_stages, since then we needed to interact phi(X) with [X;W].

Logically, the featurization is only about the final model, the first stage models just get the raw X and W and can fit whatever kind of model they'd like on them, including one that pipelines featurization with any other logic.

@fverac
Copy link
Collaborator

fverac commented Nov 9, 2023

General comment not just about this PR but of process of adding new features - I feel like a demo notebook (even if brief) would help users get up to speed with new features we add. Maybe we can add demo notebooks before moving from a beta release to an official release?
.e.g. how does one make use of the new param_list arg?

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
@kbattocchi kbattocchi enabled auto-merge (squash) November 11, 2023 17:36
@kbattocchi kbattocchi merged commit e335d15 into main Nov 11, 2023
72 checks passed
@kbattocchi kbattocchi deleted the kebatt/modelSelection branch November 11, 2023 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants