
Vasilis/autoinference #307

Merged 38 commits into master from vasilis/autoinference on Nov 11, 2020
Conversation

@vsyrgkanis (Collaborator) commented Nov 9, 2020

  1. Made inference='auto' the default for many estimators, so that default inference is enabled (fixes inference='auto' #205).
  2. Enabled LinearModelFinalInference at the DMLCateEstimator level, so that if the user provides any model_final that has predict_interval and prediction_stderr, they will get inference. This enables use of an RLM as the final stage with confidence intervals (see the sketch after the example below).
  3. Added an RLM scikit-learn wrapper.

This enables the following use case:

from sklearn.linear_model import LassoCV
from sklearn.preprocessing import PolynomialFeatures
from econml.dml import DMLCateEstimator
from econml.sklearn_extensions.linear_model import StatsModelsRLM

est = DMLCateEstimator(model_y=LassoCV(),
                       model_t=LassoCV(),
                       model_final=StatsModelsRLM(t=1, maxiter=10000, tol=1e-12, fit_intercept=False),
                       featurizer=PolynomialFeatures(degree=1, include_bias=False),
                       random_state=123)
est.fit(Y, T, X, W)
te_pred = est.effect(X_test)
lb, ub = est.effect_interval(X_test)
est.summary()
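
To illustrate point 2, a final model only needs a small duck-typed surface for inference to kick in. The sketch below is hypothetical: the class name and the exact signatures (predict_interval taking alpha, prediction_stderr returning per-row standard errors) are assumptions based on this PR description, not a documented econml contract.

import numpy as np

class MyLinearFinal:
    """Hypothetical final model exposing the methods point 2 relies on."""

    def fit(self, X, y, sample_weight=None):
        # Plain least-squares fit; a real wrapper (e.g. StatsModelsRLM) would
        # also keep whatever it needs to compute standard errors later.
        X, y = np.asarray(X), np.asarray(y)
        self.coef_, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return np.asarray(X) @ self.coef_

    def prediction_stderr(self, X):
        # Placeholder: should return the standard error of each prediction.
        return np.zeros(np.asarray(X).shape[0])

    def predict_interval(self, X, alpha=0.1):
        # Placeholder: should return (lower, upper) confidence bounds.
        pred = self.predict(X)
        return pred, pred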

vasilismsr and others added 17 commits November 7, 2020 08:56
… but rather create auxiliary numpy arrays that store the numerator and denominator of every node. This enables consistent feature_importance calculation and also potentially more accurate shap_values calculation (see the toy sketch after this commit list).
…ure_importances_. Added tests that the feature_importances_ API is working in test_drlearner and test_dml.
Co-authored-by: Keith Battocchi <kebatt@microsoft.com>
…ree level was causing trouble, since, due to honesty and sample splitting, feature_importance can often be negative (an increase in variance). Now averaging the un-normalized feature importance. There is still a small caveat in the current version of how we use impurity; added that as a TODO.
…orest, which now makes feature_importances_ exactly correct with no need to re-implement the method. Impurities are now computed on the estimation sample, replacing the pre-calculated node impurities.
…rallel_add_trees_ of ensemble.py. This leads to a 6-fold speed-up, as we were previously doing many slicing operations on sparse matrices, which are very slow!
…ear model final inference for DMLCateEstimator, to allow for RLM usage
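
As a toy illustration of the auxiliary-array idea in the first commit above (not econml's actual tree internals; the array names are made up): keep a per-node numerator and denominator instead of only their ratio, so node values can be re-derived consistently, for example on the honest estimation sample.

import numpy as np

# Toy sketch: per-node numerator/denominator arrays instead of a single value.
n_nodes = 7
numerator = np.zeros(n_nodes)    # e.g. sum of outcomes routed to each node
denominator = np.zeros(n_nodes)  # e.g. sample count (or weight) in each node

def node_value(i):
    # Guard against nodes left empty by honesty / sample splitting.
    return numerator[i] / denominator[i] if denominator[i] > 0 else 0.0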
@vsyrgkanis vsyrgkanis added the enhancement New feature or request label Nov 9, 2020
@kbattocchi (Collaborator) left a comment


Love the idea of enabling inference by default when it's efficient. I don't see which changes enable arbitrary linear models in this set of commits - what am I missing?

econml/cate_estimator.py (2 resolved review threads)
@kbattocchi (Collaborator) left a comment


Mostly fine, except for a weird whitespace issue.

econml/drlearner.py (outdated, resolved review thread)
@kbattocchi (Collaborator) left a comment


LGTM

econml/dml.py (5 outdated, resolved review threads)
vsyrgkanis and others added 5 commits November 11, 2020 13:53
Co-authored-by: Keith Battocchi <kebatt@microsoft.com>
Co-authored-by: Keith Battocchi <kebatt@microsoft.com>
Co-authored-by: Keith Battocchi <kebatt@microsoft.com>
@kbattocchi (Collaborator) left a comment


Looks good, thanks!

@vsyrgkanis vsyrgkanis merged commit a79bdea into master Nov 11, 2020
@vsyrgkanis vsyrgkanis deleted the vasilis/autoinference branch November 16, 2020 22:54
Labels: enhancement (New feature or request)
Projects: None yet
Linked issues that merging may close: inference='auto' (#205)
Participants: 3