feat: add `handle_zero` option in `ZeroInflatedRegressor` estimator #714

mkalimeri · 2024-11-07T12:46:11Z

Hi, please review my code for issue 480 - Run with given regressor instead of raising warning in ZeroInflatedRegressor

There are changes in two files

zero_inflated_regressor.py: As explained in my comment, I approached this issue in the following way:

I added a flag 'handle_zero', which takes one of two values: 'error' or 'ignore'. With 'error', a ValueError is raised when the train set targets consist only of zeros. With 'ignore', the regressor is fit on the whole train set in that same situation. The flag only matters when there are no non-zero targets: if any non-zero targets exist, the functionality remains as is (rows with a zero target are ignored when fitting the regressor).
The default is handle_zero='error', so the user must actively choose to ignore the fact that the train dataset contains only zero targets.

test_zero_inflated_regressor.py: Added unit tests for the new functionality

Please let me know what you think and if you have any suggestions about how to improve the code!
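The row-selection rule described above can be sketched in plain Python. This is only an illustration of the described behaviour, not the code from the PR: the helper name `rows_for_regressor` and both error messages are assumptions made for the example.

```python
def rows_for_regressor(y, handle_zero="error"):
    """Return the indices of the training rows the regressor would be fit on.

    Hypothetical helper for illustration only; it is not part of scikit-lego.
    """
    if handle_zero not in ("error", "ignore"):
        # Assumed wording; the real message in the library may differ.
        raise ValueError(f"handle_zero must be 'error' or 'ignore', got {handle_zero!r}")

    non_zero = [i for i, target in enumerate(y) if target != 0]
    if non_zero:
        # At least one non-zero target: behaviour is unchanged, zero rows are skipped.
        return non_zero

    if handle_zero == "error":
        # Assumed wording; the real message in the library may differ.
        raise ValueError("The training targets contain only zeros.")

    # handle_zero == "ignore": fall back to fitting on every row.
    return list(range(len(y)))


print(rows_for_regressor([0, 2.5, 0, 1.0]))                 # [1, 3]
print(rows_for_regressor([0, 0, 0], handle_zero="ignore"))  # [0, 1, 2]
```

Note that 'ignore' has no effect in the first call: as soon as one non-zero target exists, the zero rows are dropped regardless of the flag.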


koaning · 2024-11-07T12:58:54Z

Before I dive deeper, could you check and confirm why the tests fail? Just want to make sure it's not an oversight.

mkalimeri · 2024-11-07T13:05:15Z

Hi Vincent, sorry about that, I was just looking at this. There is a mismatch in the message of the ValueError thrown (see below). Best regards, Maria [image: Screenshot 2024-11-07 at 14.04.12.png]



mkalimeri · 2024-11-07T13:56:31Z

Hi again, sorry about that... I fixed it locally, shall I send another PR? BR, Maria

FBruzzesi

Hey @mkalimeri, thanks for the contribution. The code is fine; I left a few comments, mostly about documentation and error messages.

Additionally, could you please add a test checking that nothing is raised when training with y consisting only of zeros, i.e. with handle_zero='ignore'?

I also took the liberty of changing the PR title so that it gives more context on its own.

sklego/meta/zero_inflated_regressor.py


mkalimeri · 2024-11-11T10:31:58Z

Thank you for the review, I accepted the edits as suggested and will add the unit test shortly


FBruzzesi

Hey @mkalimeri thanks a ton for the effort on this 🚀

I committed a couple of changes to move the handle_zero argument to the estimator's initialization instead of its fit method. Aside from that, your logic remains untouched!

We should be ready to merge 🙌🏼
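The thread does not state why the argument was moved. One likely benefit follows from scikit-learn's convention that hyperparameters are set in `__init__` and `fit` receives only data: the option then travels with the estimator's parameters when it is copied or tuned. A minimal sketch of that pattern, using a hypothetical stand-in class (`SketchEstimator` and its messages are assumptions, not the real `ZeroInflatedRegressor`):

```python
class SketchEstimator:
    """Hypothetical stand-in used only to illustrate init-time options."""

    def __init__(self, handle_zero="error"):
        # Store the option untouched; validation is deferred to fit.
        self.handle_zero = handle_zero

    def get_params(self):
        # The option is part of the estimator's parameters, so a copy keeps it.
        return {"handle_zero": self.handle_zero}

    def fit(self, X, y):
        if self.handle_zero not in ("error", "ignore"):
            raise ValueError(f"handle_zero must be 'error' or 'ignore', got {self.handle_zero!r}")
        if all(target == 0 for target in y) and self.handle_zero == "error":
            raise ValueError("The training targets contain only zeros.")
        # Record how many rows were seen, just to have an observable fitted attribute.
        self.n_rows_seen_ = len(y)
        return self


original = SketchEstimator(handle_zero="ignore")
rebuilt = SketchEstimator(**original.get_params())  # the option survives the copy
rebuilt.fit([[1.0], [2.0]], [0, 0])                 # all-zero targets, no exception
print(rebuilt.n_rows_seen_)                         # 2
```

Had the option been an argument of `fit`, every caller of `fit` would have to pass it again, and a rebuilt copy would not carry it.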

mkalimeri · 2024-11-13T10:59:28Z

Hi @FBruzzesi, thank you for improving the code and merging. This was fun! :-)

mkalimeri and others added 3 commits November 6, 2024 15:13

Update test_zero_inflated_regressor.py

5bdd3ba

Implementation of solution for issue 480: [FEATURE] Run with given regressor instead of raising warning in ZeroInflatedRegressor

Update test_zero_inflated_regressor.py

d0129dc

Implementation of unitests for issue 480: [FEATURE] Run with given regressor instead of raising warning in ZeroInflatedRegressor

Merge branch 'koaning:main' into issue-480

6470f26

mkalimeri added 2 commits November 7, 2024 14:48

Update test_zero_inflated_regressor.py

fb202a5

Fixed error in unit tests

Merge branch 'issue-480' of https://github.com/mkalimeri/scikit-lego …

85b2617

…into issue-480

FBruzzesi reviewed Nov 7, 2024


FBruzzesi changed the title ~~Issue 480~~ feat: add handle_zero option in ZeroInflatedRegressor estimator Nov 7, 2024

Apply suggestions from code review

93203df

Accepted edits on messages/comments as suggested Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>

mkalimeri and others added 5 commits November 11, 2024 12:20

Unit test update to match the updated ValueError output

5c4c723

ValueError text for failure if handle_zero value is not one of ['ignore', 'error'] was updated. The relevant unittest had to be updated too

Merge branch 'issue-480' of https://github.com/mkalimeri/scikit-lego …

e21c160

…into issue-480

Unittest that asserts that if handle_zero='ignore' and all outputs ar…

62b804b

…e 0, no exception is thrown If all train set outputs are 0 and handle_zero = 'ignore', the regressor should fit the values as is and no exception should be thrown

Merge branch 'main' into issue-480

a007c15

move handle_zero to init

b02fb7d

FBruzzesi approved these changes Nov 13, 2024


FBruzzesi merged commit 3d51e7b into koaning:main Nov 13, 2024
14 checks passed

FBruzzesi mentioned this pull request Nov 13, 2024

[FEATURE] Run with given regressor instead of raising warning in ZeroInflatedRegressor #480

Closed
