Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add handle_zero option in ZeroInflatedRegressor estimator #714

Merged
merged 11 commits into from
Nov 13, 2024

Conversation

mkalimeri
Copy link
Contributor

Hi, please review my code for issue 480 - Run with given regressor instead of raising warning in ZeroInflatedRegressor

There are changes in two files

  1. zero_inflated_regressor.py: As explained in my comment, I approached this issue in the following way
  • I added a flag 'handle_error', which can take two values: 'error' and 'ignore'. If the user chooses 'error', then, in the case that the train set output only consists of zeros, a ValueError is thrown. If the user chooses 'ignore' and if all the train set outputs are zero, the regressor fits all the train set (the flag handle_error='ignore' is taken into consideration only if there are no non-zero targets). So, when there are only zero targets and the user chooses handle_error='ignore', the regressor will train on the whole dataset, but if there are non-zero targets, the functionality remains as is (ignore lines with zero target)

  • The handle_error='error' as default, so the user should actively choose to ignore the fact that the train dataset contains only zero targets.

  1. test_zero_inflated_regressor.py: Added unit tests for the new functionality

Please let me know what you think and if you have any suggestions about how to improve the code!

mkalimeri and others added 3 commits November 6, 2024 15:13
Implementation of solution for issue 480: [FEATURE] Run with given regressor instead of raising warning in ZeroInflatedRegressor
Implementation of unitests for issue 480: [FEATURE] Run with given regressor instead of raising warning in ZeroInflatedRegressor
@koaning
Copy link
Owner

koaning commented Nov 7, 2024

Before I dive deeper, could you check and confirm why the tests fail? Just want to make sure it's not an oversight.

@mkalimeri
Copy link
Contributor Author

mkalimeri commented Nov 7, 2024 via email

@mkalimeri
Copy link
Contributor Author

mkalimeri commented Nov 7, 2024 via email

Copy link
Collaborator

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mkalimeri , thanks for the contribution. Code is fine, I left a few comments mostly for documentation and error messages.

Additionally, could you please add a test that is not raising when training with y consisting of zeros only? i.e. with handle_zero='ignore'?

I also took the liberty to change the PR title, to have some more context from the title directly

sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
sklego/meta/zero_inflated_regressor.py Outdated Show resolved Hide resolved
@FBruzzesi FBruzzesi changed the title Issue 480 feat: add handle_zero option in ZeroInflatedRegressor estimator Nov 7, 2024
Accepted edits on messages/comments as suggested

Co-authored-by: Francesco Bruzzesi <42817048+FBruzzesi@users.noreply.github.com>
@mkalimeri
Copy link
Contributor Author

Thank you for the review, I accepted the edits as suggested and will add the unit test shortly

mkalimeri and others added 5 commits November 11, 2024 12:20
ValueError text for failure if handle_zero value is not one of ['ignore', 'error'] was updated. The relevant unittest had to be updated too
…e 0, no exception is thrown

If all train set outputs are 0 and handle_zero = 'ignore', the regressor should fit the values as is and no exception should be thrown
Copy link
Collaborator

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mkalimeri thanks a ton for the effort on this 🚀

I committed a couple of changes to move handle_zero argument to the estimator initialization instead of its fit method. Aside that your logic remained untouched!

We should be ready to merge 🙌🏼

@FBruzzesi FBruzzesi merged commit 3d51e7b into koaning:main Nov 13, 2024
14 checks passed
@mkalimeri
Copy link
Contributor Author

Hi @FBruzzesi, thank you for improving the code and merging. This was fun! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants