-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Example for preprocessing.dictmapper.DictMapper
and meta.outlier_classifier.OutlierClassifier
#646
Conversation
Hey @anopsy, thanks for the PR. One suggestion I have (and I should add it in the issue referencing these examples) is to combine the examples as a unique one in the top level docstring for the class. In this way users will see the following sections: Parameters, Attributes, Example, and then the list of methods rather than single examples in each method. Example in documentation and its code.
This feels buggy as different columns could have share values that one could want to map to different values. It may be useful to either:
|
Thank you @FBruzzesi, |
… little mistake I made in outlier_remover
Hi, I have a question concerning committing from the Codespace. I opened Codespace from my fork, I run git pull origin examples (name of my branch), then added and committed the changes and then I run git push origin examples. The problem is I can't see the Compare&Pull Request button anywhere. I'm completely new to Codespace, tried to google that but couldn't find any specific answer. Should I be opening the Codespaces from scikit-lego original fork for the git push origin to work? I hope it's okay to ask this here. If not let me know and sorry for the inconvenience |
You probably pushed to your own repository on Github. If you look at your own branches do you see the branch that you've just worked on? |
Yes and it says: This branch is 5 commits ahead of koaning/scikit-lego:main.
That's convenient to know, thank you Francesco!
Yes those are mine! Cool, so it worked after all, thank you folks! |
sklego/meta/outlier_classifier.py
Outdated
@@ -52,6 +52,33 @@ def fit(self, X, y=None): | |||
ValueError | |||
- If the underlying model is not an outlier detection model. | |||
- If the underlying model does not have a `decision_function` method. | |||
|
|||
Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @anopsy, I have the same feedback as for DictMapper
: if you could move the example up in the docstring I think it would be easier and faster for folks to find when scrolling through the api documentation without the need to step down into the .fit(..)
method.
I think this example is ready to merge after that change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do that!
sklego/preprocessing/dictmapper.py
Outdated
@@ -43,6 +43,41 @@ def fit(self, X, y=None): | |||
------- | |||
self : DictMapper | |||
The fitted transformer. | |||
|
|||
Example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you manage to add how to make it interact with either sklego.preprocessing.ColumnSelector
or sklearn.composeColumnTransformer
I believe it would be of great help and ready to merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this double dict was making me really uncomfortable 😅 I'm on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the fixes, but I'm unable to push them. I need to figure out what's going on and be back 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was also confused. However, I finally figured out what was happening and was able to push this time. The problem was that one of the files, sklego.model_selection.py, which I wasn’t even working on, failed to pass the ruff-format and was reformatted by ruff. Since I hadn’t worked on it, I decided to revert that change (git restore) and attempted to commit only the two files I had been working on. But after pushing, I received the message “Everything-up-to-date” and didn’t see any changes on my branch. Today, I decided to accept the formatting changes made by ruff, and I was finally able to push. I hope the formatter didn’t break anything in model_selection.py. What’s the proper way to handle this kind of problem in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 🎉 Thanks for the changes!
preprocessing.dictmapper.DictMapper
and meta.outlier_classifier.OutlierClassifier
Example added to preprocessing.dictmapper.DictMapper is the correct pull request.
Hi folks,
I did today the example for DictMapper, this one felt a bit tricky.
Since the DictMapper uses a dictionary to map the values to int's I tried to think of a useful example, that will not mimic the LabelEncoder, but will input meaningful values (population of a city /ranking of a university) if I went to far and should go back to a more abstract example, let me know.
What caught my attention is that if you want to cover multiple columns in a df you need to create a dict that includes all of the values just as I did in my example. Is my intuition that this seems a bit clumsy correct?
I also had my doubts about what's the proper style to space between comments in the dict, this version got all the green checks from Codespace tools, so I went for that. Btw this time I checked how the docstrings rendered via mkdocs on the docs site ;)
I'm very curious what was the purpose of the DictMapper. :D
Before working on a large PR, please check with @FBruzzesi or @koaning to confirm that they agree with the direction of the PR. This discussion should take place in a Github issue before working on the PR, unless it's a minor change like spelling in the docs.
Description
Example of how to use DictMapper added to docstrings
Partially Fixes #596
Type of change
Checklist:
If you feel your PR is ready for a review, ping @FBruzzesi or @koaning.