[Feature] Support for monotonic constraints? #14
Comments
I'm pasting the snippets for the monotonic constraints here
@alexvorobiev, do you have any reference papers for this feature?
@chivee I only have the reference to the R GBM package https://cran.r-project.org/package=gbm
@alexvorobiev, thanks for sharing. I'm trying to understand the idea behind this method.
Note that the given pseudo code only ensures that each individual split is in the correct order, not that the whole model is monotonic: a later split deeper in the tree could still make the model non-monotonic.
Any thoughts on this?
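To make the concern concrete, here is a hypothetical toy tree (illustrative Python, not code from either library): the root split on the constrained feature x is correctly ordered on average, yet a deeper split on a second feature z still breaks monotonicity in x.

```python
# Toy depth-2 regression tree (hypothetical, for illustration only).
# Intended constraint: predictions should be non-decreasing in x.

def predict(x, z):
    if x < 10.0:
        return 1.0          # left leaf of the root split on x
    # Right child of the root split on x; its *average* output,
    # (0.5 + 3.0) / 2 = 1.75, exceeds 1.0, so the root split itself
    # looks correctly ordered...
    if z < 0.0:
        return 0.5          # ...but this leaf dips below the left leaf's 1.0
    return 3.0

# For z < 0, increasing x from 5 to 15 decreases the prediction:
print(predict(5.0, -1.0), predict(15.0, -1.0))  # 1.0 0.5
```

So a per-split ordering check alone is not enough; the children's allowed output ranges have to be propagated down the tree as well.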
From a practical perspective (outside the Kaggle world!), this feature would be extremely helpful in many applications where reasonable model behavior is relevant.
@guolinke Would you be able to advise how to approach this, and whether it's feasible? I.e., where should it belong, and would it be sufficient to implement it in just one place? Here's the meat of the implementation in XGBoost, for reference: https://github.com/dmlc/xgboost/blob/master/src/tree/param.h#L422 -- pretty much all of it is contained there.
@aldanor the following may be useful. The split gain calculation: https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L291-L297 ; the leaf-output calculation is in the same file.
@guolinke I may add some links here about the implementation in XGBoost.
@guolinke Monotonic constraints may be a very important requirement for the resulting models, for many reasons: e.g., as noted above, there could be domain knowledge that must be respected, as in insurance and risk-management problems. How about we all cooperate and make this work?
@aldanor very cool, I would like to work together on it.
It seems that MC (monotonic constraints) could be cumulative, that is, if both model A and model B are MC, then A+B is MC. Combining @chivee's pseudo code with @AbdealiJK's suggestion, I think that gives the algorithm.
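A minimal sketch of such a combined algorithm (hypothetical names, not LightGBM code; it mirrors xgboost's exact-mode approach of skipping splits whose child outputs would come out in the wrong order, using the usual second-order leaf output w = -G / (H + lambda)):

```python
# Sketch: reject candidate splits that would violate a monotone constraint.
# constraint = +1 for increasing, -1 for decreasing.

def leaf_output(grad_sum, hess_sum, lam=1.0):
    return -grad_sum / (hess_sum + lam)

def best_monotone_split(grads, hesses, constraint=+1, lam=1.0):
    """grads/hesses are pre-sorted by the feature's value/bin order.
    Returns (best_gain, split_index) among splits that keep
    left_output <= right_output (for constraint == +1)."""
    G, H = sum(grads), sum(hesses)
    best_gain, best_idx = 0.0, None
    gl = hl = 0.0
    for i in range(len(grads) - 1):
        gl += grads[i]; hl += hesses[i]
        gr, hr = G - gl, H - hl
        wl, wr = leaf_output(gl, hl, lam), leaf_output(gr, hr, lam)
        if constraint * (wr - wl) < 0:
            continue  # split would break the monotone ordering -> skip
        # standard structure-score gain (up to the usual 1/2 factor)
        gain = gl*gl/(hl+lam) + gr*gr/(hr+lam) - G*G/(H+lam)
        if gain > best_gain:
            best_gain, best_idx = gain, i
    return best_gain, best_idx

print(best_monotone_split([2.0, 1.0, -1.0, -2.0], [1.0]*4))  # (6.0, 1)
```

Note this only enforces the per-split ordering; as pointed out above, the children's outputs also need to be bounded (xgboost does this by recursing with limits derived from the parent split's leaf values) so that the whole model stays monotone.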
@aldanor would you like to create a PR first? I can provide my help in the PR.
@guolinke I will give it a try, yep. Your suggested algorithm in the snippet above looks fine; that's roughly what xgboost does (in exact mode though, not histogram; do you think there would be any complications here because of binning?). Where would this code belong, then? Edit: what do you mean by that?
@aldanor We need to update the calculation of gain: https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L354-L357 and https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L415-L418 . We may need to wrap these into a new function, and implement both the non-constrained and MC versions of it.
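One possible shape for such a wrapper (a hedged sketch with hypothetical names, not the actual LightGBM refactor): clamp the closed-form leaf output into an allowed [lo, hi] range when a constraint is active, and evaluate the gain from the possibly-clamped output instead of assuming the unconstrained optimum.

```python
def leaf_gain_given_output(G, H, w, lam=1.0):
    """Score contribution of a leaf forced to output w; equals the
    familiar G*G/(H+lam) when w is the unconstrained optimum -G/(H+lam)."""
    return -(2.0 * G * w + (H + lam) * w * w)

def constrained_leaf_output(G, H, lam=1.0, lo=float("-inf"), hi=float("inf")):
    # closed-form optimum, clamped into the allowed range
    return min(max(-G / (H + lam), lo), hi)

def split_gain(GL, HL, GR, HR, lam=1.0, lo=float("-inf"), hi=float("inf")):
    wl = constrained_leaf_output(GL, HL, lam, lo, hi)
    wr = constrained_leaf_output(GR, HR, lam, lo, hi)
    wp = constrained_leaf_output(GL + GR, HL + HR, lam, lo, hi)
    return (leaf_gain_given_output(GL, HL, wl, lam)
            + leaf_gain_given_output(GR, HR, wr, lam)
            - leaf_gain_given_output(GL + GR, HL + HR, wp, lam))

# With no bounds this reduces to the familiar formula:
print(split_gain(2.0, 1.0, -2.0, 1.0))                   # 4.0
# Clamping the child outputs can only lower the gain:
print(split_gain(2.0, 1.0, -2.0, 1.0, lo=-0.5, hi=0.5))  # 3.0
```

After a monotone split is taken, the two children could then be searched recursively with tightened [lo, hi] bounds (xgboost, as far as I can tell, bounds both children by the mid-point of the two leaf values).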
@aldanor any updates?
I would also be very interested in seeing this feature implemented in LightGBM. As aldanor stated above, the pseudo-code suggested earlier is correct and is how XGBoost implements monotonic constraints. As such, this feature should be fairly trivial to implement for someone with an intimate knowledge of the codebase.
got it, I'll wait for someone with a better understanding of the codebase to implement this then. |
you can try #1314 |
Are you planning support for monotonic constraints? See e.g. dmlc/xgboost#1514