Fitting Linear Functions inside Tree leaves (Feature Request) #5725

Open
Fish-Soup opened this issue May 29, 2020 · 8 comments

Comments

@Fish-Soup

Fish-Soup commented May 29, 2020

I was wondering if it would be possible to develop a new booster that, instead of taking the mean of the values inside a leaf, fits a linear function. When the number of features is small, a piecewise linear model may perform better than a tree-based one, requiring fewer leaves and trees to model smoothly changing functions. In certain cases this could produce more accurate predictions. An additional benefit is that it would allow extrapolation, which may be important in certain use cases.
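
To make the idea concrete, here is a minimal sketch in plain Python. It is not XGBoost code, and the names `fit_constant`, `fit_linear`, and `fit_stump` are made up for illustration. It fits a single split at a fixed threshold and compares mean-valued leaves with leaves that each hold a 1-D least-squares line, on a toy piecewise-linear target:

```python
# Sketch only: one fixed split, constant leaves vs. linear leaves.

def fit_constant(xs, ys):
    """Leaf of an ordinary regression tree: predict the mean target."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_linear(xs, ys):
    """Leaf of a piecewise-linear tree: 1-D ordinary least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_stump(xs, ys, threshold, leaf_fitter):
    """Split the data at `threshold` and fit one leaf model per side."""
    left = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    left_model = leaf_fitter(*zip(*left))
    right_model = leaf_fitter(*zip(*right))
    return lambda x: left_model(x) if x < threshold else right_model(x)

def target(x):
    # Toy ground truth: piecewise linear with a kink at x = 5.
    return 2.0 * x if x < 5 else 10.0 - (x - 5)

xs = [i * 0.5 for i in range(20)]  # training inputs 0.0 .. 9.5
ys = [target(x) for x in xs]

constant_stump = fit_stump(xs, ys, 5.0, fit_constant)
linear_stump = fit_stump(xs, ys, 5.0, fit_linear)

def sse(model):
    """Sum of squared errors on the training data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys))

print("SSE with constant leaves:", sse(constant_stump))
print("SSE with linear leaves:  ", sse(linear_stump))
# x = 12 lies outside the training range: the constant leaf stays flat,
# while the linear leaf continues the fitted trend.
print("prediction at x = 12:", constant_stump(12.0), linear_stump(12.0))
```

With the same single split, the linear leaves reproduce this toy target and keep following the trend beyond the training range, while the constant leaves would need many more splits to get close.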

I have found two implementations of this

LinXGBoost: Written purely in Python; it describes itself as an extension to XGBoost. However, the paper mentions that it hasn't been written with performance in mind.

https://github.com/ldv1/LinXGBoost
https://arxiv.org/pdf/1710.03634.pdf

GBDT-PL: Has a Python API; I think the back end is in C. It performs very well compared to other gradient boosted decision trees (at least on the tests and hyperparameters the authors chose). The paper details many optimisations to make the code run quickly.

https://github.com/GBDT-PL/GBDT-PL
https://arxiv.org/pdf/1802.05640.pdf

An additional optimization I had thought of: you could specify only a subset of the features to use in the linear fit.
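
A rough sketch of what that could look like, in plain Python. The `linear_features` argument and the `fit_leaf` and `solve` helpers are hypothetical names, not existing XGBoost parameters or functions. The leaf solves a small least-squares problem over an intercept plus only the chosen columns, so the remaining features could still be used for splitting but never enter the leaf model:

```python
# Sketch only: a leaf model restricted to a chosen subset of features.

def solve(matrix, rhs):
    """Solve matrix @ w = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_leaf(rows, ys, linear_features, ridge=1e-8):
    """Least-squares fit of y on an intercept plus the selected columns only."""
    design = [[1.0] + [row[j] for j in linear_features] for row in rows]
    k = len(design[0])
    gram = [[sum(d[p] * d[q] for d in design) for q in range(k)] for p in range(k)]
    for i in range(1, k):
        gram[i][i] += ridge  # tiny ridge term (an assumption) keeps the system solvable
    moment = [sum(d[p] * y for d, y in zip(design, ys)) for p in range(k)]
    weights = solve(gram, moment)

    def predict(row):
        return weights[0] + sum(w * row[j] for w, j in zip(weights[1:], linear_features))

    return predict

# Toy data: the target depends on features 0 and 2; feature 1 is irrelevant.
rows = [[float(i), float((7 * i) % 5), float((i * i) % 7)] for i in range(12)]
ys = [1.0 + 2.0 * r[0] - 3.0 * r[2] for r in rows]

predict = fit_leaf(rows, ys, linear_features=[0, 2])
print(predict([20.0, 3.0, 5.0]))
```

Restricting the fit this way keeps the per-leaf system small, which matters because it has to be solved for every candidate leaf.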

Many thanks
EDIT: I fixed the broken links.

@xuyxu

xuyxu commented May 30, 2020

A feature request for supporting multi-output regression has come up before (#2087, #3439). It would bring substantial maintenance costs, as it essentially requires a different base learner.

To simulate "linear functions in leaf nodes", XGBoost:Regression+MultiOutputRegressor in sklearn works reasonably well, even though many papers claim that this solution ignores correlations between different target variables.

@trivialfis
Member

It's something I've wanted for a long time. I also have a proof-of-concept impl in #5460. I just need to allocate time to focus on it.

@Murgio

Murgio commented May 30, 2020

@AaronX121 Do you have any citations for the papers

... many paper claims ...

?

@trivialfis
Member

Actually correlation doesn't help much in my experiments. Your result is likely to be worse due to model capacity. That's one of the reasons that I'm not rushing the implementation. It's mostly for faster inference time.

@xuyxu

xuyxu commented May 30, 2020

@Murgio Hi, here is one work that directly tackles the multi-output regression problem for GBDT: https://arxiv.org/pdf/1909.04373.pdf, you may find it helpful :). Its code is available on GitHub. For sparse multi-output, here is another work: http://proceedings.mlr.press/v70/si17a.html.

Also, many variants of CART are equipped with linear models in internal or leaf nodes, such as piece-wise linear tree (https://arxiv.org/pdf/1802.05640.pdf), soft decision tree (https://arxiv.org/pdf/1711.09784.pdf), and many more. They can be easily combined with one gradient boosting wrapper for multi-output regression / multi-label classification.

I also ran experiments on some benchmark datasets (http://mulan.sourceforge.net/datasets-mtr.html). It is hard to say that these methods are superior to XGBoost+MultiOutputRegressor.

@Fish-Soup
Author

Fish-Soup commented May 30, 2020

@AaronX121 I may have explained my request poorly or misunderstood your response, but I don't see how what I asked for is equivalent to XGBoost plus MultiOutputRegressor. What I was requesting is described in the paper we both linked,
https://arxiv.org/pdf/1802.05640.pdf
on piecewise linear trees, which uses piecewise linear regression trees (PL Trees) instead of piecewise constant regression trees. I will have a look at the soft decision tree paper.

PS: I fixed the links in my initial request.

@LudvicLaberge

I second @Fish-Soup's last question: how is XGBoost plus MultiOutputRegressor similar or equivalent to the initial feature request? Are there papers or examples contrasting both?

I went and read the MultiOutputRegressor docs, and it doesn't seem like the right solution. I have only one target, but I'd like the learners to be piecewise linear instead of step functions.
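
For illustration, a minimal single-target sketch of what is being asked for: squared-error boosting where each weak learner is a stump with a line in each leaf rather than a constant. This is plain Python with made-up names (`fit_line`, `fit_linear_stump`, `boost`); it is not the GBDT-PL algorithm and not XGBoost code, and it assumes one numeric feature with enough points on each side of a split.

```python
# Sketch only: gradient boosting for squared error with linear-leaf stumps.

def fit_line(xs, ys):
    """Return (intercept, slope) of the 1-D least-squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar - slope * x_bar, slope

def fit_linear_stump(xs, residuals, min_leaf=3):
    """Pick the threshold whose two linear leaves leave the smallest squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [(x, r) for x, r in zip(xs, residuals) if x < t]
        right = [(x, r) for x, r in zip(xs, residuals) if x >= t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        a_l, b_l = fit_line(*zip(*left))
        a_r, b_r = fit_line(*zip(*right))
        err = sum((a_l + b_l * x - r) ** 2 for x, r in left)
        err += sum((a_r + b_r * x - r) ** 2 for x, r in right)
        if best is None or err < best[0]:
            best = (err, t, a_l, b_l, a_r, b_r)
    _, t, a_l, b_l, a_r, b_r = best
    return lambda x: (a_l + b_l * x) if x < t else (a_r + b_r * x)

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Each round fits a linear-leaf stump to the current residuals."""
    learners = []
    predictions = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_linear_stump(xs, residuals)
        learners.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: sum(learning_rate * s(x) for s in learners)

# One target, one feature: a smooth curve on [0, 10).
xs = [i * 0.25 for i in range(40)]
ys = [0.1 * x * x for x in xs]

def mse(model):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

one_round = boost(xs, ys, rounds=1)
twenty_rounds = boost(xs, ys, rounds=20)
print("MSE after 1 round:  ", mse(one_round))
print("MSE after 20 rounds:", mse(twenty_rounds))
```

There is a single target throughout; the only change from ordinary boosted stumps is the leaf model, which is why a multi-output wrapper does not address the request.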

@carloamodeo

Hello all,
Is this feature going to be implemented?


6 participants