Fitting Linear Functions inside Tree leaves (Feature Request) #5725

Fish-Soup · 2020-05-29T07:25:39Z

I was wondering if it would be possible to develop a new booster that, instead of taking the mean of the values inside a leaf, fits a linear function. When the number of features is small, a piecewise linear model may perform better than one built from piecewise constant trees, requiring fewer leaves and trees to model smoothly changing functions. In certain cases this could produce more accurate predictions. An additional benefit is that it would allow extrapolation, which may be important in certain use cases.
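
To make the idea concrete, here is a minimal pure-Python sketch (not XGBoost code, and not taken from either implementation linked below). It assumes a single feature and a single split at a hand-picked threshold, and compares a leaf that stores the mean with a leaf that stores a fitted line. The names `fit_line`, `fit_stump`, `predict` and `sse` are made up for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; falls back to the mean if x is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_stump(xs, ys, threshold, linear_leaves):
    """Split at `threshold` and fit one model per leaf: a mean or a line."""
    leaves = []
    for keep in (lambda x: x < threshold, lambda x: x >= threshold):
        lx = [x for x in xs if keep(x)]
        ly = [y for x, y in zip(xs, ys) if keep(x)]
        if linear_leaves:
            leaves.append(fit_line(lx, ly))
        else:
            leaves.append((sum(ly) / len(ly), 0.0))  # constant leaf: slope 0
    return threshold, leaves


def predict(stump, x):
    threshold, (left, right) = stump
    a, b = left if x < threshold else right
    return a + b * x


def sse(stump, xs, ys):
    """Sum of squared errors of the stump on the given data."""
    return sum((predict(stump, x) - y) ** 2 for x, y in zip(xs, ys))


# Toy target: a kinked line, y = x for x < 5 and y = 5 + 3 * (x - 5) otherwise.
xs = [i * 0.5 for i in range(20)]  # 0.0 .. 9.5
ys = [x if x < 5 else 5 + 3 * (x - 5) for x in xs]

constant_stump = fit_stump(xs, ys, threshold=5.0, linear_leaves=False)
linear_stump = fit_stump(xs, ys, threshold=5.0, linear_leaves=True)

print("SSE, constant leaves:", sse(constant_stump, xs, ys))
print("SSE, linear leaves:  ", sse(linear_stump, xs, ys))
# x = 12 lies outside the training range, so this is extrapolation.
print("prediction at x=12, constant leaves:", predict(constant_stump, 12.0))
print("prediction at x=12, linear leaves:  ", predict(linear_stump, 12.0))
```

On this toy data the linear leaves recover the target with one split, while the constant leaves would need many more splits and still predict a flat value beyond the training range.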

I have found two implementations of this:

LinXGBoost: written purely in Python and describes itself as an extension to XGBoost. However, the paper mentions that it hasn't been written with performance in mind.

https://github.com/ldv1/LinXGBoost
https://arxiv.org/pdf/1710.03634.pdf

GBDT-PL: has a Python API; I think the back end is in C. It performs very well compared to other gradient boosted decision trees (at least on the tests/hyperparameters they chose). The paper details many optimisations to make the code run quickly.

https://github.com/GBDT-PL/GBDT-PL
https://arxiv.org/pdf/1802.05640.pdf

An additional optimization I had thought of: you could specify only a subset of the features to use in the linear fit.
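
A rough sketch of what restricting the leaf fit to a subset of features could look like, in plain Python. This is only an illustration under my own assumptions: `fit_leaf`, `predict_leaf`, the `linear_features` argument and the tiny `ridge` term are hypothetical names and choices, not an existing XGBoost, LinXGBoost or GBDT-PL API. The tree could still split on every feature; only the listed columns enter the linear model in the leaf.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_leaf(rows, targets, linear_features, ridge=1e-8):
    """Least-squares fit of y = w0 + sum_j w_j * x[j], with j in linear_features only.

    `ridge` is a tiny penalty on the slopes (an assumption of this sketch) that
    keeps the normal equations solvable if a feature is constant inside the leaf.
    """
    design = [[1.0] + [row[j] for j in linear_features] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    for i in range(1, k):  # leave the intercept unpenalised
        xtx[i][i] += ridge
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict_leaf(weights, row, linear_features):
    return weights[0] + sum(w * row[j] for w, j in zip(weights[1:], linear_features))


# Toy leaf with three features; only features 0 and 2 drive the target.
rows = [[float(i), float((i * 7) % 3), float((i * i) % 5)] for i in range(12)]
targets = [1.0 + 2.0 * r[0] - 0.5 * r[2] for r in rows]

weights = fit_leaf(rows, targets, linear_features=[0, 2])
print("intercept and slopes:", weights)
print("prediction for first row:", predict_leaf(weights, rows[0], [0, 2]))
```

Leaving feature 1 out keeps the per-leaf system small, which is presumably where most of the saving would come from when there are many features.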

Many thanks
EDIT: I fixed the broken links.

xuyxu · 2020-05-30T02:07:02Z

A feature request for multi-output regression has been raised before (#2087, #3439). It would bring substantial maintenance costs, as it essentially requires using another base learner.

To simulate "linear functions in leaf nodes", XGBoost:Regression+MultiOutputRegressor in sklearn works reasonably well, despite many paper claims that this solution ignores correlations between different target variables.

trivialfis · 2020-05-30T03:09:51Z

It's something I have wanted for a long time. I also have a proof-of-concept implementation in #5460. I just need to allocate time to focus on it.

Murgio · 2020-05-30T13:29:33Z

@AaronX121 Do you have any citations for the papers in "... many paper claims ..."?

trivialfis · 2020-05-30T13:36:34Z

Actually correlation doesn't help much in my experiments. Your result is likely to be worse due to model capacity. That's one of the reasons that I'm not rushing the implementation. It's mostly for faster inference time.

xuyxu · 2020-05-30T14:03:58Z

@Murgio Hi, here is one work that directly tackles the multi-output regression problem for GBDT: https://arxiv.org/pdf/1909.04373.pdf, you may find it helpful :). Its code is available on GitHub. For sparse multi-output, here is another work: http://proceedings.mlr.press/v70/si17a.html.

Also, many variants of CART are equipped with linear models in internal or leaf nodes, such as piece-wise linear tree (https://arxiv.org/pdf/1802.05640.pdf), soft decision tree (https://arxiv.org/pdf/1711.09784.pdf), and many more. They can be easily combined with one gradient boosting wrapper for multi-output regression / multi-label classification.

I also ran experiments on some benchmark datasets (http://mulan.sourceforge.net/datasets-mtr.html). It is hard to say that these methods are superior to XGBoost+MultiOutputRegressor.

Fish-Soup · 2020-05-30T21:18:55Z

@AaronX121 I may have explained my request poorly or misunderstood your response, but I don't see how what I asked for is equivalent to XGBoost plus MultiOutputRegressor. What I was requesting is described in the paper we both linked,
https://arxiv.org/pdf/1802.05640.pdf
on piecewise linear trees, which uses piecewise linear regression trees (PL Trees) instead of piecewise constant regression trees. I will have a look at the soft decision tree paper.

P.S. I fixed the links in my initial request.

LudvicLaberge · 2021-01-29T13:31:06Z

I second @Fish-Soup's last question: how is XGBoost plus MultiOutputRegressor similar or equivalent to the initial feature request? Are there papers or examples contrasting the two?

I went and read the MultiOutputRegressor docs, and it doesn't seem like the right solution. I have only one target, but I'd like the learners to be piecewise linear instead of step functions.
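
To show what I mean for the single-target case, here is a small pure-Python sketch of squared-error boosting where every weak learner is a one-split tree with a line in each leaf. It is only my own toy illustration, assuming one feature and an exhaustive threshold search; `fit_pl_stump`, `boost`, `boosted_predict` and their default arguments are hypothetical and do not correspond to anything in XGBoost or GBDT-PL.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; falls back to the mean if x is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def line_sse(line, xs, ys):
    return sum((line[0] + line[1] * x - y) ** 2 for x, y in zip(xs, ys))


def fit_pl_stump(xs, residuals, min_leaf=3):
    """Pick the threshold that minimises squared error with a line in each leaf."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    sx = [xs[i] for i in order]
    sr = [residuals[i] for i in order]
    best = None
    for cut in range(min_leaf, len(sx) - min_leaf + 1):
        if sx[cut - 1] == sx[cut]:
            continue  # cannot split between equal feature values
        left = fit_line(sx[:cut], sr[:cut])
        right = fit_line(sx[cut:], sr[cut:])
        err = line_sse(left, sx[:cut], sr[:cut]) + line_sse(right, sx[cut:], sr[cut:])
        if best is None or err < best[0]:
            best = (err, (sx[cut - 1] + sx[cut]) / 2, left, right)
    if best is None:  # too few distinct points to split: one line everywhere
        line = fit_line(xs, residuals)
        return float("inf"), line, line
    return best[1], best[2], best[3]


def stump_predict(stump, x):
    threshold, left, right = stump
    a, b = left if x < threshold else right
    return a + b * x


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Each round fits a piecewise linear stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_pl_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def boosted_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# One target, one feature: a smooth curve that step functions approximate poorly.
xs = [i * 0.25 for i in range(40)]
ys = [(x - 4.0) ** 2 / 4.0 for x in xs]

model = boost(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((boosted_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)
print("MSE of the mean-only baseline:", baseline_mse)
print("MSE after boosting PL stumps: ", boosted_mse)
```

Nothing here involves multiple outputs, which is why MultiOutputRegressor looks like a different feature to me.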

carloamodeo · 2024-03-21T17:09:57Z

Hello all,
Is this feature going to be implemented?

trivialfis added the feature-request label May 29, 2020
