IBUG: Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees

IBUG is a simple wrapper that turns any gradient-boosted regression tree (GBRT) model into a probabilistic estimator. It is compatible with all major GBRT frameworks, including LightGBM, XGBoost, CatBoost, and scikit-learn.

Install

pip install ibug

Quickstart

from ibug import IBUGWrapper
from xgboost import XGBRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# load diabetes dataset
data = load_diabetes()
X, y = data['data'], data['target']

# create train/val/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, random_state=1)

# train GBRT model
model = XGBRegressor().fit(X_train, y_train)

# extend GBRT model into a probabilistic estimator
prob_model = IBUGWrapper().fit(model, X_train, y_train, X_val=X_val, y_val=y_val)

# predict mean and variance for unseen instances
location, scale = prob_model.pred_dist(X_test)

# return k highest-affinity neighbors for more flexible posterior modeling
location, scale, train_idxs, train_vals = prob_model.pred_dist(X_test, return_kneighbors=True)
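As a rough illustration of what the outputs above can be used for, the sketch below turns `location` and `scale` into central prediction intervals, and builds a distribution-free alternative from the neighbor targets. It is not part of IBUG: the helper names `normal_interval`, `neighbor_interval`, and `coverage` are hypothetical, and it assumes that `scale` is a standard deviation of a normal predictive distribution and that `train_vals` has shape `(n_test, k)`. Check both assumptions against the library before relying on the results.

```python
from statistics import NormalDist

import numpy as np


def normal_interval(location, scale, level=0.95):
    """Central interval under a normal predictive distribution.

    Assumption: `scale` is a standard deviation, not a variance.
    """
    location = np.asarray(location, dtype=float)
    scale = np.asarray(scale, dtype=float)
    # Two-sided critical value, e.g. about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return location - z * scale, location + z * scale


def neighbor_interval(train_vals, level=0.95):
    """Empirical interval taken from each row of neighbor targets.

    Assumption: `train_vals` has shape (n_test, k), one row per test instance.
    """
    train_vals = np.asarray(train_vals, dtype=float)
    alpha = (1.0 - level) / 2.0
    lower = np.quantile(train_vals, alpha, axis=1)
    upper = np.quantile(train_vals, 1.0 - alpha, axis=1)
    return lower, upper


def coverage(y_true, lower, upper):
    """Fraction of targets that fall inside their interval."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))
```

With the Quickstart variables, `lower, upper = normal_interval(location, scale)` followed by `coverage(y_test, lower, upper)` gives a quick check of how close the empirical coverage is to the nominal level. The neighbor-based variant avoids the normality assumption, at the cost of needing enough neighbors for stable quantiles.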

License

Apache License 2.0.

Reference

Brophy and Lowd. Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees. NeurIPS 2022.

@inproceedings{brophy2022ibug,
  title={Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees},
  author={Brophy, Jonathan and Lowd, Daniel},
  booktitle={International Conference on Neural Information Processing Systems},
  year={2022}
}