XGBoost gpu_hist running slower than hist (on Higgs dataset and benchmark_tree.py) #5888
Comments
9 minutes means something went really wrong; I'm expecting around 9 seconds. Could you provide a complete script for running on Higgs?
I see that you are using the CLI, will try to reproduce.
Just a note that the K80 is an older architecture; support may be removed soon.
If I want to do a fair comparison between gpu_hist and hist, what would be a good way to do it? How many threads should hist be given, and how many GPUs for gpu_hist? Or, if I want to do a single-GPU gpu_hist run, what should the setup be for CPU hist?
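One reasonable setup for such a comparison is to give hist one thread per physical core and gpu_hist a single GPU, keeping all other parameters identical. A minimal sketch using the Python package (the synthetic data shape and the core count are placeholders, not values from this report):

```python
import time
import numpy as np
import xgboost as xgb

# Synthetic placeholder data; substitute the real Higgs DMatrix for the actual test.
X = np.random.rand(1_000_000, 28).astype(np.float32)
y = np.random.randint(2, size=X.shape[0])
dtrain = xgb.DMatrix(X, label=y)

common = {
    "max_depth": 5,
    "learning_rate": 0.1,
    "objective": "binary:logistic",
}

for params in (
    {**common, "tree_method": "hist", "nthread": 16},    # one thread per physical core
    {**common, "tree_method": "gpu_hist", "gpu_id": 0},  # a single GPU
):
    start = time.time()
    xgb.train(params, dtrain, num_boost_round=50)
    print(params["tree_method"], time.time() - start, "seconds")
```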
Is it the case that, since gpu_hist is designed to work on multiple GPUs where each GPU processes a subset of the training instances, on a single GPU it divides the data into chunks and processes them sequentially? And is that why the expected speedup from gpu_hist is not observed on a single GPU?
No, see my perf numbers in #5926; that's what I'm expecting. Not sure about the issue here.
I can reproduce the issue on the CLI. Will investigate.
@trivialfis Did you have a chance to look at this? If not, can you tell me how you reproduced the issue?
No longer able to reproduce with Higgs.
Hi community,
I was running GPU experiments with the XGBoost C++ binary (v1.1.0) on a single NVIDIA Tesla K80 GPU. I performed two experiments:
Using tree method hist (default nthread):
command: xgboost empty.conf tree_method=hist booster=gbtree task=train 'train_path=data/higgs/train.csv?format=csv&label_column=0' num_round=50 max_depth=5 learning_rate=0.1 min_child_weight=1.0 reg_alpha=0 reg_lambda=0 min_split_loss=0 objective=binary:logistic model_out=higgs/cpu_latest.model
Boosting rounds (snippet of 10 rounds):
[21:31:00] [0]
[21:31:00] [1]
[21:31:01] [2]
[21:31:01] [3]
[21:31:02] [4]
[21:31:02] [5]
[21:31:03] [6]
[21:31:03] [7]
[21:31:03] [8]
[21:31:04] [9]
[21:31:04] [10]
Total time for 50 boosting rounds = 21 seconds
Using tree method gpu_hist (default nthread):
command: xgboost empty.conf tree_method=gpu_hist booster=gbtree task=train 'train_path=data/higgs/train.csv?format=csv&label_column=0' num_round=50 max_depth=3 learning_rate=0.1 min_child_weight=1.0 reg_alpha=0 reg_lambda=0 min_split_loss=0 objective=binary:logistic model_out=higgs/gpu_latest.model
Boosting rounds (snippet of 10 rounds):
[21:32:09] [0]
[21:32:23] [1]
[21:32:36] [2]
[21:32:49] [3]
[21:33:03] [4]
[21:33:16] [5]
[21:33:30] [6]
[21:33:43] [7]
[21:33:56] [8]
[21:34:11] [9]
[21:34:24] [10]
Total time for 50 boosting rounds = 9 mins 43 sec
As you can see, hist takes less time per boosting round than gpu_hist.
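For reference, a Python equivalent of the CLI runs above might look like the sketch below. It assumes the same CSV layout with the label in column 0; note also that the two commands above use different max_depth values (5 and 3), so the depths should be matched for a like-for-like timing.

```python
import time
import xgboost as xgb

# Same CSV as the CLI runs: label in the first column.
dtrain = xgb.DMatrix("data/higgs/train.csv?format=csv&label_column=0")

params = {
    "tree_method": "gpu_hist",   # switch to "hist" for the CPU run
    "max_depth": 5,
    "learning_rate": 0.1,
    "min_child_weight": 1.0,
    "reg_alpha": 0,
    "reg_lambda": 0,
    "min_split_loss": 0,
    "objective": "binary:logistic",
}

start = time.time()
booster = xgb.train(params, dtrain, num_boost_round=50)
print("total training time:", time.time() - start, "seconds")
booster.save_model("higgs/gpu_latest.model")
```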
However, when running benchmark_tree.py:
hist: 20 boosting rounds: Train Time: 31.67970037460327 seconds
gpu_hist: 20 boosting rounds: Train Time: 3.3874778747558594 seconds
Here the gpu_hist tree method runs faster.
I wanted to know why the hist method is faster on the Higgs dataset while gpu_hist is faster on the benchmark script. Is the GPU being under-utilised in some way, causing each boosting round to take longer?
I checked the gpu_hist paper; in its comparison on the Higgs dataset, gpu_hist performs faster than hist. But that comparison is between the hist method on 64 CPU cores and gpu_hist on 8 GPUs. So is my experiment slower because it runs on a single GPU?
Also, this issue might be linked to #3315.