
Re-introduce double buffer in UpdatePosition, to fix perf regression in gpu_hist #6757

Merged: 3 commits merged into dmlc:master on Mar 18, 2021

Conversation

@hcho3 (Collaborator) commented on Mar 17, 2021:

Closes #6552 by partially reverting commit f779980.

Review comment on removed lines 162-163:
dh::TemporaryArray<bst_node_t> position_temp(position_a_.size());
dh::TemporaryArray<RowIndexT> ridx_temp(ridx_a_.size());
@hcho3 (Collaborator, Author) commented:

I suspect building TemporaryArray at every invocation of UpdatePosition gets expensive in the particular example provided.
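
For context, here is a minimal host-side sketch of the two allocation patterns discussed above, with plain std::vector standing in for device memory. It only illustrates the idea, not XGBoost's actual RowPartitioner code; the class and method names (Partitioner, UpdatePositionNaive, UpdatePositionDoubleBuffered, Scatter) are hypothetical.

#include <cstddef>
#include <cstdint>
#include <vector>

using RowIndexT = std::uint32_t;
using bst_node_t = std::int32_t;  // stand-in for the type used in the diff above

class Partitioner {
 public:
  explicit Partitioner(std::size_t n_rows)
      : position_a_(n_rows), position_b_(n_rows),
        ridx_a_(n_rows), ridx_b_(n_rows) {}

  // Pattern removed by f779980: a fresh temporary buffer is built on every
  // call, so each UpdatePosition pays for two allocations.
  void UpdatePositionNaive() {
    std::vector<bst_node_t> position_temp(position_a_.size());
    std::vector<RowIndexT> ridx_temp(ridx_a_.size());
    Scatter(position_a_, ridx_a_, &position_temp, &ridx_temp);
    position_a_.swap(position_temp);
    ridx_a_.swap(ridx_temp);
  }

  // Double-buffer pattern this PR re-introduces: the second buffer is
  // allocated once up front, and each call only scatters and swaps.
  void UpdatePositionDoubleBuffered() {
    Scatter(position_a_, ridx_a_, &position_b_, &ridx_b_);
    position_a_.swap(position_b_);
    ridx_a_.swap(ridx_b_);
  }

 private:
  // Placeholder for the real partitioning kernel, which would write the
  // reordered node positions and row indices into the output buffers.
  static void Scatter(const std::vector<bst_node_t>& pos_in,
                      const std::vector<RowIndexT>& ridx_in,
                      std::vector<bst_node_t>* pos_out,
                      std::vector<RowIndexT>* ridx_out) {
    *pos_out = pos_in;
    *ridx_out = ridx_in;
  }

  std::vector<bst_node_t> position_a_, position_b_;
  std::vector<RowIndexT> ridx_a_, ridx_b_;
};

The naive variant does the same work per call plus two buffer allocations, which is why the per-invocation cost can dominate when UpdatePosition is called very frequently.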

@trivialfis (Member) left a comment:

Could you please share some benchmark results?

@hcho3 (Collaborator, Author) commented on Mar 17, 2021:

@trivialfis The example posted in #6552 won't complete in a reasonable amount of time with the latest master (or with 1.3.0). I can leave it running overnight and see if it finishes by tomorrow.

On the other hand, here's the result with the proposed fix:

[0]     convergence-merror:0.10066
[1]     convergence-merror:0.10053
[2]     convergence-merror:0.09998
[3]     convergence-merror:0.09928
[4]     convergence-merror:0.09851
[5]     convergence-merror:0.09770
[6]     convergence-merror:0.09689
[7]     convergence-merror:0.09601
[8]     convergence-merror:0.09517
[9]     convergence-merror:0.09438
Time elapsed = 673.7375219200039 s

@trivialfis (Member) commented on Mar 17, 2021:

Thanks for sharing. What about a more conventional dataset?

@trivialfis (Member) commented:

> I can leave it running overnight and see if it finishes by tomorrow.

Nah, no need. I'm just worried that there might be a regression on other types of data.

@hcho3 (Collaborator, Author) commented on Mar 17, 2021:

> What about a more conventional dataset?

Any suggestions? Should I try gbm-bench?

Review threads on src/tree/gpu_hist/row_partitioner.cu and src/tree/gpu_hist/row_partitioner.cuh (outdated) were resolved.
@trivialfis (Member) commented:

> Any suggestions? Should I try gbm-bench?

That would be great!

@trivialfis (Member) left a comment:

Looks good as long as there's no regression on other datasets.

@hcho3 (Collaborator, Author) commented on Mar 18, 2021:

@trivialfis Here are the benchmark results on gbm-bench. I do not see any performance degradation:

Dataset    Runtime before (s)    Runtime after (s)
airline    40.44                 40.57
bosch      2.77                  2.77
covtype    4.37                  4.38
epsilon    16.12                 16.19
fraud      0.44                  0.41
higgs      6.50                  6.53
year       0.75                  0.72

@hcho3 merged commit 4230dcb into dmlc:master on Mar 18, 2021.
@hcho3 deleted the use_double_buffer branch on Mar 18, 2021.
Linked issue (may be closed by this pull request): Training behaviour difference between v1.1.0 and v1.3.1