Minor refactor of split evaluation in gpu_hist #3889
Conversation
@hcho3 I am having issues with the Jenkins machines, do you know what's going on?
@RAMitchell Somehow auto scaling isn't working. For now, I manually provisioned workers.
@RAMitchell Can you do me a favor and write a 1-2 sentence summary of what's being done here?
Thanks, added a description above. I will try to be a little more verbose in future.
src/tree/updater_gpu_hist.cu (outdated)
  }

- void InitRoot(RegTree* p_tree) {
+ void InitRoot(DMatrix* p_fmat, RegTree* p_tree) {
Currently p_fmat is not used in InitRoot. If it will be used in the future then it's fine.
 */
void BuildHistWithSubtractionTrick(int nidx_parent, int nidx_left,
                                   int nidx_right) {
  auto smallest_nidx =
Nice, now we have better naming.
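For context, here is a minimal host-side sketch of the subtraction trick being renamed here; the GradPair and Histogram types are illustrative stand-ins, not xgboost's actual GPU data structures. A node's histogram is the elementwise sum of its children's, so the larger child's histogram can be derived from the parent's instead of being rebuilt:

#include <cstddef>
#include <vector>

// Illustrative stand-in for GradientPairSumT.
struct GradPair {
  float grad;
  float hess;
};

using Histogram = std::vector<GradPair>;

// Histograms are additive over child nodes, so the child that was not
// built directly can be computed as parent minus the (smaller) sibling.
Histogram SubtractionTrick(const Histogram& parent,
                           const Histogram& smallest_child) {
  Histogram other(parent.size());
  for (std::size_t i = 0; i < parent.size(); ++i) {
    other[i].grad = parent[i].grad - smallest_child[i].grad;
    other[i].hess = parent[i].hess - smallest_child[i].hess;
  }
  return other;
}

Building only the smaller child's histogram and deriving the sibling this way roughly halves the histogram construction work per tree level.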
  dh::safe_cuda(cudaSetDevice(device_id_));
  auto d_split_candidates =
      temp_memory.GetSpan<DeviceSplitCandidate>(feature_set.Size());
  DeviceNodeStats node(node_sum_gradients[nidx], nidx, param);
  feature_set.Reshard(GPUSet::Range(device_id_, 1));
I assume in the future there will be one feature_set for each device and the correspondence won't change; otherwise Reshard will get us into trouble.
Yes, we will have to deal with this in the next PR. I am wondering if we can reshard it such that a copy exists on each device.
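A minimal sketch of that per-device-copy idea, using plain host containers instead of xgboost's HostDeviceVector; the PerDeviceFeatureSets name and its interface are hypothetical:

#include <vector>

// Hypothetical container keeping an independent copy of the sampled
// feature set for every device, so no Reshard call is needed when
// several devices evaluate splits concurrently.
struct PerDeviceFeatureSets {
  std::vector<std::vector<int>> copies;  // one feature list per device

  PerDeviceFeatureSets(int n_devices, const std::vector<int>& features)
      : copies(n_devices, features) {}

  const std::vector<int>& ForDevice(int device_id) const {
    return copies[device_id];
  }
};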
                        TempStorageT* temp_storage) {
  __shared__ cub::Uninitialized<GradientPairSumT> uninitialized_sum;
  GradientPairSumT& shared_sum = uninitialized_sum.Alias();

  GradientPairSumT local_sum = GradientPairSumT();
  // For loop sums features into one block size
  auto begin = feature_histogram.data();
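For readers unfamiliar with this pattern, here is a self-contained CUDA sketch of the same idea, with an illustrative GradPair type standing in for GradientPairSumT: each thread accumulates a strided partial sum over the histogram bins, then a block-wide reduction combines the partials.

#include <cub/cub.cuh>

// Illustrative stand-in for GradientPairSumT.
struct GradPair {
  float grad;
  float hess;
};

__host__ __device__ inline GradPair operator+(GradPair a, GradPair b) {
  return GradPair{a.grad + b.grad, a.hess + b.hess};
}

// One thread block sums an n-bin feature histogram into a single pair.
template <int kBlockThreads>
__global__ void SumHistogram(const GradPair* histogram, int n,
                             GradPair* out) {
  using BlockReduceT = cub::BlockReduce<GradPair, kBlockThreads>;
  __shared__ typename BlockReduceT::TempStorage temp_storage;

  GradPair local{0.f, 0.f};  // per-thread partial sum over strided bins
  for (int i = threadIdx.x; i < n; i += kBlockThreads) {
    local = local + histogram[i];
  }
  // Block-wide reduction; the result is valid in thread 0 only.
  GradPair total = BlockReduceT(temp_storage).Reduce(local, cub::Sum());
  if (threadIdx.x == 0) {
    *out = total;
  }
}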
Actually we might be able to use non-pointer iteration. This line (Line 144 in be0bb7d):

SPAN_CHECK(index_ != span_->size());

could change to use < instead, and we could remove the check in this function (Line 177 in be0bb7d):

XGBOOST_DEVICE SpanIterator& operator+=(difference_type n) {

The error report will be delayed until the * operator or -> operator is applied, but it will provide us some flexibility. WDYT?
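A minimal sketch of that deferred-check idea, using a simplified iterator rather than xgboost's actual SpanIterator: operator+= only asserts that the index stays within [0, size], while the strict bounds check moves to dereference.

#include <cassert>
#include <cstddef>

// Simplified checked iterator: forming a one-past-the-end iterator is
// legal; the hard bounds check fires only on * or ->.
template <typename T>
class CheckedIterator {
 public:
  CheckedIterator(T* data, std::size_t size, std::size_t index)
      : data_(data), size_(size), index_(index) {}

  CheckedIterator& operator+=(std::ptrdiff_t n) {
    index_ += static_cast<std::size_t>(n);
    assert(index_ <= size_);  // relaxed: the end position is allowed
    return *this;
  }

  T& operator*() const {
    assert(index_ < size_);   // strict check deferred to dereference
    return data_[index_];
  }

  T* operator->() const {
    assert(index_ < size_);
    return data_ + index_;
  }

 private:
  T* data_;
  std::size_t size_;
  std::size_t index_;
};

Advancing to the end position and comparing against it then works without tripping the check; the trade-off, as noted, is that errors surface later.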
Hello @RAMitchell, just curious: will the future PR implement the PV-Tree described here?
@thvasilo Yes, something like that, perhaps not exactly.
Force-pushed from d1bb588 to cc9a68c.
@RAMitchell That's cool, I'm interested in seeing how it turns out. The PV-Tree had issues with scaling, and its accuracy experiments were unfortunately limited. Doing away completely with histogram communication still seems like a big challenge.
@thvasilo I would be interested if you have done any experiments suggesting PV-Tree is not accurate enough. My intuition was that occasionally choosing slightly suboptimal features would not affect convergence much.
@RAMitchell I'm mostly referring to the quality of the evaluation in the paper. Only two closed-source datasets were used, and Figures 1-3 show deltas in NDCG and AUC on the order of 10^-2 to 10^-3. Whether that magnitude of difference matters is up to the user/debatable. For scalability, see Section 5.2.1; Figure 2 is hard to interpret, and a strong/weak scaling linear plot would have made more sense. In any case, just food for thought.
Move split evaluation logic into each device instead of having this logic "globally". This is in preparation for a future PR where each device will propose splits based on its subset of the data, each device will vote, and then the algorithm will globally calculate the optimal split for the selected feature.
Use the span class in split evaluation for improved memory safety.
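A rough host-side sketch of how the proposed voting scheme might look; every name here is hypothetical, and the real implementation would run across GPUs with candidates gathered from each device:

#include <algorithm>
#include <limits>
#include <map>
#include <vector>

// Hypothetical per-device split proposal.
struct SplitCandidate {
  int feature = -1;
  float gain = -std::numeric_limits<float>::infinity();
};

// Each device proposes its best split; a vote picks the feature, then the
// winning feature's candidates are reduced to one global split.
SplitCandidate GlobalBestSplit(const std::vector<SplitCandidate>& proposals) {
  SplitCandidate best;
  if (proposals.empty()) return best;

  // Vote: pick the feature proposed by the most devices.
  std::map<int, int> votes;
  for (const auto& p : proposals) ++votes[p.feature];
  int chosen = std::max_element(votes.begin(), votes.end(),
                                [](const auto& a, const auto& b) {
                                  return a.second < b.second;
                                })->first;

  // Reduce: among proposals for the chosen feature, keep the best gain.
  for (const auto& p : proposals) {
    if (p.feature == chosen && p.gain > best.gain) best = p;
  }
  return best;
}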