
Distributed Fast Histogram Algorithm #4011

Merged: 34 commits into dmlc:master from dist_fast_histogram, Feb 5, 2019

Conversation

CodingCat
Member

Basically, we need to remove several implementation assumptions in the fast histogram algorithm.

@hcho3 please help review.

I will test with our internal dataset

@CodingCat changed the title from "Dist fast histogram" to "Distributed Fast Histogram Algorithm" on Dec 19, 2018
@codecov-io

codecov-io commented Dec 30, 2018

Codecov Report

Merging #4011 into master will increase coverage by 3.49%.
The diff coverage is 34.21%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #4011      +/-   ##
============================================
+ Coverage     57.24%   60.73%   +3.49%     
============================================
  Files           190      130      -60     
  Lines         15045    11718    -3327     
  Branches        527        0     -527     
============================================
- Hits           8612     7117    -1495     
+ Misses         6176     4601    -1575     
+ Partials        257        0     -257
Impacted Files                       Coverage Δ                  Complexity Δ
src/learner.cc                       25.96% <0%> (-0.22%)        0 <0> (ø)
src/tree/updater_histmaker.cc        2.91% <0%> (+0.01%)         0 <0> (ø) ⬇️
src/tree/updater_refresh.cc          98.76% <100%> (ø)           0 <0> (ø) ⬇️
src/common/hist_util.cc              42.9% <100%> (+0.17%)       0 <0> (ø) ⬇️
src/tree/updater_quantile_hist.h     48% <100%> (+2.16%)         0 <0> (ø) ⬇️
src/tree/updater_quantile_hist.cc    34.22% <38.88%> (-0.09%)    0 <0> (ø)
src/common/hist_util.h               78.84% <50%> (-2.41%)       0 <0> (ø)
src/linear/updater_shotgun.cc        91.07% <0%> (-2.6%)         0% <0%> (ø)
src/linear/updater_coordinate.cc     100% <0%> (ø)               0% <0%> (ø) ⬇️
tests/cpp/test_learner.cc            100% <0%> (ø)               0% <0%> (ø) ⬇️
... and 69 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 84c99f8...5af2a6a.

@CodingCat
Member Author

ping? @RAMitchell @hcho3 @yanboliang

@RAMitchell
Member

Native code looks good! Should there be a test here or are the Java tests enough?

@hcho3
Collaborator

hcho3 commented Jan 4, 2019

@CodingCat I'm now back from winter vacation. I'll review once #3957 is merged.

@CodingCat
Member Author

I am testing with our internal dataset. While we get a 1.5-2X speedup, I found that the training accuracy of fast histogram is a bit lower than approx.

Has anyone seen the same thing before?

@CodingCat
Member Author

And when we set colsample_bytree, fast histogram is even slower than approx; I am investigating whether everything is fine in that part.

@CodingCat CodingCat force-pushed the dist_fast_histogram branch 2 times, most recently from aee9bfc to 5fc66db on January 6, 2019 23:43
@troszok

troszok commented Jan 7, 2019

I am testing with our internal dataset. While we get a 1.5-2X speedup, I found that the training accuracy of fast histogram is a bit lower than approx. Has anyone seen the same thing before?

Hi @CodingCat, I would be very happy to do some tests on our datasets. Is it possible to grab the packages for this PR from somewhere, or should I recompile everything on my own?

@CodingCat
Member Author

Thanks, @troszok.

You can fetch the version with distributed fast histogram support using the approach in https://xgboost.readthedocs.io/en/latest/jvm/xgboost4j_spark_tutorial.html#refer-to-xgboost4j-spark-dependency (search for "XGBoost4J-Spark Snapshot Repo").

The version number is 0.82-SNAPSHOT.

@CodingCat CodingCat force-pushed the dist_fast_histogram branch from 5fc66db to 853a758 on January 9, 2019 06:16
@CodingCat
Member Author

@troszok any update on your accuracy testing?

@troszok

troszok commented Jan 10, 2019

@troszok any update on your accuracy testing?

Hi @CodingCat,
I just updated the code to fetch the 0.82-SNAPSHOT and it seems to work. I will do some testing over the next couple of days and let you know.

@CodingCat
Member Author

@hcho3 ping for review?

std::sort(new_features.begin(), new_features.end());

new_features.resize(static_cast<unsigned long>(n));
// std::sort(new_features.begin(), new_features.end());
Contributor

I'm a bit confused by the old code here: was there any reason to sort just after we had shuffled the features?

Also, could you explain why we need the ser/deser compared to what was happening before?
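For context, here is a minimal sketch of shuffle-based column sampling, under stated assumptions: this is not the PR's exact code, and SampleFeatures, colsample, and shared_seed are illustrative names. The resize keeps the first n shuffled feature ids, and a trailing sort would only restore deterministic ordering. In distributed mode every worker must end up with the same feature subset, either by seeding the RNG identically on all workers or by serializing and broadcasting the sampled list from one worker; the latter is presumably where the ser/deser comes in.

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical sketch of shuffle-based column sampling: keep a fraction
// `colsample` of `num_features`. All distributed workers must use the same
// seed (or receive the serialized result from one worker) so they evaluate
// splits over an identical feature subset.
std::vector<int> SampleFeatures(int num_features, float colsample,
                                std::uint64_t shared_seed) {
  std::vector<int> features(num_features);
  std::iota(features.begin(), features.end(), 0);  // 0, 1, ..., num_features-1
  std::mt19937_64 rng(shared_seed);                // identical on every worker
  std::shuffle(features.begin(), features.end(), rng);
  auto n = std::max<std::size_t>(
      1, static_cast<std::size_t>(colsample * num_features));
  features.resize(n);                           // keep the first n shuffled ids
  std::sort(features.begin(), features.end());  // optional: deterministic order
  return features;
}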

@thvasilo
Contributor

@Liuhaoge this is a discussion about a new upcoming feature in the codebase; asking questions here will not get you anywhere and diverts from the main topic.

Once the feature has been merged, I'd recommend asking usage questions on the discussion board; we try to use GitHub for bug reports and development discussions.

@trivialfis
Member

@thvasilo Actually, it might be reasonable to include some documentation about the added features in this PR. :) Our documentation is not very thorough.

@thvasilo
Contributor

@trivialfis Good point. We had that as a requirement for merging PRs that introduce new features in other codebases I've worked on.

@Liuhaoge

@CodingCat Did you establish an interface to use monotonic constraints in Scala? How can it be used in a distributed environment?

@thvasilo
Contributor

thvasilo commented Jan 24, 2019

Hello @CodingCat, I tried running an experiment today, creating a local cluster with 6 workers.

Using the approx tree method works as expected, but when I try running the same job using the hist method, no training occurs.

My configuration file (higgs.conf) looks like this:

data=/path/to/data
num_rounds=5
tree_method=hist # Changing this to approx works fine
verbosity=2
eval_train=1

And I use this to run the command:
./dmlc-core/tracker/dmlc-submit --cluster=local --num-workers=6 xgboost higgs.conf

Looking at the output, it seems like max_depth is set to 0, even when I explicitly set it in the configuration file:

[13:54:44] INFO: /home/tvas/xgboost-origin/src/learner.cc:215: Tree method is selected to be 'hist', which uses a single updater grow_quantile_histmaker.
[13:54:44] INFO: /home/tvas/xgboost-origin/src/cli_main.cc:198: Loading data: 6.30702 sec
[13:54:44] INFO: /home/tvas/xgboost-origin/src/cli_main.cc:205: boosting round 0, 2.38419e-07 sec elapsed
[13:54:49] INFO: /home/tvas/xgboost-origin/src/tree/updater_quantile_hist.cc:64: Generating gmat: 4.83453 sec
[13:54:50] INFO: /home/tvas/xgboost-origin/src/tree/updater_prune.cc:74: tree pruning end, 1 roots, 0 extra nodes, 0 pruned nodes, max_depth=0
[13:54:50] INFO: /home/tvas/xgboost-origin/src/tree/updater_quantile_hist.cc:216: 
InitData:          0.0142 ( 4.00%)
InitNewNode:       0.0000 ( 0.00%)
BuildHist:         0.2409 (67.94%)
EvaluateSplit:     0.0182 ( 5.15%)
ApplySplit:        0.0000 ( 0.00%)
========================================
Total:             0.3546
[13:54:50] INFO: /home/tvas/xgboost-origin/src/cli_main.cc:205: boosting round 1, 5.67189 sec elapsed
2019-01-24 13:54:50,530 INFO [13:54:50] [0]     train-rmse:0.500000

Edit: I guess this is related to #4078; maybe I missed something there.

@CodingCat CodingCat force-pushed the dist_fast_histogram branch from d42ac2c to d3b312a on January 30, 2019 17:50
@CodingCat
Member Author

Rebased to run with the fixed Travis CI.

@CodingCat
Member Author

@trivialfis the doc is updated.

@hcho3
Collaborator

hcho3 commented Jan 31, 2019

@RAMitchell @trivialfis I'm seeing a memory error in the GPU test. Any idea why?

tests/python-gpu/test_gpu_linear.py::TestGPULinear::test_gpu_coordinate Training on dataset: Boston
Using parameters: {'n_gpus': -1, 'eval_metric': 'rmse', 'objective': 'reg:linear', 'nthread': 2, 'coordinate_selection': 'cyclic', 'eta': 0.5, 'updater': 'coord_descent', 'top_k': 10, 'alpha': 0.005, 'lambda': 0.005, 'tolerance': 1e-05, 'booster': 'gblinear'}
Training on dataset: Digits
Using parameters: {'n_gpus': -1, 'num_class': 10, 'eval_metric': 'merror', 'objective': 'multi:softmax', 'nthread': 2, 'coordinate_selection': 'cyclic', 'eta': 0.5, 'updater': 'coord_descent', 'top_k': 10, 'alpha': 0.005, 'lambda': 0.005, 'tolerance': 1e-05, 'booster': 'gblinear'}

terminate called after throwing an instance of 'dmlc::Error'
  what():  [23:09:37] /workspace/include/xgboost/./../../src/common/common.h:41: /workspace/src/common/host_device_vector.cu: 140: an illegal memory access was encountered

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(dmlc::StackTrace(unsigned long)+0x47) [0x7f8a3d612717]
[bt] (1) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x1d) [0x7f8a3d612b7d]
[bt] (2) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(dh::ThrowOnCudaError(cudaError, char const*, int)+0x123) [0x7f8a3d7df8b3]
[bt] (3) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::HostDeviceVectorImpl<int>::DeviceShard::LazySyncDevice(xgboost::GPUAccess)+0x153) [0x7f8a3d840713]
[bt] (4) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::HostDeviceVectorImpl<int>::LazySyncDevice(int, xgboost::GPUAccess)+0xd3) [0x7f8a3d840e53]
[bt] (5) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::HostDeviceVectorImpl<int>::DeviceSpan(int)+0x5f) [0x7f8a3d840fcf]
[bt] (6) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::HostDeviceVector<int>::DeviceSpan(int)+0xc) [0x7f8a3d8411ac]
[bt] (7) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(+0x30ee27) [0x7f8a3d7f1e27]
[bt] (8) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::obj::SoftmaxMultiClassObj::GetGradient(xgboost::HostDeviceVector<float> const&, xgboost::MetaInfo const&, int, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*)+0x849) [0x7f8a3d7f51c9]
[bt] (9) /home/ubuntu/.local/lib/python2.7/site-packages/xgboost-0.81-py2.7.egg/xgboost/./lib/libxgboost.so(xgboost::LearnerImpl::UpdateOneIter(int, xgboost::DMatrix*)+0x372) [0x7f8a3d6eb052]

@CodingCat
Member Author

CodingCat commented Jan 31, 2019

@hcho3 do you have time to review?

There is a follow-up PR based on this one to separate depthwise and lossguide.

@hcho3
Collaborator

hcho3 commented Jan 31, 2019

@CodingCat Okay, I'll take a quick look today

@CodingCat
Member Author

@hcho3 any update?

@hcho3
Collaborator

hcho3 left a comment

@CodingCat LGTM, as far as I'm aware. I really appreciate your work updating the fast hist algorithm.

@@ -105,6 +105,7 @@ class QuantileHistMaker: public TreeUpdater {
} else {
hist_builder_.BuildHist(gpair, row_indices, gmat, hist);
}
this->histred_.Allreduce(hist.begin, hist_builder_.GetNumBins());
Collaborator
I'm pleasantly surprised that the distributed implementation is this succinct.

Member Author
Yes, the allreduce interface is easy to use; we only need to take some additional care with the subtraction trick in distributed mode.
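For readers following the thread, here is a minimal sketch of how the allreduce and the subtraction trick fit together. It is illustrative only: GradStats, AllreduceSum, and BuildSiblingHist are simplified stand-ins, while the real code calls a rabit reducer via this->histred_.Allreduce as shown in the diff above.

#include <cstddef>
#include <vector>

// Simplified per-bin gradient statistics (stand-in for XGBoost's GradStats).
struct GradStats {
  double sum_grad = 0.0;
  double sum_hess = 0.0;
};

// Placeholder for an element-wise sum-allreduce over histogram bins; in the
// real code this is a rabit reducer. A no-op here so the sketch compiles as
// a single process.
void AllreduceSum(GradStats* /*bins*/, std::size_t /*num_bins*/) {}

// The subtraction trick in distributed mode: build the smaller child's
// histogram locally, allreduce it so every worker holds the global sums,
// then derive the sibling as parent - child.
void BuildSiblingHist(const std::vector<GradStats>& parent,
                      std::vector<GradStats>* child,
                      std::vector<GradStats>* sibling) {
  // Allreduce first: the locally built child histogram covers only this
  // worker's rows, while the parent histogram is already global.
  AllreduceSum(child->data(), child->size());
  sibling->resize(parent.size());
  for (std::size_t i = 0; i < parent.size(); ++i) {
    (*sibling)[i].sum_grad = parent[i].sum_grad - (*child)[i].sum_grad;
    (*sibling)[i].sum_hess = parent[i].sum_hess - (*child)[i].sum_hess;
  }
}

The ordering is the "additional care": subtracting before the allreduce would mix a local child histogram with a global parent histogram and give wrong sibling statistics.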

@CodingCat CodingCat merged commit ae3bb9c into dmlc:master Feb 5, 2019
@lock lock bot locked as resolved and limited conversation to collaborators May 6, 2019