
[tests][dask] Add voting_parallel algorithm in tests (fixes #3834) #4088

Merged: 10 commits into microsoft:master, Apr 1, 2021

Conversation

@jmoralez (Collaborator) commented Mar 20, 2021

This adds the voting_parallel tree_learner to test_regressor, test_classifier, and test_ranker in the tests for the Dask module, and removes the warning about experimental support, which was previously triggered because this learner wasn't tested.
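For context, covering another distributed learner in these tests is mostly a matter of the `tree_learner` parameter. A minimal sketch, assuming the parameter values listed in the LightGBM docs; `make_params` and the specific `n_estimators`/`num_leaves` values are illustrative, not the PR's actual code:

```python
# tree_learner values per the LightGBM parameter docs; "voting_parallel"
# is the documented alias for the voting tree learner this PR now tests.
VALID_TREE_LEARNERS = {"serial", "feature_parallel", "data_parallel", "voting_parallel"}


def make_params(tree_learner):
    """Build a small parameter dict like the ones the Dask tests pass to estimators."""
    if tree_learner not in VALID_TREE_LEARNERS:
        raise ValueError(f"unknown tree_learner: {tree_learner}")
    return {"tree_learner": tree_learner, "n_estimators": 10, "num_leaves": 15}


params = make_params("voting_parallel")
```

Such a dict would then be passed to the Dask estimators under test, e.g. `lgb.DaskLGBMRegressor(**params)`.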

@jameslamb (Collaborator) left a comment:

Thanks for doing this! I'm glad it was as easy as just changing the tests. I have a few suggested changes for the tests.

(two review comments on tests/python_package_test/test_dask.py, resolved)
@jmoralez (Collaborator, Author) commented:
Hi, James. Do you have any suggestions on what to work on next?

@jmoralez (Collaborator, Author) commented:
I noticed a failing test that I believe is due to graphviz. This time it failed in Linux_latest_gpu_source. Here are the errors:

Warning: Could not load "/home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/graphviz/libgvplugin_pango.so.6" - It was found, so perhaps one of its dependents was not.  Try ldd.
Warning: Could not load "/home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/graphviz/libgvplugin_gd.so.6" - It was found, so perhaps one of its dependents was not.  Try ldd.

I believe this test was also failing on some builds before I merged the latest master.
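As the warning text advises, the usual next step is to inspect the plugin's shared-library dependencies with `ldd`. A hedged sketch; the path is copied from the CI log above and will not exist on most machines:

```shell
# Any ldd output line containing "not found" names the missing dependent
# that the graphviz "Could not load" warning refers to.
PLUGIN=/home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/graphviz/libgvplugin_pango.so.6
if [ -f "$PLUGIN" ]; then
    ldd "$PLUGIN" | grep "not found" || echo "no missing dependencies"
else
    echo "plugin not present on this machine"
fi
```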

The other one is related to #4095 (comment).

@jameslamb (Collaborator) commented:
> Hi, James. Do you have any suggestions on what to work on next?

Are you mostly interested in Dask? If so, #3896 would be a good issue to pick up next. If not, let me know and I can recommend something else.

@jameslamb jameslamb self-requested a review March 28, 2021 04:21
@jameslamb (Collaborator) left a comment:
Thanks for making that most recent round of changes. I don't have any other suggestions. Could you please update to the latest master?

@StrikerRUS (Collaborator) left a comment:
Great to know that voting parallel works without any modifications!
Please consider checking a few of my minor comments below:

(three review comments on tests/python_package_test/test_dask.py, resolved)
@jmoralez (Collaborator, Author) commented Mar 31, 2021
@jameslamb do you know what's the advantage of using the client fixture (apart from avoiding the client construction)? I just tried running test_regressor with the client initialized at the top of the module, and the time goes from 137s down to 98s.
Here's the profile with the fixture:
(screenshot: profile with the fixture, taken 2021-03-30)
And without:
(screenshot: profile without the fixture)
So with the fixture about 70% of the time is spent setting up and tearing down the cluster; without it only about 60% is spent there, and the run seems to be faster.

Follow-up
I've been investigating more and found dask/distributed#3540, which led me to the implementation by seanlaw here: https://github.com/TDAmeritrade/stumpy/blob/68092c931610db725f2c74b6a5155868666eb14f/tests/test_mstumped.py#L10-L14, and holy cow, that's fast. Using that approach, test_regressor runs in 18 seconds. Would you support a PR adding this? It would basically replace the client fixture with a dask_cluster fixture and instantiate a client from it in every test. When the client is closed its memory gets released, so I believe this behaves the same as what we have right now: every test gets a fresh client, but it's significantly faster because we aren't creating a cluster every time.
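The pattern behind that speed-up can be illustrated without Dask at all: construct the expensive resource (the cluster) once at a broad fixture scope, and give each test a cheap, fresh client against it. A self-contained sketch with stand-in classes; `FakeCluster`/`FakeClient` are hypothetical stand-ins, not the real `distributed.LocalCluster`/`Client` API:

```python
class FakeCluster:
    """Stand-in for a local cluster: expensive to create, so share one instance."""
    instances_created = 0

    def __init__(self):
        FakeCluster.instances_created += 1

    def close(self):
        pass


class FakeClient:
    """Stand-in for a client: cheap to create, one per test."""
    def __init__(self, cluster):
        self.cluster = cluster

    def close(self):
        pass  # closing releases per-test state, so each test starts fresh


# One shared cluster for the whole suite (in pytest, a broad-scoped fixture).
shared_cluster = FakeCluster()


def run_test(cluster):
    client = FakeClient(cluster)  # fresh client per test
    try:
        pass  # test body would use ``client`` here
    finally:
        client.close()


for _ in range(3):
    run_test(shared_cluster)

shared_cluster.close()
```

In the real suite this would be a module- or session-scoped pytest fixture yielding the cluster, with each test opening its own client from it; the cluster is built once instead of once per test, which is where the 137s-to-18s difference comes from.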

@StrikerRUS (Collaborator) commented:
@jmoralez

> do you know what's the advantage of using the client fixture?

Awesome research! Could you please copy-paste your comment here: #3829 (comment). I believe that place is the best one to continue the discussion.

@StrikerRUS (Collaborator) left a comment:
LGTM!

@jameslamb (Collaborator) left a comment:
thanks for this!

@jameslamb jameslamb merged commit d517ba1 into microsoft:master Apr 1, 2021
@jmoralez jmoralez deleted the test-voting_parallel branch April 3, 2021 03:54
@github-actions (bot) commented:
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023