`AttributeError` with fitting model on Dask Array backed by `scipy.sparse.csr_matrix` #7454
Comments
Thank you for opening the issue. I will work on some tests for ...

I'm encountering the same issue as @jrbourbeau with the following package versions:
The example code snippet above returns the same error: "AttributeError: divisions not found"
@trivialfis, were your changes merged into 1.5.1?

@trivialfis, any update on this? I am still encountering this issue while running ...

@rrpelgrim Please update to the latest XGBoost 1.6.1.

I came across a use case where attempting to fit a `DaskXGBClassifier` on a Dask Array whose partitions are `scipy.sparse.csr_matrix` objects (as is returned by Dask-ML's `HashingVectorizer`) results in an `AttributeError: divisions not found` error (full traceback included below).

From doing some initial debugging, it appears the underlying issue is that during the fitting process we end up passing a `list` of sparse matrices to Dask's `dd.multi.concat` here:

xgboost/python-package/xgboost/dask.py
Line 207 in d33854a

However, `dd.multi.concat` expects a `list` of Dask DataFrames, which is where the `AttributeError: divisions not found` is coming from (Dask DataFrames have a `.divisions` attribute which `dd.multi.concat` assumes exists).

Here's an example code snippet which should reproduce the issue when using the latest `xgboost` (1.5.0) and `dask` (2021.11.2) / `distributed` (2021.11.2) releases:

Full traceback:
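To illustrate the type mismatch described in this issue, here is a minimal sketch that needs only NumPy and SciPy. It checks that a `scipy.sparse.csr_matrix` has no `.divisions` attribute, then concatenates sparse chunks with `scipy.sparse.vstack` instead of a DataFrame-oriented concat. The helper name `concat_partitions` is hypothetical, and this is an assumption about how a type-aware concat might look, not the fix that was merged into XGBoost.

```python
import numpy as np
import scipy.sparse as sp


def concat_partitions(parts):
    """Concatenate partition chunks row-wise, dispatching on their type.

    Hypothetical helper for illustration only; not XGBoost's actual code.
    """
    if not parts:
        raise ValueError("no partitions to concatenate")
    if all(sp.issparse(p) for p in parts):
        # Sparse chunks have no `.divisions`, so stack them with SciPy.
        return sp.vstack(parts, format="csr")
    if all(isinstance(p, np.ndarray) for p in parts):
        return np.concatenate(parts, axis=0)
    names = [type(p).__name__ for p in parts]
    raise TypeError(f"unsupported partition types: {names}")


# Two small CSR chunks standing in for the partitions of a Dask Array.
chunks = [sp.csr_matrix(np.eye(2)), sp.csr_matrix(np.ones((1, 2)))]

# A SciPy sparse matrix is not a Dask DataFrame, so `.divisions` is absent.
has_divisions = hasattr(chunks[0], "divisions")

# Row-wise stacking yields a single 3 x 2 CSR matrix.
stacked = concat_partitions(chunks)
```

Dispatching on the chunk type keeps sparse data sparse; any code path that assumes every chunk is a DataFrame will fail on the attribute lookup before it does any work.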