Fix metadata after implicit array conversion from Dask cuDF #16842

Merged
merged 11 commits into from
Sep 25, 2024

Conversation

rjzamora
Member

Description

Temporary workaround for dask/dask#11017 in Dask cuDF (when query-planning is enabled).
I will try to move this fix upstream soon. However, the next dask release will probably not be used by 24.10, and it's still unclear whether the same fix works for all CPU cases.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

rjzamora added the bug, 2 - In Progress, and non-breaking labels Sep 19, 2024
rjzamora self-assigned this Sep 19, 2024
github-actions bot added the Python label Sep 19, 2024
rjzamora added the 3 - Ready for Review label and removed the 2 - In Progress label Sep 19, 2024
rjzamora marked this pull request as ready for review September 19, 2024 18:07
rjzamora requested a review from a team as a code owner September 19, 2024 18:07
Contributor

@wence- wence- left a comment

I think this makes sense

Comment on lines 245 to 247
    @get_collection_type.register(cupy.ndarray)
    def get_collection_type_cupy_array(_):
        return _create_array_collection_with_meta
Contributor

Suggested change
    - @get_collection_type.register(cupy.ndarray)
    - def get_collection_type_cupy_array(_):
    -     return _create_array_collection_with_meta
    + get_collection_type.register(cupy.ndarray)(_create_array_collection_with_meta)

?

Member Author

This doesn't quite work. I can change to the following if you don't like the decorator pattern for this case:

    get_collection_type.register(
        cupy.ndarray,
        lambda x: _create_array_collection_with_meta,
    )

The "trick" is that we need to register a function that simply returns a function that accepts an Expr argument.
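
To illustrate the reply above, here is a minimal, self-contained sketch of the two-step pattern. It assumes the dispatch is called on the metadata object and that its result is then called with the expression. `ToyDispatch`, `FakeArrayMeta`, `FakeExpr`, `create_collection`, and `new_collection` are hypothetical stand-ins written for this example; they are not the Dask or cuDF implementations.

```python
class ToyDispatch:
    """Toy type-based registry (illustration only, not Dask's Dispatch)."""

    def __init__(self):
        self._lookup = {}

    def register(self, typ, func=None):
        # Supports both register(typ, func) and the @register(typ) decorator form.
        def wrapper(f):
            self._lookup[typ] = f
            return f

        return wrapper(func) if func is not None else wrapper

    def __call__(self, arg):
        # Call the function registered for the first matching class in the MRO.
        for cls in type(arg).__mro__:
            if cls in self._lookup:
                return self._lookup[cls](arg)
        raise TypeError(f"No dispatch for {type(arg)}")


class FakeArrayMeta:
    """Hypothetical stand-in for an array metadata object such as cupy.ndarray."""


class FakeExpr:
    """Hypothetical stand-in for an expression that carries metadata."""

    def __init__(self, meta):
        self.meta = meta


def create_collection(expr):
    # Hypothetical stand-in for _create_array_collection_with_meta.
    return ("collection", expr)


def new_collection(dispatch, expr):
    # Step 1: dispatch on the metadata to obtain a constructor.
    constructor = dispatch(expr.meta)
    # Step 2: call that constructor with the expression.
    return constructor(expr)


expr = FakeExpr(FakeArrayMeta())

# Wrapped registration: the registered function ignores the metadata and
# returns the constructor, so step 2 receives a callable.
wrapped = ToyDispatch()
wrapped.register(FakeArrayMeta, lambda _: create_collection)
wrapped_result = new_collection(wrapped, expr)

# Direct registration: step 1 calls create_collection on the metadata itself,
# so step 2 tries to call a tuple and fails.
direct = ToyDispatch()
direct.register(FakeArrayMeta)(create_collection)
try:
    new_collection(direct, expr)
    direct_error = None
except TypeError as err:
    direct_error = err
```

Under these assumptions, the wrapped registration yields a collection built from the expression, while the direct registration raises a `TypeError` because the constructor is invoked one step too early.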

Comment on lines 255 to 257
    @get_collection_type.register(spmatrix)
    def get_collection_type_csr_matrix(_):
        return _create_array_collection_with_meta
Contributor

If the suggestion above works, the same thing here?

rjzamora added the 5 - Ready to Merge label and removed the 3 - Ready for Review label Sep 23, 2024
@rjzamora
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 22cefc9 into rapidsai:branch-24.10 Sep 25, 2024
96 checks passed
@rjzamora rjzamora deleted the bug/dask-array-roundtrip branch September 25, 2024 14:23
Labels: 5 - Ready to Merge, bug, non-breaking, Python
3 participants