Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix initialization error in to_arrow for empty string views #16033

Merged

Conversation

wence-
Copy link
Contributor

@wence- wence- commented Jun 14, 2024

Description

When converting an empty string view to arrow, we don't bother with copies from device, but rather create the arrow arrays directly. The offset buffer is therefore a singleton int32 array with zero in it.

Previously, the initialization of this array was incorrect, since mutable_data() returns a uint8_t pointer, and so setting the single element could leave 24 of the 32 bits uninitialized.

Fix this by using memset instead to zero out the full buffer.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

When converting an empty string view to arrow, we don't bother with
copies from device, but rather create the arrow arrays directly. The
offset buffer is therefore a singleton int32 array with zero in it.

Previously, the initialization of this array was incorrect, since
mutable_data() returns a uint8_t pointer, and so setting the single
element could leave 24 or the 32 bits uninitialized.

Fix this by using memset instead to zero out the full buffer.
@wence- wence- requested a review from a team as a code owner June 14, 2024 15:02
@wence- wence- requested review from mythrocks and ttnghia June 14, 2024 15:02
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jun 14, 2024
@wence- wence- added bug Something isn't working non-breaking Non-breaking change labels Jun 14, 2024
Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

cpp/src/interop/to_arrow.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow.cu Outdated Show resolved Hide resolved
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
@wence-
Copy link
Contributor Author

wence- commented Jun 14, 2024

/merge

@rapids-bot rapids-bot bot merged commit 2297f9a into rapidsai:branch-24.08 Jun 14, 2024
79 checks passed
@wence- wence- deleted the wence/fix/to-arrow-empty-string-uninit branch June 14, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants