Add row index and stripe size options to Python ORC chunked writer #14785

Merged

vuule
Contributor

@vuule vuule commented Jan 18, 2024

Description

Adds the options that control the stripe and row group size when using the chunked ORC writer. These options are already present in to_orc (the non-chunked version of the same API).

Adding these options makes it possible to write smaller unit tests.
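To illustrate why these options help keep tests small, here is a minimal pure-Python sketch of how a row-based stripe limit and a row index stride could partition a table. The function `plan_layout` is hypothetical and is not part of cuDF; the parameter names `stripe_size_rows` and `row_index_stride` are borrowed from to_orc on the assumption that the chunked writer mirrors them. The sketch ignores the byte-based stripe limit and assumes the stripe size is a multiple of the stride.

```python
def plan_layout(num_rows, stripe_size_rows, row_index_stride):
    """Hypothetical helper (not a cuDF API): split rows into stripes,
    and each stripe into row groups of at most `row_index_stride` rows.

    Returns a list of stripes; each stripe is a list of (start, end)
    half-open row ranges, one per row group.
    """
    if num_rows < 0 or stripe_size_rows <= 0 or row_index_stride <= 0:
        raise ValueError("sizes must be positive and num_rows non-negative")

    stripes = []
    for stripe_start in range(0, num_rows, stripe_size_rows):
        stripe_end = min(stripe_start + stripe_size_rows, num_rows)
        # Row groups never cross a stripe boundary in this simplified model.
        row_groups = [
            (start, min(start + row_index_stride, stripe_end))
            for start in range(stripe_start, stripe_end, row_index_stride)
        ]
        stripes.append(row_groups)
    return stripes


# With small limits, a 10-row table already spans several stripes.
small = plan_layout(num_rows=10, stripe_size_rows=4, row_index_stride=2)

# With large limits (illustrative values only), the same table is one stripe.
large = plan_layout(num_rows=10, stripe_size_rows=1_000_000, row_index_stride=10_000)
```

Under this model, a test that needs to exercise multiple stripes or row groups can do so with a handful of rows once the limits are lowered, rather than generating a table large enough to exceed the large limits.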

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Python Affects Python cuDF API. label Jan 18, 2024
@vuule vuule added feature request New feature or request non-breaking Non-breaking change labels Jan 18, 2024
@vuule vuule changed the title Fea py chunker orc write stripe Add row index and stripe size options to Python ORC chunked writer Jan 18, 2024
@vuule vuule added the cuIO cuIO issue label Jan 18, 2024
@vuule vuule marked this pull request as ready for review January 22, 2024 19:54
@vuule vuule requested a review from a team as a code owner January 22, 2024 19:54
@vuule vuule requested review from shwina and isVoid January 22, 2024 19:54
@vuule
Contributor Author

vuule commented Jan 22, 2024

The docs checklist item is left unchecked because the entire chunked writer is excluded from the Python docs. Opened #14828 for this.

@vuule
Contributor Author

vuule commented Jan 24, 2024

/merge

@rapids-bot rapids-bot bot merged commit 258d9ee into rapidsai:branch-24.04 Jan 24, 2024
67 checks passed
@vuule vuule deleted the fea-py-chunker-orc-write-stripe branch January 24, 2024 23:13
PointKernel pushed a commit to PointKernel/cudf that referenced this pull request Jan 25, 2024
…apidsai#14785)

Adds the options that control the stripe and row group size when using the chunked ORC writer. These options are already present in to_orc (the non-chunked version of the same API).

Adding these options makes it possible to write smaller unit tests.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#14785
Labels
cuIO cuIO issue feature request New feature or request non-breaking Non-breaking change Python Affects Python cuDF API.
2 participants