Reduce execution time of Parquet C++ tests #14750

Merged: 10 commits, Jan 17, 2024

@vuule (Contributor) commented Jan 11, 2024

Description

Reduced the execution time of the Parquet C++ tests from 90s to 25s on my local system. Only a few tests are affected, and code coverage should be unchanged.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule added the tests, cuIO, improvement, and non-breaking labels Jan 11, 2024
@vuule self-assigned this Jan 11, 2024
@github-actions bot added the libcudf label Jan 11, 2024
@vuule marked this pull request as ready for review January 12, 2024 01:29
@vuule requested a review from a team as a code owner January 12, 2024 01:29
@vuule requested review from vyasr and shrshi January 12, 2024 01:29
@vuule (Contributor Author) commented Jan 12, 2024

CC @nvdbaranec @etseidl who wrote or modified many of these tests.

@etseidl (Contributor) left a comment

Good stuff. Amazing how much fat there was to trim 😅

Review threads:
  • cpp/tests/io/parquet_misc_test.cpp (outdated, resolved)
  • cpp/tests/io/parquet_v2_test.cpp (outdated, resolved)
  • cpp/tests/io/parquet_v2_test.cpp (resolved)
  • cpp/tests/io/parquet_writer_test.cpp (outdated, resolved)
@@ -688,60 +687,9 @@ TEST_P(ParquetV2Test, PartitionedWriteEmptyColumns)
CUDF_TEST_EXPECT_TABLES_EQUAL(expected2, result2.tbl->view());
}

TEST_P(ParquetV2Test, LargeColumnIndex)
Contributor commented:

Can you help me understand why this test has been removed?

Contributor Author replied:

That's a great question; I should have left a comment before requesting reviews.
This test existed only to cover the case where the writer writes the data in two batches (not the same as chunks!). Batching has since been disabled, so we no longer need this (huge) test.

Contributor replied:

Understood, thanks!

Co-authored-by: Nghia Truong <7416935+ttnghia@users.noreply.github.com>
@vuule (Contributor Author) commented Jan 17, 2024

/merge

@rapids-bot merged commit 9acddc0 into rapidsai:branch-24.02 Jan 17, 2024
67 checks passed
@vuule deleted the impr-reduce-pq-cpp-tests branch January 17, 2024 22:29
Labels: cuIO, improvement, libcudf, non-breaking, tests
7 participants