[WIP][Proposal] PARQUET-2430: Add parquet joiner #1273

Open: wants to merge 24 commits into base: master
24 commits
f5144b2
add initial ParquetJoiner implementation
Jan 28, 2024
01a08dd
add initial ParquetJoiner implementation
Feb 1, 2024
28c987c
Merge remote-tracking branch 'origin/master' into add-parquet-joiner
Feb 12, 2024
7ae3505
refactor ParquetJoiner implementation
Feb 17, 2024
05eb22a
extend the main test for multiple files on the right
Feb 20, 2024
6bb950d
extend the main test for multiple files on the right
Feb 22, 2024
87b923c
Merge branch 'master' into add-parquet-joiner
Feb 22, 2024
f9536c3
converge join logic, create a draft of options and rewriter
Feb 23, 2024
d7f11d9
move ParquetJoinTest logic to ParquetRewriterTest
Feb 27, 2024
e8e7ffe
improve Parquet stitching test
Mar 1, 2024
3ee946c
remove custom ParquetRewriter constructor
Mar 6, 2024
fd409c4
remove custom ParquetRewriter constructor
Mar 6, 2024
5a98219
refactor ParquetRewriter
Mar 12, 2024
7b2fd1a
apply spotless and address PR comments
Mar 14, 2024
8da8291
move extra column writing into processBlocksFromReader
Mar 15, 2024
68e41ba
add getInputFiles back
Mar 16, 2024
98b9b23
Merge remote-tracking branch 'fork/master' into add-parquet-joiner
Mar 16, 2024
6d2c222
fix extra ParquetRewriter constructor so tests can pass
Mar 16, 2024
883e935
remove not needed TODOs
Mar 20, 2024
8ef36b5
address PR comments
Mar 24, 2024
79cc2b8
Merge remote-tracking branch 'origin/master' into add-parquet-joiner
Apr 11, 2024
0bbf72f
rename inputFilesR to inputFilesToJoin
Apr 11, 2024
ca53bff
rename inputFilesR to inputFilesToJoinColumns
Apr 11, 2024
1e7998a
add getParquetInputFiles listing to the rewrite start logging
Apr 11, 2024