[GLUTEN-3722][CH] Improve shuffle writer #3728

Merged on Nov 30, 2023 (11 commits)

@taiyang-li taiyang-li commented Nov 15, 2023

What changes were proposed in this pull request?

(Fixes: #3722)

Changes

  • Improve the code style in the shuffle writer.
  • Ensure that the column buffers do not exceed the split size in the shuffle writer.
  • Remove all locks in the shuffle writer, because write/spill/stop are executed in a single JVM thread.
  • Protect memory spill from being called recursively.
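
The last three changes can be illustrated with a minimal single-threaded sketch. This is not the code from this PR: `SketchShuffleWriter`, its members, and the `on_spill_allocation` hook are hypothetical names, and row counts stand in for whatever unit the real split size uses. It assumes, as the description states, that write/spill/stop all run on one thread, so a plain `bool` guard replaces a lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical single-threaded writer; names are illustrative, not Gluten's.
class SketchShuffleWriter
{
public:
    explicit SketchShuffleWriter(std::size_t split_size) : split_size_(split_size)
    {
        assert(split_size_ > 0);
    }

    // Buffer rows one at a time and flush as soon as the buffer holds
    // split_size_ rows, so it never grows past that bound.
    void write(const std::vector<int> & rows)
    {
        for (int row : rows)
        {
            buffer_.push_back(row);
            max_buffered_ = std::max(max_buffered_, buffer_.size());
            if (buffer_.size() >= split_size_)
                flush();
        }
    }

    // Release buffered rows under memory pressure. Returns the number of rows
    // released, or 0 when a spill is already running on this call stack.
    std::size_t spill()
    {
        if (in_spill_)
            return 0;
        SpillScope scope(in_spill_);
        const std::size_t released = buffer_.size();
        if (on_spill_allocation)
            on_spill_allocation(); // may call spill() again; the guard stops it
        buffer_.clear();
        ++spill_count_;
        return released;
    }

    std::size_t buffered() const { return buffer_.size(); }
    std::size_t maxBuffered() const { return max_buffered_; }
    std::size_t flushCount() const { return flush_count_; }
    std::size_t spillCount() const { return spill_count_; }

    // Hypothetical hook standing in for an allocation made while spilling.
    std::function<void()> on_spill_allocation;

private:
    // RAII scope: sets the flag on entry and clears it on every exit path.
    struct SpillScope
    {
        explicit SpillScope(bool & flag) : flag_(flag) { flag_ = true; }
        ~SpillScope() { flag_ = false; }
        SpillScope(const SpillScope &) = delete;
        SpillScope & operator=(const SpillScope &) = delete;
        bool & flag_;
    };

    void flush()
    {
        buffer_.clear();
        ++flush_count_;
    }

    std::size_t split_size_;
    std::vector<int> buffer_;
    bool in_spill_ = false; // plain bool: no lock, single-threaded by assumption
    std::size_t max_buffered_ = 0;
    std::size_t flush_count_ = 0;
    std::size_t spill_count_ = 0;
};
```

If the hook calls `spill()` while a spill is in progress, the nested call returns 0 immediately instead of recursing. The guard would not be safe if several threads could reach `spill()` concurrently.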

#3722

Run Gluten Clickhouse CI

@taiyang-li taiyang-li marked this pull request as draft November 16, 2023 06:06
@taiyang-li taiyang-li changed the title from "[GLUTEN-3722][CH] Improve some codes based on pr: 3279" to "[GLUTEN-3722][CH] Improve mutex usage in partition writer" Nov 16, 2023
@taiyang-li taiyang-li marked this pull request as ready for review November 16, 2023 10:07
@taiyang-li taiyang-li changed the title from "[GLUTEN-3722][CH] Improve mutex usage in partition writer" to "[GLUTEN-3722][CH] Improve code style of shuffle writer" Nov 23, 2023
@taiyang-li taiyang-li changed the title from "[GLUTEN-3722][CH] Improve code style of shuffle writer" to "[GLUTEN-3722][CH] Improve shuffle writer" Nov 23, 2023
@taiyang-li taiyang-li marked this pull request as ready for review November 27, 2023 10:40

baibaichen commented Nov 29, 2023

@taiyang-li It looks like there are quite a lot of changes. Could you describe the design?

@liuneng1994 liuneng1994 left a comment

LGTM

@liuneng1994 liuneng1994 merged commit 616029e into apache:main Nov 30, 2023
17 checks passed
@GlutenPerfBot

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

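| query | log/native_3728_time.csv | log/native_master_11_29_2023_0ac1dc42d_time.csv | difference | percentage |
|-------|--------------------------|--------------------------------------------------|------------|------------|
| q1 | 34.32 | 34.61 | 0.292 | 100.85% |
| q2 | 24.90 | 24.87 | -0.028 | 99.89% |
| q3 | 37.98 | 36.61 | -1.372 | 96.39% |
| q4 | 37.47 | 36.91 | -0.564 | 98.50% |
| q5 | 72.38 | 71.43 | -0.949 | 98.69% |
| q6 | 7.10 | 7.22 | 0.125 | 101.75% |
| q7 | 85.01 | 85.50 | 0.490 | 100.58% |
| q8 | 86.24 | 88.07 | 1.828 | 102.12% |
| q9 | 124.53 | 127.33 | 2.801 | 102.25% |
| q10 | 43.33 | 44.31 | 0.989 | 102.28% |
| q11 | 20.44 | 20.45 | 0.007 | 100.03% |
| q12 | 25.91 | 27.02 | 1.114 | 104.30% |
| q13 | 46.36 | 46.05 | -0.308 | 99.34% |
| q14 | 18.57 | 18.98 | 0.404 | 102.18% |
| q15 | 27.95 | 29.21 | 1.258 | 104.50% |
| q16 | 15.31 | 15.65 | 0.348 | 102.27% |
| q17 | 102.53 | 101.75 | -0.787 | 99.23% |
| q18 | 151.02 | 150.75 | -0.272 | 99.82% |
| q19 | 12.82 | 13.94 | 1.123 | 108.76% |
| q20 | 28.02 | 30.32 | 2.305 | 108.23% |
| q21 | 225.61 | 225.21 | -0.404 | 99.82% |
| q22 | 13.15 | 13.20 | 0.056 | 100.42% |
| total | 1240.93 | 1249.39 | 8.457 | 100.68% |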
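
The report does not say how the last two columns are computed. Judging from the rows, the difference appears to be the master time minus the PR time, and the percentage appears to be the master time divided by the PR time. Under that reading, a positive difference (percentage above 100%) means the PR run was faster. The sketch below encodes that inferred reading; `Comparison` and `compare` are hypothetical names, not part of the bot.

```cpp
#include <cassert>

// Hypothetical helper; the formulas are inferred from the table rows above.
struct Comparison
{
    double difference; // baseline time minus PR time, in the unit of the inputs
    double percentage; // baseline time as a percentage of PR time
};

inline Comparison compare(double pr_time, double baseline_time)
{
    assert(pr_time > 0.0);
    return {baseline_time - pr_time, 100.0 * baseline_time / pr_time};
}
```

For the total row, `compare(1240.93, 1249.39)` gives a difference of about 8.46 and a percentage of about 100.68. The report's 8.457 presumably comes from unrounded timings.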

Development

Successfully merging this pull request may close these issues.

[CH] improve mutex usage in shuffle writer