[GLUTEN-7024][VL] Skip call collectMetrics when the task does not call next() #7025

Merged
kecookier merged 2 commits into apache:main on Aug 29, 2024

Conversation


@kecookier kecookier commented Aug 27, 2024

What changes were proposed in this pull request?

The issue occurs when one side of a join is an empty relation, which causes Velox to never actually execute the join operator. Metrics are collected when each task completes, so one task's metrics collection does not affect another's.
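The guard this PR adds can be illustrated with a minimal, self-contained sketch. It is not the Gluten implementation: `TaskStats` here is a hypothetical stand-in that models only the `executionStartTimeMs` field visible in the diff, while `outputRows`, `Metrics`, and `collectMetricsSketch` are names invented for illustration. It assumes a start time of zero means the task never called next().

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical stand-in for the task statistics. Only executionStartTimeMs
// comes from the diff in this PR; outputRows is an illustrative metric.
struct TaskStats {
  uint64_t executionStartTimeMs = 0; // assumed to stay 0 if next() was never called
  uint64_t outputRows = 0;
};

// Hypothetical container for whatever metrics would be reported.
struct Metrics {
  uint64_t outputRows;
};

// Returns false and leaves `out` untouched when the task never started,
// instead of reading per-operator statistics that were never populated.
bool collectMetricsSketch(const TaskStats& stats, Metrics& out) {
  if (stats.executionStartTimeMs == 0) {
    std::clog << "Skip collect task metrics since task did not call next()\n";
    return false;
  }
  out = Metrics{stats.outputRows};
  return true;
}
```

Returning early keeps the skipped case distinguishable from a task that ran and produced zero rows, since the caller can check the boolean result.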

(Fixes: #7024)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@github-actions github-actions bot added the VELOX label Aug 27, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}


@kecookier kecookier changed the title [VL] Fix collectMetrics error when multi gluten rdd exist in one stage [GLUTEN-7024][VL] Fix collectMetrics error when multi gluten rdd exist in one stage Aug 27, 2024

#7024

@kecookier kecookier requested a review from ulysses-you August 29, 2024 02:09
@ulysses-you

I think the issue is that one side of the join is an empty relation, which causes Velox to never actually execute the join operator. The PR description is confusing: we collect metrics when each task completes, so one task's metrics collection would not affect another's.

ulysses-you previously approved these changes Aug 29, 2024

@ulysses-you ulysses-you left a comment


the fix lgtm, thank you @kecookier

@@ -301,16 +301,22 @@ void WholeStageResultIterator::collectMetrics() {
    return;
  }

  const auto& taskStats = task_->taskStats();
  if (taskStats.executionStartTimeMs == 0) {
    LOG(INFO) << "collectMetrics failed, taskStats is zero, maybe task never call next().";
Contributor

Skip collect task metrics since task did not call next()

Contributor Author

Fixed.

Contributor Author

@ulysses-you Can you review again?

@kecookier

> I think the issue is that one side of the join is an empty relation, which causes Velox to never actually execute the join operator. The PR description is confusing: we collect metrics when each task completes, so one task's metrics collection would not affect another's.

Yes, you are right.

@kecookier kecookier changed the title [GLUTEN-7024][VL] Fix collectMetrics error when multi gluten rdd exist in one stage [GLUTEN-7024][VL] Skip call collectMetrics when the task does not call next() Aug 29, 2024
@kecookier kecookier merged commit 79bb99a into apache:main Aug 29, 2024
45 checks passed
sharkdtu pushed a commit to sharkdtu/gluten that referenced this pull request Nov 11, 2024
weiting-chen pushed a commit to weiting-chen/gluten that referenced this pull request Nov 18, 2024
weiting-chen added a commit that referenced this pull request Nov 20, 2024
* [GLUTEN-7024][VL] Skip call collectMetrics when the task does not call next() (#7025)

* [GLUTEN-7130][CORE] Skip command execution when collect qe fallback summary (#7132)

* [VL] Add config for show velox task metrics when finished (#6573)

---------

Co-authored-by: zhaokuo <zhaokuo_game@163.com>
Co-authored-by: Zhen Wang <643348094@qq.com>
Co-authored-by: Yang Zhang <yangchuan.zy@alibaba-inc.com>
Development

Successfully merging this pull request may close these issues.

[VL] Exception: Node id cannot be found in plan status in branch-1.2
2 participants