[Profiler] Fix graph_executor_debug hang #12382

Merged: 3 commits, Aug 12, 2022

echuraev

For some operations, such as `__nop` or `__copy`, the measured inference
time is equal to 0. In this case the profiler enters an infinite loop and
never exits. This PR adds a new parameter, `max_repeat_num`, which specifies
the maximum number of repeats when the inference time is equal to 0. Once
this value is exceeded, the loop exits.
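
The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the function `MeasureWithZeroTimeLimit`, the struct `RepeatResult`, the `measure_nanos` callback, and the batch-growth rule are all hypothetical names and assumptions made for illustration. Only the parameter name `max_repeat_num` comes from the description above.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>

// Hypothetical result type for this sketch (not a TVM type).
struct RepeatResult {
  double duration_ms;   // duration of the last measured batch, in milliseconds
  int number;           // batch size used for the last measurement
  int zero_time_iters;  // how many batches measured exactly 0 ns
};

// Assumption: `measure_nanos(number)` runs the operation `number` times and
// returns the elapsed time in nanoseconds. The loop keeps enlarging the batch
// until one batch takes at least `min_repeat_ms`, but gives up once
// `max_repeat_num` batches have measured exactly zero time.
inline RepeatResult MeasureWithZeroTimeLimit(
    const std::function<int64_t(int)>& measure_nanos, int number,
    double min_repeat_ms, int max_repeat_num) {
  double duration_ms = 0.0;
  int zero_time_iters = 0;
  do {
    if (duration_ms > 0.0) {
      // Grow the batch so the next measurement should reach min_repeat_ms.
      const double per_run_ms = duration_ms / number;
      const double growth = 1.618;  // assumed minimum growth factor
      number = static_cast<int>(
          std::max(min_repeat_ms / per_run_ms + 1.0, number * growth));
    }
    const int64_t nanos = measure_nanos(number);
    if (nanos == 0) {
      ++zero_time_iters;  // without this counter a zero-time op never exits
    }
    duration_ms = static_cast<double>(nanos) / 1e6;
  } while (duration_ms < min_repeat_ms && zero_time_iters < max_repeat_num);
  return {duration_ms, number, zero_time_iters};
}
```

With a callback that always reports 0 ns, the duration never reaches `min_repeat_ms`, so the second loop condition is the only thing that ends the loop. With a callback that reports nonzero time, the counter stays at 0 and the loop exits as soon as one batch is long enough.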

cc: @valmat07, @Icemist, @masahi


@Icemist Icemist left a comment

Looks useful, a few comments.

CC @tkonolige, who may be interested in this too.

Review comments (outdated, resolved) on:
include/tvm/runtime/profiling.h
src/runtime/crt/common/crt_runtime_api.c
src/runtime/profiling.cc

@tkonolige

@echuraev thanks for this PR. It's definitely an edge case we need fixed.

@Icemist Thanks for reviewing!

@echuraev echuraev force-pushed the echuraev/fix_profiler_hang branch 4 times, most recently from ef7558c to 08ffe3a, on August 12, 2022 09:09

@tkonolige tkonolige left a comment

@echuraev thanks!

@tkonolige tkonolige merged commit c3c7c4c into apache:main Aug 12, 2022
comaniac added a commit to awslabs/raf that referenced this pull request Aug 15, 2022
* [TVM] Update Submodule

* [Compatible] Fix apache/tvm#12066

* [Compatible] Fix apache/tvm#12382

Co-authored-by: SubmoduleUpdaterBot <submodule-updater-bot@users.noreply.github.com>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
For some operations, such as `__nop` or `__copy`, the measured inference
time is equal to 0. In this case the profiler enters an infinite loop and
never exits. Added a new parameter, `limit_zero_time_iterations`, which
specifies the maximum number of repeats when the inference time is equal
to 0. Once this value is exceeded, the loop exits.
@echuraev echuraev deleted the echuraev/fix_profiler_hang branch April 14, 2023 10:21
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023

3 participants