Can not generate .json event trace #62

Open
yitian1031 opened this issue Mar 27, 2024 · 5 comments

@yitian1031

yitian1031 commented Mar 27, 2024

Running the command shown below:
LD_PRELOAD=/home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/libunitrace_tool.so /home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/unitrace --chrome-sycl-logging --chrome-dnn-logging --chrome-call-logging --chrome-kernel-logging --chrome-device-logging python test.py
This ends in a segmentation fault:
image
The generated JSON files are empty.

When running the command as:
LD_PRELOAD=/home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/libunitrace_tool.so /home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/unitrace -d -s -t --chrome-kernel-logging --chrome-device-logging --chrome-no-thread-on-device --chrome-no-engine-on-device python test.py

This time the run aborts:
image
The generated JSON files contain some logging records.

@Sarbojit2019
Contributor

Hello @yitian1031,
Thanks for reporting the issue. I have a few questions and suggestions to help narrow it down.

  1. Are you able to run test.py without unitrace? I ask because the call stack you shared looks like an error in the application (test.py) caused by a bad allocation.
  2. Run unitrace with the '-c' option to see which API call is crashing. That will help you tell whether a particular kernel launch failed because of an application bug.
  3. By default, unitrace writes to the .json file only at the end of a successful run. Because the run crashed, the file is empty.
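
To confirm point 3, it can help to check whether each generated trace file is empty, truncated, or complete. The following is a minimal sketch and not part of unitrace; the helper name `classify_trace` and the assumption that the trace files sit in the current directory as `*.json` are illustrative only.

```python
import json
from pathlib import Path


def classify_trace(path):
    """Return 'empty', 'invalid', or 'ok' for a JSON trace file."""
    p = Path(path)
    if p.stat().st_size == 0:
        return "empty"
    try:
        with p.open() as f:
            json.load(f)
    except ValueError:
        # json.JSONDecodeError is a ValueError; a truncated file lands here.
        return "invalid"
    return "ok"


if __name__ == "__main__":
    # Assumption: the trace files are written to the current directory.
    for trace in sorted(Path(".").glob("*.json")):
        print(trace, classify_trace(trace))
```

An "empty" or "invalid" result is consistent with the process crashing before the trace was fully written; it does not by itself say where the crash came from.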

@yitian1031
Author

yitian1031 commented Apr 17, 2024

test.py runs successfully without unitrace:
image

Following your suggestion, I added the -c option, and it seems that zeCommandListAppendLaunchKernel aborted:
image

Another error occurs when I run the command below:
LD_PRELOAD=/home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/libunitrace_tool.so /home/yitian/wyt/unitrace1/pti-gpu/tools/unitrace/build/unitrace --chrome-kernel-logging --chrome-device-logging python test.py
image

image

When I use ulimit -s to raise the stack size limit, the above issues still occur.

It seems there is something wrong with the unitrace tool. How can I fix it?

@Sarbojit2019
Contributor

@yitian1031, may I know whether you are running it under a conda environment? If yes, can you build the tool fresh and try to run it again?
We have sometimes seen different conda environments link against different libraries, so building in one environment and running in another may cause issues.

@yitian1031
Author

yitian1031 commented Apr 22, 2024

I am running under a conda environment. I rebuilt the tool from the latest code, and this time the tool does not run at all:
image

image

@zma2
Contributor

zma2 commented Jun 24, 2024

@yitian1031 Please check the version of libstdc++.so in your conda env. If it is lower than 6.0.30, you need to upgrade it to at least 6.0.30.
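
One way to do this check: on Linux the library is usually installed as a versioned file such as `libstdc++.so.6.0.30`, with `libstdc++.so.6` symlinked to it. The sketch below assumes the library lives under `$CONDA_PREFIX/lib`, which may not hold for every setup; the helper names `parse_version` and `newest_libstdcxx` are hypothetical.

```python
import os
import re
from pathlib import Path

# Minimum version named in the comment above.
REQUIRED = (6, 0, 30)


def parse_version(filename):
    """Return (6, 0, 30) for a name like 'libstdc++.so.6.0.30', else None."""
    m = re.fullmatch(r"libstdc\+\+\.so\.(\d+)\.(\d+)\.(\d+)", filename)
    return tuple(int(g) for g in m.groups()) if m else None


def newest_libstdcxx(lib_dir):
    """Return the highest libstdc++ version found in lib_dir, or None."""
    versions = [parse_version(p.name) for p in Path(lib_dir).glob("libstdc++.so.*")]
    versions = [v for v in versions if v is not None]
    return max(versions, default=None)


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix is None:
        print("CONDA_PREFIX is not set; no conda environment appears to be active")
    else:
        found = newest_libstdcxx(Path(prefix) / "lib")
        if found is None:
            print("no versioned libstdc++.so found under", prefix)
        else:
            status = "OK" if found >= REQUIRED else "older than 6.0.30"
            print("found", ".".join(map(str, found)), status)
```

Run it inside the activated conda environment that is used to launch unitrace, since that is the environment whose libraries get picked up.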

Also you don't need to preload the libunitrace_tool.so.
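
Following this advice, the earlier invocation can drop the `LD_PRELOAD` prefix. The sketch below launches unitrace from Python using only flags that already appear in this thread; the `UNITRACE` path is a placeholder, and the helper names `unitrace_command` and `clean_env` are hypothetical.

```python
import os
import subprocess

# Placeholder path: point this at your own unitrace build.
UNITRACE = "/path/to/pti-gpu/tools/unitrace/build/unitrace"


def unitrace_command(app_args,
                     flags=("--chrome-kernel-logging", "--chrome-device-logging")):
    """Build the argument list: unitrace, its flags, then the application."""
    return [UNITRACE, *flags, *app_args]


def clean_env(env=None):
    """Return a copy of the environment with LD_PRELOAD removed."""
    env = dict(os.environ if env is None else env)
    env.pop("LD_PRELOAD", None)
    return env


if __name__ == "__main__":
    if os.path.exists(UNITRACE):
        result = subprocess.run(unitrace_command(["python", "test.py"]),
                                env=clean_env())
        print("exit code:", result.returncode)
    else:
        print("unitrace not found at", UNITRACE)
```

Removing `LD_PRELOAD` from the child environment guards against a value left over from an earlier shell session.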
