The profiling results for ResNet. #6179
Definitely, there is a lot of optimization work to do on the Python code.
@dzhwinter Use the latest code in the develop branch, commit id fb91938, which includes the improvement of
Some Python profiling was done for run_benchmark() and executor.run().
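For anyone who wants to reproduce this kind of Python-side measurement, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The `run_benchmark` body below is a hypothetical stand-in (a small CPU-bound loop), not the real benchmark script from this issue, and `profile_call` is a helper name invented for illustration.

```python
import cProfile
import io
import pstats


def run_benchmark(num_iters=100):
    """Hypothetical stand-in for the benchmark loop; not the real script."""
    total = 0
    for _ in range(num_iters):
        total += sum(j * j for j in range(1000))
    return total


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in memory, sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_call(run_benchmark, 50)
    print(report)
```

Sorting by cumulative time puts the outer call first, so the callees that dominate the Python overhead show up directly beneath it. In a real run, the function passed to `profile_call` would be the actual benchmark entry point.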
Config and Env
The profiling results
-DWITH_GPU=ON -DWITH_TIMER=ON -DWITH_MKL=ON -DCMAKE_BUILD_TYPE=Release
The operators that need optimization
But the main computing time, spent in im2col and gemm, is 461.405 + 2343.45 = 2804.855 ms. (There is no stream synchronization between im2col and gemm, so the individual times for im2col and gemm are not accurate.)
Python time accounts for about 22% of the total time.
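The arithmetic behind these figures can be sketched as below. Only the two operator times come from this issue. The helper names are made up for illustration, and the total run time is not given here, so `share_percent` takes it as a parameter rather than assuming a value.

```python
# Operator times reported in this issue, in milliseconds.
IM2COL_MS = 461.405
GEMM_MS = 2343.45


def combined_time_ms(*op_times_ms):
    """Sum the per-operator times (hypothetical helper)."""
    return sum(op_times_ms)


def share_percent(part_ms, total_ms):
    """Return part_ms as a percentage of total_ms (hypothetical helper)."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    return 100.0 * part_ms / total_ms


if __name__ == "__main__":
    combined = combined_time_ms(IM2COL_MS, GEMM_MS)
    print(f"im2col + gemm: {combined:.3f} ms")
```

Because the two kernels are not separated by a stream synchronization, only the combined figure should be treated as meaningful. The split between the two operators may be skewed.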