-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EXSWHTEC-249 - Implement Performance Tests for Memcpy APIs #119
Conversation
Change-Id: I66f0c09e9c7405ec7430b1883e0e89542fdb87a0
Change-Id: I212b82b1b3a78a368b85ea64e338371a34b405f9
Change-Id: Ib455f72b5be77e1a81137d15c07ea41161b16a3e
… definitions Customized Doxygen configuration named DoxyfileTests is added, as well as separate header file used for definition of test groups.
…doxygen_configuration
…o include #if and #ifdef sections in documentation
…emcpy and hipMemcpyAsync
…ize created instead of nullstream
…emcpyFromSymbolAsync
…mcpy2DToArrayAsync
…Memcpy2DFromArrayAsync
…tead of benchmark Configure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall comments here:
I don't see loops of memcpy for example. Usually we like to loop and average out
I also don't see warm up calls. If you benchmark hipMemcpy, you should call it once at least before timing it for good measure.
Hello @chrispaquot, please check this file here: https://github.com/ROCm-Developer-Tools/hip-tests/pull/119/files#diff-5e0a673157e5c8f6a3c1cd2490e0039de1b789e42040f6f739b85575ca4c635a. You can see that within the Run method of a Benchmark class, the API that is being benchmarked is called several times for warmup and afterwards several times in the loop, which is used for timing and time metrics calculation. |
…tests into performance_common
…tests into memcpy_performance
OK that sounds good with one more question: Iterations is going to be the same for all test cases then? Warmup would most likely be 1 but iterations can change from a test case to another. |
Yes, the number of iterations will be the same for all test cases within one logical group/module that will be executed. The request was to configure the number of warmups/iterations through command line arguments. You can configure specific number of warmups/iterations for each test case using Configure method from Benchmark class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bit of problem. The current warm-up/iterations will run all allocations and destructions also. Warmup means it touches all resources and executes exactly the same commands, used in the final measurement. "Iterations" means the execution of the same commands in a loop multiple times with the same resources, used in a warm-up.
TIMED_SECTION_STREAM(kTimerTypeEvent, stream) {
HIP_CHECK(hipMemcpy2DAsync(host_allocation.ptr(), device_allocation.width(),
device_allocation.ptr(), device_allocation.pitch(),
device_allocation.width(), device_allocation.height(),
hipMemcpyDeviceToHost, stream));
}
…warmups/iterations
Review comment addressed within 9ccdf02 and 96bb38d. Allocation guards are moved outside of warmups/iterations, so that the same commands are executed on the same resources. @gandryey, can you please review changes? |
…educe and accumulate
PR has been merged into develop branch via commit e3bac85. |
Change-Id: Ib04fe4dd3efce92d7c7bfc8f0c75abd8e9dfe7be
Implement microbenchmarks for the Memcpy APIs.
The following should be covered:
Depends on: #117