Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXSWHTEC-249 - Implement Performance Tests for Memcpy APIs #119

Closed
wants to merge 77 commits into from

Conversation

milos-mozetic
Copy link
Contributor

@milos-mozetic milos-mozetic commented Jan 13, 2023

Implement microbenchmarks for the Memcpy APIs.

The following should be covered:

  • Synchronous and asynchronous variations of the APIs
  • Small, medium, and large memory sizes
  • Different copy directions (Host to Host, Host to Device, Device to Device, Device to Host)
  • 1D, 2D, and 3D variations of the APIs
  • From/to Symbol variations of the APIs
  • From/to Array variations of the APIs

Depends on: #117

gargrahul and others added 30 commits October 26, 2022 03:59
Change-Id: I66f0c09e9c7405ec7430b1883e0e89542fdb87a0
Change-Id: I212b82b1b3a78a368b85ea64e338371a34b405f9
Change-Id: Ib455f72b5be77e1a81137d15c07ea41161b16a3e
… definitions

Customized Doxygen configuration named DoxyfileTests is added, as well as separate header file
used for definition of test groups.
…o include #if and #ifdef sections in documentation
Copy link
Contributor

@chrispaquot chrispaquot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall comments here:
I don't see loops of memcpy for example. Usually we like to loop and average out
I also don't see warm up calls. If you benchmark hipMemcpy, you should call it once at least before timing it for good measure.

@milos-mozetic
Copy link
Contributor Author

milos-mozetic commented Jan 24, 2023

Overall comments here: I don't see loops of memcpy for example. Usually we like to loop and average out I also don't see warm up calls. If you benchmark hipMemcpy, you should call it once at least before timing it for good measure.

Hello @chrispaquot, please check this file here: https://github.com/ROCm-Developer-Tools/hip-tests/pull/119/files#diff-5e0a673157e5c8f6a3c1cd2490e0039de1b789e42040f6f739b85575ca4c635a. You can see that within the Run method of a Benchmark class, the API that is being benchmarked is called several times for warmup and afterwards several times in the loop, which is used for timing and time metrics calculation.
This number of warmups and loop iterations used for benchmark metrics can be configured using command line arguments, as requested from AMD. Please check this file here: https://github.com/ROCm-Developer-Tools/hip-tests/pull/119/files#diff-c76ce2197e6cd6361be77f86ff31e6f31e7fd53df6bb88112d6f551aff7f8973.

@chrispaquot
Copy link
Contributor

OK that sounds good with one more question: Iterations is going to be the same for all test cases then? Warmup would most likely be 1 but iterations can change from a test case to another.

@milos-mozetic
Copy link
Contributor Author

milos-mozetic commented Jan 26, 2023

OK that sounds good with one more question: Iterations is going to be the same for all test cases then? Warmup would most likely be 1 but iterations can change from a test case to another.

Yes, the number of iterations will be the same for all test cases within one logical group/module that will be executed. The request was to configure the number of warmups/iterations through command line arguments. You can configure specific number of warmups/iterations for each test case using Configure method from Benchmark class.

Copy link
Contributor

@gandryey gandryey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bit of problem. The current warm-up/iterations will run all allocations and destructions also. Warmup means it touches all resources and executes exactly the same commands, used in the final measurement. "Iterations" means the execution of the same commands in a loop multiple times with the same resources, used in a warm-up.

      TIMED_SECTION_STREAM(kTimerTypeEvent, stream) {
        HIP_CHECK(hipMemcpy2DAsync(host_allocation.ptr(), device_allocation.width(),
                                   device_allocation.ptr(), device_allocation.pitch(),
                                   device_allocation.width(), device_allocation.height(),
                                   hipMemcpyDeviceToHost, stream));
      }

@milos-mozetic
Copy link
Contributor Author

There is a bit of problem. The current warm-up/iterations will run all allocations and destructions also. Warmup means it touches all resources and executes exactly the same commands, used in the final measurement. "Iterations" means the execution of the same commands in a loop multiple times with the same resources, used in a warm-up.

      TIMED_SECTION_STREAM(kTimerTypeEvent, stream) {
        HIP_CHECK(hipMemcpy2DAsync(host_allocation.ptr(), device_allocation.width(),
                                   device_allocation.ptr(), device_allocation.pitch(),
                                   device_allocation.width(), device_allocation.height(),
                                   hipMemcpyDeviceToHost, stream));
      }

Review comment addressed within 9ccdf02 and 96bb38d. Allocation guards are moved outside of warmups/iterations, so that the same commands are executed on the same resources. @gandryey, can you please review changes?

@rakesroy
Copy link
Contributor

PR has been merged into develop branch via commit e3bac85.

@rakesroy rakesroy closed this Feb 26, 2024
rocm-ci pushed a commit that referenced this pull request Feb 26, 2024
Change-Id: Ib04fe4dd3efce92d7c7bfc8f0c75abd8e9dfe7be
rocm-ci pushed a commit that referenced this pull request Feb 26, 2024
- #119
- #151
- #57
- #58
- #59
- #60
- #99
- #139
- #152
- #48
- #54
- #53
- #24
- #23
- #22
- #21
- #20
- #14
- #8

Change-Id: I1eea54cd1436f3ddbfd5c1b3b2f672eb81d03cd4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants