Add asynchronous concurrent execution #3687

Open
wants to merge 2 commits into base: docs/develop

matyas-streamhpc

No description provided.

matyas-streamhpc self-assigned this on Nov 25, 2024
neon60 force-pushed the async-doc branch 2 times, most recently from 1484d67 to f81588d, on December 2, 2024 08:46
neon60 marked this pull request as ready for review on December 2, 2024 08:53
neon60 force-pushed the async-doc branch 4 times, most recently from fd5af51 to 6a139c6, on December 6, 2024 18:18

randyh62 left a comment

Left comments. Looks good overall.

docs/how-to/hip_runtime_api/asynchronous.rst: 6 outdated review threads, resolved
or from the GPU concurrently with kernel execution. Applications can query this
capability by checking the ``asyncEngineCount`` device property. Devices with
an ``asyncEngineCount`` greater than zero support concurrent data transfers.
Additionally, if host memory is involved in the copy, it should be page-locked

Is there a reference we can provide here, such as Memory Management?

I think my comment was related to the "if host memory is involved in the copy, it should be page-locked to ensure optimal performance" text. This could use a reference to additional details, or an explanation of how page-locked memory ensures optimal performance.

Author

Sorry, I misunderstood you.

You are right. I extended the text and added a link to host memory.
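
For reference, the capability check described in the quoted documentation text can be sketched as follows. This is a minimal illustration, not code from the pull request: `OverlapSupport`, `classifyOverlap`, `reportOverlapSupport`, and the `SKETCH_HAVE_HIP` macro are hypothetical names, and the HIP-specific part is only compiled when the HIP headers are found. It relies on `hipGetDeviceProperties` and the `asyncEngineCount` and `concurrentKernels` fields of `hipDeviceProp_t`.

```cpp
#include <cstdio>

// Use the real HIP runtime when its headers are available (for example when
// building with hipcc); otherwise only the pure helper below is compiled.
#if defined(__has_include)
#  if __has_include(<hip/hip_runtime.h>)
#    include <hip/hip_runtime.h>
#    define SKETCH_HAVE_HIP 1
#  endif
#endif

// Hypothetical summary type, not part of HIP.
struct OverlapSupport {
    bool copyWithKernel;     // true when asyncEngineCount > 0
    bool concurrentKernels;  // true when concurrentKernels is non-zero
};

// Hypothetical helper: interprets the two device properties named in the
// review thread. It only encodes the rule from the quoted text.
inline OverlapSupport classifyOverlap(int asyncEngineCount, int concurrentKernels) {
    return OverlapSupport{asyncEngineCount > 0, concurrentKernels != 0};
}

#ifdef SKETCH_HAVE_HIP
// Queries the properties of `device` and prints what the device reports.
// Returns false if the property query fails.
inline bool reportOverlapSupport(int device) {
    hipDeviceProp_t props;
    if (hipGetDeviceProperties(&props, device) != hipSuccess) {
        std::fprintf(stderr, "hipGetDeviceProperties failed for device %d\n", device);
        return false;
    }
    const OverlapSupport s =
        classifyOverlap(props.asyncEngineCount, props.concurrentKernels);
    std::printf("copy/kernel overlap: %s, concurrent kernels: %s\n",
                s.copyWithKernel ? "yes" : "no",
                s.concurrentKernels ? "yes" : "no");
    return true;
}
#endif
```

In a program built with `hipcc`, calling `reportOverlapSupport(0)` from `main` would print what device 0 reports. Keeping the interpretation in `classifyOverlap` separate from the query makes the rule easy to check without a GPU.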


It is also possible to perform intra-device copies simultaneously with kernel
execution on devices that support the ``concurrentKernels`` device property
and/or with copies to or from the device (for devices that support the

Suggested change
and/or with copies to or from the device (for devices that support the
and/or with copies to or from the device (for devices that support the

Are copies to or from the device intra-device copies?

Author

No, an intra-device copy is a data transfer within the same device.

Thank you for drawing my attention to it. The text was misleading, so I rephrased it.

docs/how-to/hip_runtime_api/asynchronous.rst: 2 outdated review threads, resolved
3 participants