[src] use CL_PROFILING_COMMAND_END as latency time #67

alohali · 2020-04-16T02:14:54Z

CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_QUEUED is the real kernel latency.

alohali · 2020-04-16T02:26:13Z

Is it more accurate to test kernel latency with CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_QUEUED and run an extremely small kernel?
I see a >20 us difference on several ARM Mali GPU devices.

krrishnarraj · 2020-04-16T13:18:15Z

Thanks.
I agree with the small kernel part.
I am seeing more latency for cpu platforms like pocl. How can 'CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_QUEUED' give better accuracy wrt CL_PROFILING_COMMAND_START?

alohali · 2020-04-23T01:11:26Z

> Thanks.
> I agree with the small kernel part.
> I am seeing more latency for cpu platforms like pocl. How can 'CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_QUEUED' give better accuracy wrt CL_PROFILING_COMMAND_START?

Because kernel launch latency includes pre-launch latency, post-launch latency, and other execution overhead. CL_PROFILING_COMMAND_START - CL_PROFILING_COMMAND_QUEUED only measures the pre-launch part, not the post-launch part. CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_QUEUED includes both. With an extremely small kernel, the real kernel execution time is almost zero.
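To make the difference between the two metrics concrete, here is a minimal sketch that only does the arithmetic on profiling timestamps. It assumes the values were already read with clGetEventProfilingInfo (which reports nanoseconds); the function name `launch_latency_us` and the sample timestamps are made up for illustration and are not taken from this PR or from any device.

```python
NS_PER_US = 1_000  # OpenCL profiling timestamps are reported in nanoseconds


def launch_latency_us(queued_ns, start_ns, end_ns):
    """Return (start - queued, end - queued) in microseconds.

    The first value covers only the pre-launch part; the second also
    includes whatever happens between START and END.
    """
    pre_launch_only = (start_ns - queued_ns) / NS_PER_US
    queued_to_end = (end_ns - queued_ns) / NS_PER_US
    return pre_launch_only, queued_to_end


if __name__ == "__main__":
    # Hypothetical timestamps for a near-empty kernel (not measured data).
    pre, total = launch_latency_us(1_000_000, 1_040_000, 1_065_000)
    print(f"START - QUEUED: {pre} us")
    print(f"END   - QUEUED: {total} us")
```

If the kernel body does almost nothing, the gap between the two numbers is mostly overhead rather than compute, which is the part START - QUEUED leaves out.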

nchristensen · 2022-10-07T18:46:10Z

From https://stackoverflow.com/questions/39924433/opencl-events-ambiguity it seems to me that CL_PROFILING_COMMAND_START - CL_PROFILING_COMMAND_SUBMIT is the pre-execution latency. CL_PROFILING_COMMAND_COMPLETE was added in OpenCL 2.0. I'm guessing CL_PROFILING_COMMAND_COMPLETE - CL_PROFILING_COMMAND_END is the post-execution latency.

There may also be a lower bound on CL_PROFILING_COMMAND_END - CL_PROFILING_COMMAND_START, which might be another form of latency.

So CL_PROFILING_COMMAND_COMPLETE - CL_PROFILING_COMMAND_SUBMIT on a very small kernel may be a way to measure the latency.
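A rough sketch of that breakdown, splitting one event's timestamps into consecutive phases. It assumes the timestamps are already available in a dict keyed by the suffix of each CL_PROFILING_COMMAND_* name; the helper `phase_breakdown_ns` and the sample numbers are hypothetical. COMPLETE is treated as optional since it only exists from OpenCL 2.0 onward.

```python
# Order in which the profiling timestamps are expected to occur.
PHASE_ORDER = ["QUEUED", "SUBMIT", "START", "END", "COMPLETE"]


def phase_breakdown_ns(timestamps):
    """Return the duration in ns between each pair of adjacent timestamps.

    `timestamps` maps a CL_PROFILING_COMMAND_* suffix to a value in
    nanoseconds. Missing entries (e.g. COMPLETE before OpenCL 2.0) are skipped.
    """
    present = [name for name in PHASE_ORDER if name in timestamps]
    return {
        f"{a}->{b}": timestamps[b] - timestamps[a]
        for a, b in zip(present, present[1:])
    }


if __name__ == "__main__":
    # Hypothetical values for a very small kernel (not measured data).
    sample = {"QUEUED": 0, "SUBMIT": 10_000, "START": 30_000,
              "END": 31_000, "COMPLETE": 36_000}
    for phase, ns in phase_breakdown_ns(sample).items():
        print(f"{phase}: {ns} ns")
```

Summing the phases from SUBMIT onward gives COMPLETE - SUBMIT, the quantity suggested above; adding QUEUED->SUBMIT gives the full span from enqueue to completion.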

alohali added 2 commits April 15, 2020 19:12

[src] use CL_PROFILING_COMMAND_END as latency time

0615ec6

[src]use smaller wgs, lgs for kernel latency

9491966
