Added submit_keep_args_alive #1395
3a92246 to 19b628e
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1395/index.html
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_9 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_10 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_12 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_13 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_15 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_16 ran successfully.
6daca36 to 1b393d4
Array API standard conformance tests for dpctl=0.15.0rc1=py310ha25a700_14 ran successfully.
4c8ae59 to bb473da
Array API standard conformance tests for dpctl=0.15.0rc1=py310ha25a700_30 ran successfully.
bb473da to 7c51f2f
Array API standard conformance tests for dpctl=0.15.0rc2=py310ha25a700_17 ran successfully.
11b92bb to 7c51f2f
Array API standard conformance tests for dpctl=0.15.0rc2=py310ha25a700_24 ran successfully.
Array API standard conformance tests for dpctl=0.15.0rc2=py310ha25a700_25 ran successfully.
7c51f2f to 7f79887
Array API standard conformance tests for dpctl=0.15.0rc3=py310ha25a700_13 ran successfully.
7f79887 to e303eaa
This is an adaptation of the pipelining technique shared by @mbecker in https://github.com/IntelPython/numbda_dpex/issues/147. It is built to work with the async-ref-count-increment branch (IntelPython/dpctl#1395), which implements asynchronous memcpy, asynchronous submit, and asynchronous keep_args_alive task submission.
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_16 ran successfully.
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_17 ran successfully.
f822827 to ed67ac8
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_18 ran successfully.
ed67ac8 to 802ead7
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_19 ran successfully.
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_44 ran successfully.
5b3643c to cebd7f9
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_53 ran successfully.
Usage:

    q = dpctl.SyclQueue()
    ...
    e = q.submit(krn, args, ranges)
    ht_e = q._submit_keep_args_alive(args, [e])
    ....
    ht_e.wait()

Instead, the task of managing Python object lifetime is delegated to the user via the `_submit_keep_args_alive` method.
`SyclQueue.submit` has become synchronizing, although it still returns a `SyclEvent` (with execution_status always complete).
This is the copy operation where one can specify the list of events that must complete before the copy starts executing.

    DPCTLQueue_MemcpyWithEvents(
        __dpctl_keep DPCTLSyclQueueRef QRef,
        void *dst,
        const void *src,
        size_t nbytes,
        const DPCTLSyclEventRef *depEvents,
        size_t nDE
    )

Uses this function in tests.
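The idea of a copy that waits on a list of dependency events can be modeled in plain Python. The sketch below is only an analogy: it uses `concurrent.futures` futures in place of SYCL events, and the helper name `memcpy_with_events` is hypothetical, not a dpctl function.

```python
from concurrent.futures import ThreadPoolExecutor


def memcpy_with_events(executor, dst, src, nbytes, dep_events=()):
    """Hypothetical helper: copy nbytes from src to dst after dep_events finish.

    Futures stand in for SYCL events here; this is not dpctl's API.
    """
    deps = list(dep_events)

    def task():
        # Block until every dependency has completed (re-raises its error, if any).
        for ev in deps:
            ev.result()
        dst[:nbytes] = src[:nbytes]

    # The returned future plays the role of the event for this copy.
    return executor.submit(task)


with ThreadPoolExecutor(max_workers=2) as pool:
    staging = bytearray(4)
    result = bytearray(4)
    e1 = memcpy_with_events(pool, staging, b"abcd", 4)
    # The second copy reads staging, so it must wait for the first copy.
    e2 = memcpy_with_events(pool, result, staging, 4, [e1])
    e2.result()
```

Passing `[e1]` as the dependency list is what orders the two copies; without it the second copy could read `staging` before it is filled.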
Also extends `dpctl.SyclQueue.memcpy` to allow arguments to be objects that expose buffer protocol, allowing `dpctl.SyclQueue.memcpy` and `dpctl.SyclQueue.memcpy_async` to be used to copy from/to USM-allocation or host buffer.
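For readers unfamiliar with the buffer protocol, here is a minimal pure-Python sketch of what accepting "any object that exposes a buffer" looks like on the host side. The function `copy_bytes` is a hypothetical illustration built on `memoryview`; it does not touch USM memory and is not how dpctl implements `memcpy`.

```python
import array


def copy_bytes(dst, src, nbytes):
    """Hypothetical helper: copy nbytes between two buffer-protocol objects."""
    # View both objects as flat sequences of unsigned bytes.
    src_view = memoryview(src).cast("B")
    dst_view = memoryview(dst).cast("B")
    if dst_view.readonly:
        raise TypeError("destination buffer is read-only")
    if nbytes > len(src_view) or nbytes > len(dst_view):
        raise ValueError("nbytes exceeds the size of a buffer")
    dst_view[:nbytes] = src_view[:nbytes]


# Any two contiguous buffer exporters work, e.g. array.array and bytearray.
src = array.array("i", [1, 2, 3, 4])
dst = bytearray(len(src) * src.itemsize)
copy_bytes(dst, src, len(dst))

# Copy back into a fresh array to show the round trip.
out = array.array("i", [0, 0, 0, 0])
copy_bytes(out, dst, len(dst))
```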
```
In [9]: timer = dpctl.SyclTimer()

In [10]: with timer(q):
    ...:     y = dpt.linspace(1, 2, num=10**6, sycl_queue=q)
    ...:

In [11]: timer.dt
Out[11]: (0.0022024469999450957, 0.002116712)

In [12]: with timer(q):
    ...:     x = dpt.linspace(0, 1, num=10**6, sycl_queue=q)
    ...:

In [13]: timer.dt
Out[13]: (0.004531950999989931, 0.004239664000000001)
```
The object can unpack into a tuple, like before, but it prints with an annotation of what each number means, and provides named getters.

    with timer(q):
        code
    dur = timer.dt
    print(dur)      # outputs (host_dt=..., device_dt=...)
    dur.host_dt     # get host-timer delta
    dur.device_dt   # get device-timer delta
    hdt, ddt = dur  # unpack into a tuple

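A minimal sketch of an object with this behavior is shown below. The class name `HostDeviceDuration` and its internals are assumptions made for illustration; only the `host_dt` and `device_dt` getters, the printed form, and tuple unpacking come from the description above.

```python
class HostDeviceDuration:
    """Hypothetical stand-in for the object returned by ``timer.dt``."""

    def __init__(self, host_dt, device_dt):
        self._host_dt = host_dt
        self._device_dt = device_dt

    @property
    def host_dt(self):
        # Host-timer delta
        return self._host_dt

    @property
    def device_dt(self):
        # Device-timer delta
        return self._device_dt

    def __iter__(self):
        # Yielding both values keeps ``hdt, ddt = dur`` working.
        yield self._host_dt
        yield self._device_dt

    def __repr__(self):
        return f"(host_dt={self._host_dt}, device_dt={self._device_dt})"


dur = HostDeviceDuration(0.0022, 0.0021)
hdt, ddt = dur  # unpacks like the old tuple
print(dur)      # annotated form
```

Implementing `__iter__` rather than subclassing `tuple` is one possible design; either would keep existing unpacking code working.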
I tested |
cebd7f9 to 10722d4
Array API standard conformance tests for dpctl=0.15.1dev0=py310ha25a700_56 ran successfully.
Added `dpctl.SyclQueue._submit_keep_args_alive(args, events)` that increments the reference count of the `args` object (typically a sequence of arguments an asynchronous task is operating on), ensuring that the `args` object is not garbage collected until after `events` signal that the tasks working on these objects have completed their execution.