"Guest mode", for cohabitation with Qt etc. #1551
Codecov Report
```diff
@@            Coverage Diff            @@
##           master    #1551     +/-  ##
=========================================
+ Coverage   99.68%   99.69%    +0.01%
=========================================
  Files         110      111        +1
  Lines       13498    13865      +367
  Branches     1027     1059       +32
=========================================
+ Hits        13455    13823      +368
+ Misses         28       27        -1
  Partials       15       15
```
## Basic outline

### Changes to I/O backends

We split the old `handle_io` into two parts: `get_events`, which blocks waiting for I/O and doesn't touch any shared state, and `process_events`, which updates Trio's state and has to run in the main thread.

I also had to add a `force_wakeup` method, which is largely redundant with the existing re-entry task system, but it seemed bad to depend on the re-entry task always running... maybe there's some way to merge these, I'm not sure.

### Changes to core run loop

The core run loop stays the same, except that we replace:

```python
runner.io_manager.handle_io(timeout)
```

with

```python
events = yield timeout
runner.io_manager.process_events(events)
```

So that means our core run loop is now a generator (!), and it requires some driver code to read out the yielded timeouts, do the actual I/O waiting, and send the resulting events back in.

The normal `trio.run` driver is simple: it just calls `io_manager.get_events(timeout)` directly and sends the events back in.

The "guest mode" driver is a bit cleverer: it arranges to push the `get_events` call into a worker thread, and once the events are available it uses `run_sync_soon_threadsafe` to schedule the next scheduler tick on the host loop.

We keep a strict alternation: either the I/O thread is running looking for I/O, or the Trio scheduler is running (or waiting to run) on the host loop. You never have both running at once.

Originally I was imagining something a bit cleverer, where the I/O thread runs continuously with infinite timeout, and just streams back I/O events to Trio as they happen. The problem with this is that if we're not using the I/O thread to handle timeouts, then timeouts have to be managed on the main thread instead, via the host loop's timer facilities.

Oh right, host-triggered scheduler ticks: we want to allow the host loop code to invoke synchronous Trio APIs, like `trio.Event.set`. The hack I used to make it work is: if a task becomes scheduled, or the next deadline changes, at a time when the I/O thread is running, then we force the I/O thread to wake up immediately, which triggers a scheduler tick, which will pick up whatever changes were made by the host. This is the most awkward part of this approach, and why I had to add the new `force_wakeup` method.

(Note: I'm not 100% sure I got all the details of this state machine correct yet...)

However, even in the "cleverer" approach where timeouts are managed on the main thread, you still need the equivalent of a `force_wakeup` to handle these host-triggered ticks.

Anyway, this approach just seemed simpler all around, and hopefully not too inefficient. There are a handful of extra syscalls and thread transitions compared to the "fancy" approach, but AFAICT it ought to be unnoticeable in practice. Though we should confirm empirically :-)
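To make the alternation concrete, here is a minimal sketch of what such a driver could look like. The glue names (`guest_tick`, `unrolled_run_gen`, `unrolled_run_next_send`) follow the ones that appear in the review snippets below, but this is an illustration of the idea, not the PR's actual code:

```python
import threading
import outcome

def guest_tick(runner):
    # One scheduler tick: resume the run-loop generator, which yields the
    # timeout for the next I/O wait.
    timeout = runner.unrolled_run_gen.send(runner.unrolled_run_next_send)

    def io_thread():
        # Blocking I/O wait, off the main thread; captured as an outcome so
        # exceptions survive the trip back.
        events = outcome.capture(runner.io_manager.get_events, timeout)

        def in_main_thread():
            runner.unrolled_run_next_send = events.unwrap()
            guest_tick(runner)

        # Hand control back to the host loop; the scheduler only runs there.
        runner.run_sync_soon_threadsafe(in_main_thread)

    # The real PR reuses a cached worker thread (#1545) instead of spawning
    # a fresh one per tick.
    threading.Thread(target=io_thread, daemon=True).start()
```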
On further thought, I guess signals are not a big deal. The tricky bit is `signal.set_wakeup_fd`: only one wakeup fd can be registered per process, and the host loop may have already claimed it.

So all we have to do is check whether an fd is already registered, and if it is then skip registering ours. Still no idea what to do about control-C though; I guess that's tightly tied into how people want to coordinate the lifetime of the two loops.
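For what it's worth, `signal.set_wakeup_fd` returns the previously registered fd, so the "check whether an fd is already registered" dance can look something like this sketch (`my_wakeup_fd` stands in for Trio's own wakeup socketpair fd):

```python
import signal

def maybe_register_wakeup_fd(my_wakeup_fd):
    # set_wakeup_fd returns the old fd, or -1 if none was registered.
    old_fd = signal.set_wakeup_fd(my_wakeup_fd)
    if old_fd != -1:
        # The host loop got there first: restore its fd and skip ours.
        signal.set_wakeup_fd(old_fd)
        return False
    return True
```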
Ha, it turns out that, like for everything involving async, Twisted also experimented with this kind of approach. They called it their `threadedselectreactor`.

Apparently they ran into some not-so-obvious problems with it – for example, downloads on Windows couldn't achieve line throughput, which was awkward since fancy download apps are one of the most obvious use cases for combining a GUI library + advanced networking library.

There's also a lot that's different between our approach and theirs, though – different decade, use of IOCP rather than select on Windows, etc.

Conversation with glyph:
I'm a huge fan of this approach -- it's much simpler than I was imagining this sort of thing would need to be.
Should we support multiple run calls simultaneously in the same host loop? There's some logic to that, but it would make accessing Trio from the host substantially more complicated...
I don't think it should be the default. Maybe an option to `start_guest_run()` specifying whether to register the runner in the global run context or not. If you choose "not", you gain the ability to support multiple runs, in exchange for needing to use `TrioToken.run_sync_soon` to do everything. (And return the `TrioToken` to make this possible.)
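To make the shape of that proposal concrete, here is a sketch – `register_global_run_context` is a hypothetical flag, returning the token is part of the suggestion above rather than the PR's current behavior, and `trio_main`/`host_schedule`/`on_done` are stand-ins; only `TrioToken.run_sync_soon` is an existing API:

```python
import trio

# Hypothetical opt-out: don't register this run globally, and rely on the
# returned TrioToken as the only handle into it.
token = trio.lowlevel.start_guest_run(
    trio_main,                                # stand-in for your main task
    run_sync_soon_threadsafe=host_schedule,   # stand-in host-loop scheduler
    done_callback=on_done,                    # stand-in completion callback
    register_global_run_context=False,        # hypothetical flag
)

# Everything from the host goes through the token:
token.run_sync_soon(print, "hello from inside this particular guest run")
```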
> I also had to add a `force_wakeup` method, which is largely redundant with the existing re-entry task system, but it seemed bad to depend on the re-entry task always running... maybe there's some way to merge these, I'm not sure
One interesting thing this raised for me is that `force_wakeup` can be more efficient than `WakeupSocketpair`, because it's able to use backend-specific waking mechanisms. Maybe we can use that sort of thing for `run_sync_soon` wakeups too? I created #1554 to track this.
> Originally I was imagining something a bit cleverer, where the I/O thread runs continuously with infinite timeout, and just streams back I/O events to Trio as they happen.
Yeah, I don't think that approach is worth the substantial complexity increase. The one you've pursued in this PR is a lot easier to reason about for anyone familiar with the "normal" straight-through run loop procedures.
> Check how much overhead this adds to regular `trio.run` mode
If this turns out to be a blocker, note that you can use basically the same trick as `greenback` to write a "generator" in which you can yield from a callee. I think the `greenlet` library even includes this as one of their examples. That would probably make "guest mode" slower, but would impose less overhead on normal straight-through mode. (And "guest mode" would depend on `greenlet`.)
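A minimal sketch of that greenlet trick, showing how a plain function can "yield" to driver code without being a generator (the names here are illustrative):

```python
import greenlet

def run_loop_body():
    # An ordinary function, arbitrarily deep in the call stack, can still
    # suspend by switching back to the parent greenlet...
    events = greenlet.getcurrent().parent.switch("next-timeout")
    # ...and resumes here when the parent switches back in.
    return events

glet = greenlet.greenlet(run_loop_body)
timeout = glet.switch()          # runs until the inner switch(); like next()
result = glet.switch("events")   # resumes with a value; like gen.send()
print(timeout, result)           # -> next-timeout events
```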
One thing I'm excited about with this diff is that it should be possible to make `trio-asyncio` "just work" without needing to declare a loop, no matter whether you start in Trio mode or asyncio mode. That's a big win for allowing Trio libraries to be used in asyncio programs; currently only the reverse can really be said to be well-supported.
trio/_core/_run.py (outdated diff):

```python
assert self.is_guest
try:
    timeout = self.unrolled_run_gen.send(self.unrolled_run_next_send)
except StopIteration:
```
Will need to handle other exceptions here somehow, probably by wrapping them in `TrioInternalError`.
Yeah, so this is a general question... in `trio.run`, if there's some unexpected crash, we convert it to a `TrioInternalError` and raise it, cool. In guest mode, I'm not actually sure what the desired behavior even is.

My cop-out in this first draft was to let any unexpected internal errors escape from the host loop callback, so the host loop could handle them... however it does that. (Probably dump them on the console and continue on.) That's not totally unreasonable, but maybe there's something better.
I guess ideally, we would call the completion callback with the `TrioInternalError`?
Yep, "call the completion callback with the TrioInternalError" is what I was imagining. It seems closest to what normal Trio does. Relying on the host loop to do something reasonable with uncaught exceptions seems like a stretch to me based on standard practice in the field...
FWIW, I override `sys.excepthook` in my PyQt programs and present a dialog. Lots of people, I suspect, just lose the exceptions and painfully debug.
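For anyone who hasn't seen the pattern, a minimal version of that excepthook-plus-dialog trick looks something like this (a sketch, not code from the PR):

```python
import sys
import traceback
from PyQt5 import QtWidgets

def excepthook(etype, value, tb):
    # Show the traceback in a dialog instead of silently losing it.
    text = "".join(traceback.format_exception(etype, value, tb))
    QtWidgets.QMessageBox.critical(None, "Unhandled exception", text)

sys.excepthook = excepthook
```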
trio/_core/_run.py (outdated diff):

```python
def deliver(events_outcome):
    def in_main_thread():
        self.unrolled_run_next_send = events_outcome.unwrap()
```
Maybe worth storing the outcome directly and using `unrolled_run_next_send.send(unrolled_run_gen)` to resume? That way you get to handle exceptions through the same path as above.
That would add a capture/unwrap cycle to regular mode, but maybe that doesn't matter.
You're unwrapping in the driver, not inside the generator (so it will resume with `throw()` if `get_events()` raised an exception). Regular mode doesn't need to throw I/O exceptions into the generator, since it's just going to exit with a `TrioInternalError` anyway. But here the I/O exception would be raised out of the callback running in the host event loop, and the likelihood of getting it propagated properly is not good.
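A small self-contained illustration of the difference, using the `outcome` library Trio already depends on – `Outcome.send()` resumes a generator with either `send()` or `throw()` depending on whether it holds a value or an error (the generator and failing function here are made up for the example):

```python
import outcome

def mini_run_loop():
    try:
        events = yield "timeout"
    except OSError as exc:
        # With Outcome.send(), an I/O failure arrives here via throw()...
        print("I/O wait failed inside the generator:", exc)
        return
    print("got events:", events)

def failing_get_events():
    raise OSError("epoll_wait exploded")

gen = mini_run_loop()
gen.send(None)  # prime the generator up to its first yield
events_outcome = outcome.capture(failing_get_events)
try:
    events_outcome.send(gen)  # an Error outcome resumes via gen.throw()
except StopIteration:
    pass
```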
trio/_core/_run.py (outdated diff):

```python
if events or timeout <= 0:
    self.unrolled_run_next_send = events
    self.guest_tick_scheduled = True
    self.run_sync_soon_threadsafe(self.guest_tick)
```
Maybe worth a comment here that if you run into throughput problems, you can get more Trio time per unit of wallclock time by changing this to do multiple run loop ticks per wakeup, up to some total time limit.
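Something like the following, as a sketch of that batching idea (the names mirror the snippet above; the `BUDGET` value and resuming with an empty event list are assumptions):

```python
import time

BUDGET = 0.01  # max seconds of Trio work per host-loop callback (assumed)

def guest_tick_batched(runner, events):
    deadline = time.monotonic() + BUDGET
    timeout = runner.unrolled_run_gen.send(events)
    # While tasks are immediately runnable, keep ticking instead of paying a
    # host-loop round trip per tick.
    while timeout <= 0 and time.monotonic() < deadline:
        timeout = runner.unrolled_run_gen.send([])  # no new I/O events
    # ...then hand off to the I/O thread / host loop as usual...
```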
To allow Trio to efficiently cohabitate with arbitrary other loops.
That comment is great, thank you!
Interesting point. Actually, thinking about this, I don't think we'd even need greenlet... the greenlet code would look something like:

```python
timeout = ...
if runner.is_guest:
    events = greenlet_magic_yield(timeout)
else:
    events = io_manager.get_events(timeout)
io_manager.process_events(events)
...
```

...But if we decide all the yield/resumes are a problem, we could do that just as easily with a regular `if`.
(Of course this would mean that the regular mode would have to do the `is_guest` check on every pass through the loop.)
More wisdom from glyph:
OK, I think I've rearranged enough stuff now that we should be converting any internal errors into `TrioInternalError`. We're also now calling the completion callback with it, per the discussion above. Still need to figure out control-C handling.
Well, that's a fun new CI breakage: MagicStack/immutables#46. I guess we might have to disable the …
After making a working demo with Qt, I tried to copy over some code I had for asyncio and twisted that allowed awaiting on Qt signals, and got an error. I boiled it down to a small failing example, below. (I give up, I've already pulled updates three times while typing out this comment.)

Failable Qt example:

```python
import functools
import sys
import threading

import attr
import trio
import PyQt5.QtCore
import PyQt5.QtWidgets


class Runner(PyQt5.QtCore.QObject):
    signal = PyQt5.QtCore.pyqtSignal(object)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.signal.connect(self.slot)

    def run(self, f):
        print('Runner.run()', threading.get_ident())
        self.signal.emit(f)

    def slot(self, f):
        # TODO: no handling of context argument
        print('Runner.slot()', threading.get_ident())
        f()


@attr.s(auto_attribs=True)
class QTrio:
    application: PyQt5.QtWidgets.QApplication
    widget: PyQt5.QtWidgets.QTextEdit
    runner: Runner

    def run(self):
        PyQt5.QtCore.QTimer.singleShot(0, self.start_trio)
        return self.application.exec()

    async def main(self):
        print('QTrio.main()', threading.get_ident())
        self.widget.show()

        fail = True

        for i in range(10):
            if fail:
                print('a1')
                event = trio.Event()
                print('b1')
                timer = PyQt5.QtCore.QTimer.singleShot(1000, event.set)
                print('c1')
                await event.wait()
                print('d1')
            else:
                print('a2')
                await trio.sleep(1)
                print('b2')
            print('e')
            self.widget.append('{}'.format(i))
            print('f')

    def start_trio(self):
        # TODO: it feels a bit odd not getting anything back here. is there
        # really no object worth having access to?
        trio.lowlevel.start_guest_run(
            self.main,
            run_sync_soon_threadsafe=self.runner.run,
            done_callback=self.trio_done,
        )

    def trio_done(self, outcome):
        print('---', repr(outcome))
        self.application.quit()


def main():
    print('main()', threading.get_ident())

    qtrio = QTrio(
        application=PyQt5.QtWidgets.QApplication(sys.argv),
        widget=PyQt5.QtWidgets.QTextEdit(),
        runner=Runner(),
    )
    return qtrio.run()


main()
```

From 9018066:

```
RecursionError: maximum recursion depth exceeded while calling a Python object
```

From b131958:

```
UnboundLocalError: local variable 'timeout' referenced before assignment
```
Ah, the recursion error is because you need to pass `PyQt5.QtCore.Qt.QueuedConnection` when connecting the signal – with the default connection type, emitting from the same thread invokes the slot synchronously, so the `run_sync_soon_threadsafe` callback re-enters the scheduler instead of being deferred.

And I think the unbound-local error is because I added some code to report unexpected errors like your RecursionError, and that new code path has a bug where `timeout` never gets assigned.
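If that inference is right, the corresponding one-line change in the example's `Runner.__init__` would look like this (my guess based on Qt's connection-type semantics, not quoted from the PR):

```python
# Force queued delivery: emit() then never calls the slot synchronously,
# even when the signal is emitted from the same thread.
self.signal.connect(self.slot, PyQt5.QtCore.Qt.QueuedConnection)
```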
So after thinking about it a while, here's where I'm leaning on control-C handling (this is what's currently implemented in this PR):

We strongly recommend that Trio drive the lifetime of the surrounding loop, since... there's not really any better option. You can't have the surrounding loop directly drive Trio's lifetime, since Trio needs to run until the main task finishes. (You can have the surrounding loop nudge the main task to finish, though, e.g. by cancelling stuff. But the actual shutdown sequence is still going to be Trio stops → host loop stops → program exits.)

Most loops have terrible stories for control-C: e.g. Python Qt apps tend to ignore it (you can find lots of discussions about this on the internet), asyncio tends to explode messily, etc.

Putting these two things together, it seems like most users will want Trio to take the SIGINT handler and apply its normal control-C handling.

There are a few loops that do do something sensible with control-C, e.g. Twisted registers a signal handler that triggers the loop shutdown procedure. But I think this is fine: if the host loop has already installed its own handler, our standard logic notices that the handler isn't the default one and leaves it alone.

So I think we don't even need any configuration knobs here... using our standard control-C logic in guest mode pretty much does the best thing possible in all cases.
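The "leave it alone" behavior falls out of the standard installation check, which is roughly the following (a sketch of the logic, not Trio's exact source):

```python
import signal
import threading

def should_install_sigint_handler():
    # Only take SIGINT if we're on the main thread and nobody (e.g. the
    # host loop) has already replaced the default handler.
    return (
        threading.current_thread() is threading.main_thread()
        and signal.getsignal(signal.SIGINT) is signal.default_int_handler
    )
```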
To check for added overhead to regular `trio.run` mode, I used this microbenchmark:

```python
import time
import trio

LOOP_SIZE = 10_000_000

async def main():
    start = time.monotonic()
    for _ in range(LOOP_SIZE):
        await trio.lowlevel.cancel_shielded_checkpoint()
    end = time.monotonic()
    print(f"{LOOP_SIZE / (end - start):.2f} schedules/second")

trio.run(main)
```

On my laptop, with current master:

With current master + this PR:

So the walltime is ~1% slower, and it used ~1% more instructions. (I counted these mainly to cross-check against walltime, in case of stuff like weird thermal effects and CPU frequency changing.) This is a very focused microbenchmark on a program that's really doing nothing except running through the scheduler as fast as it can. E.g. if you replace the no-op checkpoint with any real work, the scheduler overhead becomes an even smaller fraction of the total.
Just a few remaining cosmetic issues and I think this is ready to merge! Please also update the title and PR description to reflect the fact that it's not a WIP anymore.
Addressed all review comments.
This small bug was introduced in python-triogh-1551. Fixes python-triogh-1621.
This adds `trio.lowlevel.start_guest_run`, which initiates a call to `trio.run` "on top of" some other "host" loop. It uses the regular Trio I/O backends, so all features are supported. It does not use polling, so it's efficient and doesn't burn CPU. All you have to do is provide a threadsafe way to schedule a callback onto the host loop.

Since Trio code and the host loop are sharing the same main thread, it's safe for Trio code to call synchronous host loop functions, and for host code to call synchronous Trio functions.

~~This is an early draft. It has no docs or tests, and I've only tried the epoll mode.~~ This is ready for review/merge.

Fixes: #399
TODO/open questions:

- Needs "Add thread cache" (#1545) to be merged first
- Check how much overhead this adds to regular `trio.run` mode
- Double-check that it's actually reasonably efficient when combined with e.g. Qt
- Tests
- Docs
- How on earth should lifetime management work between the two loops? (Maybe this is a problem for downstream libraries that use this? But it'd be nice if reasonable options were at least possible.) [conclusion: we just need to document what's possible, and make a strong recommendation that when using this mode you start up the host loop, then immediately start up the Trio loop, and treat the Trio loop exiting as the signal to shut down the host loop]
- What on earth should we do about signal handling in general? (Share `set_wakeup_fd` with the host? Use the host's `run_sync_soon_threadsafe` in our signal wakeup path?)
- Add a non-threadsafe `run_sync_soon` handler?
- Should we support multiple `run` calls simultaneously in the same host loop? There's some logic to that, but it would make accessing Trio from the host substantially more complicated... (conclusion: not in version 1)

Simple demo of Trio and asyncio sharing a thread (also included as `notes-to-self/aio-guest-test.py`):
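A minimal sketch of what such a demo could look like with this API – this is an illustration, not the contents of `notes-to-self/aio-guest-test.py`, and `trio_main` plus the printed messages are made up:

```python
import asyncio
import trio

async def trio_main():
    for i in range(3):
        print("trio tick", i)
        await trio.sleep(0.1)
    return "trio done"

async def asyncio_main():
    loop = asyncio.get_running_loop()
    done = asyncio.Event()

    def trio_done_callback(trio_outcome):
        # Receives the run's result as an outcome object.
        print("trio finished:", trio_outcome)
        done.set()

    trio.lowlevel.start_guest_run(
        trio_main,
        run_sync_soon_threadsafe=loop.call_soon_threadsafe,
        done_callback=trio_done_callback,
    )
    await done.wait()  # asyncio keeps running while Trio runs as a guest

asyncio.run(asyncio_main())
```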