create_task() recommendation seems like bad practice #104091

Dreamsorcerer · 2023-05-02T15:29:05Z

Documentation

My understanding of asyncio is that all tasks etc. should be awaited on. If this is not done, then exceptions in tasks will never be seen and asyncio will give an "exception was not retrieved" warning.

The documentation currently recommends this code for a fire-and-forget task creation:

background_tasks = set()
for i in range(10):
    task = asyncio.create_task(some_coro(param=i))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)

https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task

This seems like bad practice as the tasks are never awaited (another problem might be if this is copied inside a function, won't the set get garbage collected after the function ends?). To my knowledge, there is no trivial way to do fire-and-forget tasks using stdlib and I have always recommended users to try the aiojobs library to handle this. e.g. In the aiohttp documentation:
https://docs.aiohttp.org/en/latest/web_advanced.html#web-handler-cancellation
(Scroll down to the second warning, and the paragraph preceding it).

I feel that this example should probably be scrapped and possibly a recommendation to use aiojobs for these tasks may be better.

I'd also suggest including the text of both error messages ("exception not retrieved" and "Task was destroyed"), so that people putting that error into a search engine are likely to end up on this page.

The text was updated successfully, but these errors were encountered:

carljm · 2023-05-02T16:35:01Z

I think there's a misunderstanding here about what create_task does. Which is understandable: the name create_task is arguably misleading by omission. It isn't just a pure function that creates and returns a Task instance; it creates a task and schedules it to the event loop. (See the documentation.)

This means the coroutine doesn't need to be explicitly awaited, because the event loop will take care of ensuring that it gets waited on. ~~The event loop also keeps a reference to the Task object, which keeps a reference to the coroutine, so nothing will be garbage collected until it's done running.~~ (Edit: this is wrong.)

So this documented recommendation for fire-and-forget task creation does work fine, and is not bad practice (except inasmuch as TaskGroups are probably a better option in most cases.)

Dreamsorcerer · 2023-05-02T16:54:56Z

This means the coroutine doesn't need to be explicitly awaited, because the event loop will take care of ensuring that it gets waited on.

No, it ensures it executes, not that it get awaited on. You can't get an "exception not retrieved" if the code hasn't run to create an exception...

Simple reproducer:

>>> import asyncio
>>> async def foo():
...  raise RuntimeError()
... 
>>> async def main():
...  asyncio.create_task(foo())
...  await asyncio.sleep(1)
... 
>>> asyncio.run(main())
Task exception was never retrieved
 ...

Well written code should never see "Task exception was never retrieved".

So this documented recommendation for fire-and-forget task creation does work fine, and is not bad practice (except inasmuch as TaskGroups are probably a better option in most cases.)

TaskGroups can only be used for this if you create some kind of global TaskGroup for your application that you keep around till the application ends. This can work, but in something like a server application, I think you'd only get the exceptions when the server shuts down, which could be weeks after the exception happened. So, for server applications, it'd still be best to use aiojobs or similar.

carljm · 2023-05-02T17:05:30Z

Sorry, I did misunderstand your report.

The name "fire-and-forget" specifically suggests that you don't care if it succeeds or fails. If you are going to check the result/exception, then it isn't really fire and forget.

I'm not sure if we should be recommending a specific third-party library here, but it may make sense to add a note in the documentation that fire-and-forget means you won't know if it succeeded or failed, and you may get unretrieved exception warnings in case of failure.

I'll reopen this so asyncio experts can weigh in on what, if anything, should be improved here.

Dreamsorcerer · 2023-05-02T17:09:53Z

Yeah, I'm not too sure about linking to a third-party library either, I just don't have a better suggestion (although TaskGroup can certainly be recommended in some circumstances).

Good engineering practice is not to ignore all exceptions though, so I'm sceptical that was the intended design, and not just something that was overlooked. Typically, a user wants to run tasks in the background without waiting for it to complete or needing to retrieve the result. That doesn't mean they don't want to know when they've broken their code and it's not running anything...

gvanrossum · 2023-05-02T17:27:33Z

The coroutine running in the background task could contain a try/except clause that catches all exceptions and logs them. You could easily make a decorator that does this for your application so you don't have to repeat the try/except/log boilerplate in every background task you write.

We're working on additional flexibility for TaskGroup that will make this pattern simpler. (See e.g. https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417)

Dreamsorcerer · 2023-05-02T17:50:53Z

We're working on additional flexibility for TaskGroup that will make this pattern simpler. (See e.g. https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417)

Right, I think if TaskGroup starts supporting logging exceptions as they occur, then it should be able to replace aiojobs.Scheduler.

So, I think we should be recommending a long-lived TaskGroup for fire-and-forget tasks then, instead of the currently documented workaround.

It might also be worth having a TaskGroup.close() (and .start()?) to support this use case too. e.g. In startup/shutdown hooks:

async def on_startup(app):
    app["tg"] = asyncio.TaskGroup()
async def on_shutdown(app):
    await app["tg"].close()

In aiohttp, we have cleanup_ctx, where we'd use async with, but I'm sure there are frameworks without that kind of hook.

(Edit: I see discussion of a .shutdown() in that conversation, so maybe that's already coming).

gvanrossum · 2023-05-03T06:52:54Z

TaskGroup itself won't be logging exceptions, you will have to write your own exception handler or done-callback for that.

For TaskGroup.start() and .close() you can call __aenter__() and __aexit__() directly (these are public APIs).

I don't see an action item here.

Dreamsorcerer · 2023-05-03T11:27:33Z

TaskGroup itself won't be logging exceptions, you will have to write your own exception handler or done-callback for that.

But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.

For TaskGroup.start() and .close() you can call __aenter__() and __aexit__() directly (these are public APIs).

Fair enough, I think generally people discourage use of calling magic methods directly, so I was thinking users will be more comfortable with a public (I mean, no _ prefix) method. I think that'd also be consistent with other objects too (e.g. file-like objects have .close() in addition to __exit__()). Either way, it'll work though.

gvanrossum · 2023-05-09T00:50:40Z

~~> But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.~~

I would like to push back on this a little. Why do you want it to call call_exception_handler()? Looking over the asyncio source code, that function is generally only called when there is no alternative, e.g. when an exception is caught in a finalizer.

~~The regTaskGroup doesn't call it when a task crashes~~

[Submitted by mistake, see full comment below.]

gvanrossum · 2023-05-09T00:51:06Z

Whoops, closed by mistake. Will resume in a bit.

gvanrossum · 2023-05-09T01:02:07Z

But, it will default to asyncio.get_running_loop().call_exception_handler(), right? That's the same as aiojobs, so it should still work as a simple replacement.

I would like to push back on this a little. Why do you want it to call call_exception_handler()? Looking over the asyncio source code, that function is generally only called when there is no alternative, e.g. when an exception is caught in a finalizer.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it? I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.)

(Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)

Fair enough, I think generally people discourage use of calling magic methods directly, so I was thinking users will be more comfortable with a public (I mean, no _ prefix) method. I think that'd also be consistent with other objects too (e.g. file-like objects have .close() in addition to __exit__()). Either way, it'll work though.

Technically, a dunder is not "private" or "protected" in the sense that a single or double underscore conveys. For files, close() long predated __exit__(). I'm not too keen on violating TOOWTDI here. But maybe the Supervisor class shouldn't be an async context manager, solving that problem.

Regarding the original doc example, I don't necessarily like making all examples industrial-strength, because (to me) the purpose of an example is understanding, not a copy-pastable solution. (You end up with very verbose examples if you strive for the latter.) I'd be okay with a small update to that particular example though that reminds the reader that if the task fails, no exception will be printed anywhere. The user can then decide what to do about that.

Dreamsorcerer · 2023-05-09T16:39:01Z

I would like to push back on this a little. Why do you want it to call call_exception_handler()?

It was mentioned at https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417/12
So, I thought it was already part of the plan.

I was thinking that aiohttp installs it's own default handler, but now that I look I'm not so sure. I'll give a try later, but basically I'd expect to atleast have exceptions appear on stdout if no log handler is provided.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it?

In that case the exception is reraised and can then be handled by other code, right? Whereas a background task could only be reraised when the application ends. So, in this scenario, I'd expect the exception to be handled immediately and not raise anything on shutdown.

I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.)

(Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)

My argument is really just to do the expected thing by default. i.e. A user who creates this Supervisor without much knowledge to best practices should not then end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code.

It would also just be convenient, not having to add (and remember to add) a custom decorator or whatever everytime you quickly create a task to throw into the Supervisor.

But maybe the Supervisor class shouldn't be an async context manager, solving that problem.

I'm not too bothered either way, as I say, we have cleanup_ctx in aiohttp, so it'll either look like:

async with Supervisor() as s:
    app["supervisor"] = s
    yield

or

app["supervisor"] = Supervisor()
yield
await app["supervisor"].close()

gvanrossum · 2023-05-09T17:41:22Z

I would like to push back on this a little. Why do you want it to call call_exception_handler()?

It was mentioned at https://discuss.python.org/t/revisiting-persistenttaskgroup-with-kotlins-supervisorscope/18417/12 So, I thought it was already part of the plan.

I have to apologize, my Discourse mailing list mode was off for a while and I missed that post (and many others).

But, in https://discuss.python.org/t/asyncio-tasks-and-exception-handling-recommended-idioms/23806/11 it seems @achimnol is reversing course; I found no trace of call_exception_handler() in the code linked from there. (Yes, it's called, but only in a branch that the original author of TaskGroup wasn't even should could be reached, according to the comment.)

I was thinking that aiohttp installs it's own default handler, but now that I look I'm not so sure. I'll give a try later, but basically I'd expect to atleast have exceptions appear on stdout if no log handler is provided.

The regular TaskGroup doesn't call it when a task fails, so why should the variant for background tasks call it?

In that case the exception is reraised and can then be handled by other code, right? Whereas a background task could only be reraised when the application ends. So, in this scenario, I'd expect the exception to be handled immediately and not raise anything on shutdown.

I still think that there are already plenty of other ways for the user to arrange for task failures to be logged, e.g. by wrapping the task's coroutine in a handler that logs the exception, or by adding a done-callback to do the same. (@achimnol seems to agree, his latest Supervisor design does not do this.)
(Though TBH I'm not 100% sold on my own argument here, so maybe you have a good argument?)

My argument is really just to do the expected thing by default. i.e. A user who creates this Supervisor without much knowledge to best practices should not then end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code.

It would also just be convenient, not having to add (and remember to add) a custom decorator or whatever everytime you quickly create a task to throw into the Supervisor.

Okay, you have convinced me. I'd like @achimnol's response.

Dreamsorcerer · 2023-05-10T16:26:05Z

One last point I'd say on the logging front is that for smaller personal projects it's pretty common to not configure logging at all. In that case, it's expected to see exceptions/warnings when you run the application from a terminal or by looking at the web server logs, as they would be sent to stdout/stderr by default.

achimnol · 2023-07-09T06:25:31Z

I've been busy for closing up my company's seris A funding along with several big customer follow-ups. 😞 Recently, I began to look into this issue again, with some rewriting of the aiotools experimentation to make the key concepts in TaskGroup/Supervisor reusable to write various coroutine aggregation patterns.

achimnol/aiotools#58

I agree with that "a user should not end up spending hours or days trying to figure out why their code is not behaving correctly, simply because they never even knew an exception was happening in their code." because I also experienced this many times. 😭

willingc · 2024-06-19T20:56:30Z

Moving this issue status back to To Do as I do not believe that anyone is currently documenting this. But it may be helpful for someone interested in Docs to do so.

Dreamsorcerer added the docs Documentation in the Doc dir label May 2, 2023

carljm closed this as not planned Won't fix, can't repro, duplicate, stale May 2, 2023

carljm reopened this May 2, 2023

carljm added the topic-asyncio label May 2, 2023

github-project-automation bot added this to asyncio May 2, 2023

github-project-automation bot moved this to Todo in asyncio May 2, 2023

gvanrossum closed this as completed May 9, 2023

github-project-automation bot moved this from Todo to Done in asyncio May 9, 2023

gvanrossum reopened this May 9, 2023

github-project-automation bot moved this from Done to In Progress in asyncio May 9, 2023

willingc moved this from In Progress to Todo in asyncio Jun 19, 2024

e-carlin mentioned this issue Aug 7, 2024

fix #7153: raydata enable/disable auto processing of scans from the GUI radiasoft/sirepo#7199

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

create_task() recommendation seems like bad practice #104091

create_task() recommendation seems like bad practice #104091

Dreamsorcerer commented May 2, 2023 •

edited by AlexWaygood

Loading

carljm commented May 2, 2023 •

edited

Loading

Dreamsorcerer commented May 2, 2023 •

edited

Loading

carljm commented May 2, 2023

Dreamsorcerer commented May 2, 2023 •

edited

Loading

gvanrossum commented May 2, 2023

Dreamsorcerer commented May 2, 2023 •

edited

Loading

gvanrossum commented May 3, 2023

Dreamsorcerer commented May 3, 2023

gvanrossum commented May 9, 2023 •

edited

Loading

gvanrossum commented May 9, 2023

gvanrossum commented May 9, 2023

Dreamsorcerer commented May 9, 2023 •

edited

Loading

gvanrossum commented May 9, 2023

Dreamsorcerer commented May 10, 2023

achimnol commented Jul 9, 2023 •

edited

Loading

willingc commented Jun 19, 2024

create_task() recommendation seems like bad practice #104091

create_task() recommendation seems like bad practice #104091

Comments

Dreamsorcerer commented May 2, 2023 • edited by AlexWaygood Loading

Documentation

carljm commented May 2, 2023 • edited Loading

Dreamsorcerer commented May 2, 2023 • edited Loading

carljm commented May 2, 2023

Dreamsorcerer commented May 2, 2023 • edited Loading

gvanrossum commented May 2, 2023

Dreamsorcerer commented May 2, 2023 • edited Loading

gvanrossum commented May 3, 2023

Dreamsorcerer commented May 3, 2023

gvanrossum commented May 9, 2023 • edited Loading

gvanrossum commented May 9, 2023

gvanrossum commented May 9, 2023

Dreamsorcerer commented May 9, 2023 • edited Loading

gvanrossum commented May 9, 2023

Dreamsorcerer commented May 10, 2023

achimnol commented Jul 9, 2023 • edited Loading

willingc commented Jun 19, 2024

Dreamsorcerer commented May 2, 2023 •

edited by AlexWaygood

Loading

carljm commented May 2, 2023 •

edited

Loading

Dreamsorcerer commented May 2, 2023 •

edited

Loading

Dreamsorcerer commented May 2, 2023 •

edited

Loading

Dreamsorcerer commented May 2, 2023 •

edited

Loading

gvanrossum commented May 9, 2023 •

edited

Loading

Dreamsorcerer commented May 9, 2023 •

edited

Loading

achimnol commented Jul 9, 2023 •

edited

Loading