
Run background task in master process? #3401

Closed · pe224 opened this issue Nov 20, 2018 · 6 comments

pe224 commented Nov 20, 2018

I deploy aiohttp behind gunicorn using multiple workers with the --preload option.
Currently, I can run a background task in the master process by giving it its own thread:

import threading
import time

from aiohttp import web

async def handle(request):
    return web.Response(text='OK')

def my_job():
    # stand-in for the real background work
    while True:
        print('Still here')
        time.sleep(3)

app = web.Application()
app.router.add_get('/', handle)

# module level: with --preload this runs exactly once, in the gunicorn
# master, before any workers are forked
threading.Thread(target=my_job, daemon=True).start()
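(For context, this module would be started with something along the lines of gunicorn mymodule:app --workers 4 --preload --worker-class aiohttp.GunicornWebWorker, where the module name mymodule is assumed; because of --preload the module, and therefore the thread, is imported once in the master before the workers are forked.)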

It feels a bit dirty to use threads when I should be able to just hook into the event loop of the master process.
However, using app.on_startup does not work, as the task is then duplicated in every worker.

Is there a way to create a task only in the event loop of the master process, or to otherwise avoid the duplication across workers?
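For reference, the documented aiohttp background-task pattern looks roughly like this (helper names are illustrative). Under gunicorn, every worker executes on_startup, so every worker ends up with its own copy of my_job, which is exactly the duplication described above:

import asyncio

from aiohttp import web

async def my_job(app):
    while True:
        print('Still here')
        await asyncio.sleep(3)

async def start_job(app):
    # runs once per process, i.e. once in every gunicorn worker
    app['job'] = asyncio.ensure_future(my_job(app))

async def stop_job(app):
    app['job'].cancel()
    try:
        await app['job']
    except asyncio.CancelledError:
        pass

app = web.Application()
app.on_startup.append(start_job)
app.on_cleanup.append(stop_job)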

aio-libs-bot commented:

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #1964 (The documentation on "Background tasks" is wrong), #1104 (Rewrite doc section about background tasks), #1921 (Run tasks (fetch urls) inside separate thread), #2745 (Running out of memory), and #1092 (Allow to register application background tasks within an event loop).

asvetlov (Member) commented Nov 20, 2018

With --preload you are relying on a hack: the thread is started in the main gunicorn process, and the worker processes that actually serve the aiohttp application are forked afterwards. on_startup, in turn, is called from the worker code.

The main question is what memory you expect my_job to have: memory shared with the master process, memory belonging to one of the forked workers, or something independent (a new process)?

I suspect you want to do something more complex than printing a line infinitely.
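To make the distinction concrete, here is a toy, POSIX-only illustration (not part of the original discussion): after gunicorn forks, the master's writes land in copy-on-write pages that the workers never observe.

import os
import time

state = {'value': 'before fork'}

pid = os.fork()
if pid == 0:
    # child ("worker"): sleeps past the parent's write, yet still
    # sees the pre-fork value because its memory is a private copy
    time.sleep(1)
    print('worker sees:', state['value'])  # prints 'before fork'
else:
    # parent ("master"): mutates its own copy only
    state['value'] = 'after fork'
    os.waitpid(pid, 0)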

pe224 (Author) commented Nov 20, 2018

Right, sorry for the missing info.
Ideally I would like to write to memory shared with all workers, with the workers having read-only access.
In a pinch, it could be made almost as simple as printing a line by using the file system as temporary storage.
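A minimal sketch of that file-based fallback, keeping the thread-in-master setup from the issue (the path and payload shape are made up for illustration):

import json
import os
import tempfile
import threading
import time

from aiohttp import web

STATE_PATH = '/tmp/app_state.json'  # assumed location

def my_job():
    while True:
        state = {'updated_at': time.time()}  # whatever the job computes
        # write to a temp file, then rename atomically, so a worker
        # never reads a half-written file
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(STATE_PATH))
        with os.fdopen(fd, 'w') as f:
            json.dump(state, f)
        os.replace(tmp, STATE_PATH)
        time.sleep(3)

async def handle(request):
    with open(STATE_PATH) as f:  # workers only ever read
        return web.Response(text=f.read(), content_type='application/json')

app = web.Application()
app.router.add_get('/', handle)
threading.Thread(target=my_job, daemon=True).start()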

asvetlov (Member) commented:

If you don't care about scaling and performance, you don't need gunicorn at all.
If you do, you need to think about deploying on multiple nodes anyway.
Running dedicated worker processes connected through a message broker (Redis pub/sub, RabbitMQ, Kafka, etc.) sounds like the proper solution.
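A rough sketch of that layout using redis-py's asyncio client (the library choice, channel name, and helper names are assumptions, not a recommendation of specifics):

# job_process.py: the background job as its own process, outside gunicorn
import asyncio

import redis.asyncio as redis  # assumes redis-py >= 4.2 and a local Redis

async def my_job():
    r = redis.Redis()
    while True:
        await r.publish('updates', 'Still here')  # fan out to all web workers
        await asyncio.sleep(3)

if __name__ == '__main__':
    asyncio.run(my_job())

# in the aiohttp app: every worker subscribes on startup and keeps its
# own read-only copy of the latest payload
async def listen(app):
    r = redis.Redis()
    pubsub = r.pubsub()
    await pubsub.subscribe('updates')
    async for msg in pubsub.listen():
        if msg['type'] == 'message':
            app['latest'] = msg['data']

async def start_listener(app):
    app['listener'] = asyncio.ensure_future(listen(app))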

pe224 (Author) commented Nov 23, 2018

Alright, thanks for the big-picture view.
I'll probably have to rethink the architecture, since what I had in mind might work but would be hacky.

lock bot commented Nov 23, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel there are important points made in this discussion,
please include those excerpts in the new issue.
