Repeatable job ignores selected queue and runs on default #36

Open
striveforbest opened this issue Aug 7, 2019 · 5 comments

@striveforbest

I have a Repeatable job created via admin to run every 15 minutes on a scheduled queue. See screenshot1:
screenshot1

However, looking at the Queues in the admin, it's apparent that job is running on the default queue instead. See screenshot2:
screenshot2

Separately, the admin doesn't report successfully finished jobs (besides one repeated job). I can confirm many jobs are succeeding but not showing up in the admin. See screenshot3:
screenshot3

@marianstefi20

marianstefi20 commented Oct 8, 2019

This will be a mouthful, but if you want to see the solution, skip this list and go to Solution.

  • In admin.py, RepeatableJobAdmin has an RQ Settings fieldset containing the queue field. Its choices are set in QueueMixin from the QUEUES constant. From here, the RepeatableJob model should have its queue set when we click save. The save method is also overridden in the BaseJob class (IMHO this might not be the best idea; I think Django has save_model, but I might be wrong here). It basically does a normal save(), but first it unschedules the task and then schedules it again by calling self.schedule().
  • Looking at RepeatableJob (models.py, line 147), we can see that schedule() builds a set of kwargs and then calls self.scheduler().schedule(**kwargs). This scheduler is defined in the BaseJob class, and it is actually django_rq.get_scheduler(self.queue).
  • If the RQ setting does not define a custom SCHEDULER_CLASS, get_scheduler defaults to DjangoScheduler, a class defined just above get_scheduler in django_rq/queues.py. DjangoScheduler overrides the _create_job method from Scheduler to make some checks, but it still calls the base _create_job afterwards.
  • We're now in rq_scheduler/scheduler.py, at the _create_job method. At the end, it sets job.origin = queue_name or self.queue_name. This picks up the custom queue you've chosen (because of django_rq.get_scheduler(self.queue) in django_rq_scheduler's scheduler()).
  • Because rq_scheduler calls _create_job inside schedule with commit=False, the job isn't saved YET. Some more checks are made, and then we reach job.save() in scheduler.py. This uses redis hmset to save a mapping under that job's id as the redis key. The mapping contains our origin (because the to_dict method in rq/job.py includes it). On the next line, the job is added to the scheduled_jobs_key set. Cool... so far, nothing seems wrong.
  • Next, I wanted to see how the jobs are consumed: python manage.py rqscheduler. This is the broker that takes the jobs from redis and puts them in the corresponding queue. It is, of course, a different process, so when it is initialized its queue is set to default. To see why, look at __init__() in rq_scheduler/scheduler.py: no queue_name is specified here, so the default is used. We also see that enqueue_job, the method that consumes the jobs, at one point does queue = self.get_queue_for_job(job), which returns immediately because self._queue is set to the default queue. So all the jobs go to default.
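
The lookup described in the list above can be modelled in a few lines. This is only a toy sketch, not rq-scheduler's code: ToyQueue, ToyJob, ToyScheduler and the prefer_origin flag are hypothetical names invented for illustration, and the only detail borrowed from the real library is the idea of a queue key built from a prefix plus job.origin.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed to mirror rq's queue key namespace; used here only for illustration.
PREFIX = "rq:queue:"


@dataclass
class ToyQueue:
    name: str

    @classmethod
    def from_queue_key(cls, key: str) -> "ToyQueue":
        # Strip the namespace prefix to recover the queue name.
        return cls(name=key[len(PREFIX):])


@dataclass
class ToyJob:
    origin: str  # queue name recorded when the job was scheduled


class ToyScheduler:
    """Simplified stand-in for the scheduler process (hypothetical class)."""

    def __init__(self, queue: Optional[ToyQueue] = None, prefer_origin: bool = False):
        self._queue = queue
        self.prefer_origin = prefer_origin

    def get_queue_for_job(self, job: ToyJob) -> ToyQueue:
        # Behaviour described in the thread: a scheduler-level queue
        # short-circuits the lookup, so job.origin is never consulted.
        if not self.prefer_origin and self._queue is not None:
            return self._queue
        return ToyQueue.from_queue_key(PREFIX + job.origin)


job = ToyJob(origin="scheduled")
old = ToyScheduler(queue=ToyQueue("default"))
new = ToyScheduler(queue=ToyQueue("default"), prefer_origin=True)

print(old.get_queue_for_job(job).name)  # default
print(new.get_queue_for_job(job).name)  # scheduled
```

With the short-circuit in place, the job saved with origin "scheduled" still ends up on "default"; once the lookup goes through job.origin, it lands on the queue that was selected in the admin.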

Solution: run python manage.py rqscheduler --queue=<name> (of course this is not in the documentation of django-rq).
Funky solution: (I'm almost joking here) fork rq_scheduler and change scheduler.py's get_queue_for_job(self, job) function...just a little:
From this:

def get_queue_for_job(self, job):
    """
    Returns a queue to put job into.
    """
    if self._queue is not None:
        return self._queue
    key = '{0}{1}'.format(self.queue_class.redis_queue_namespace_prefix,
                          job.origin)
    return self.queue_class.from_queue_key(
        key, connection=self.connection, job_class=self.job_class)

To this:

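def get_queue_for_job(self, job):
    """
    Returns a queue to put job into.
    """
    # Prefer the queue recorded on the job. job.origin is only a queue
    # name (a string), so it still has to be turned into a queue object
    # below; fall back to the scheduler's own queue when it is missing.
    if job.origin is None and self._queue is not None:
        return self._queue
    key = '{0}{1}'.format(self.queue_class.redis_queue_namespace_prefix,
                          job.origin)
    return self.queue_class.from_queue_key(
        key, connection=self.connection, job_class=self.job_class)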

@striveforbest
Author

@marianstefi20 thanks for digging in.

I am running the scheduler via systemd. I will try updating my systemd service to:

[Unit]
Description=Django-RQ Scheduler Service
After=network.target

[Service]
Environment=DJANGO_SETTINGS_MODULE=proj.settings
Environment=DJANGO_CONFIGURATION=Production
WorkingDirectory=/srv/www/proj/
ExecStart=/srv/www/fuweb/.venv/bin/python /srv/www/proj/manage.py rqscheduler --queue=scheduled

[Install]
WantedBy=multi-user.target

However, I think this should be considered a bug.

@marianstefi20

marianstefi20 commented Oct 9, 2019

On closer inspection, the master branch of rq-scheduler has already solved this issue by removing:

if self._queue is not None:
    return self._queue

from get_queue_for_job.

Try updating rq-scheduler to the latest version. If you look at https://github.com/rq/rq-scheduler/commits/master you will see that they added something yesterday (Add queue_name to enqueue_in and enqueue_at), but most importantly, they made a new release, 0.9.1. This might do it.
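
To check which rq-scheduler version an environment actually has before re-testing, something like the following might help. It is a minimal sketch: the 0.9.1 threshold is simply the release mentioned above (whether that exact release contains the fix is an assumption), and parse, is_at_least and installed_version are hypothetical helper names, not part of any library.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse(v: str) -> Tuple[int, ...]:
    """Turn '0.9.1' into (0, 9, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in v.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse(installed) >= parse(minimum)


def installed_version(package: str = "rq-scheduler") -> Optional[str]:
    """Return the installed version of the package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("rq-scheduler is not installed in this environment")
elif is_at_least(found, "0.9.1"):  # threshold taken from the comment above
    print("rq-scheduler", found, "is at or above 0.9.1")
else:
    print("rq-scheduler", found, "is older than 0.9.1; consider upgrading")
```

Run it with the same interpreter the scheduler service uses (for example the virtualenv's python), since a system-wide install may report a different version.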

@marianstefi20

I've updated rq-scheduler and the problem with the queues has been fixed. I can confirm that the admin doesn't report the finished jobs correctly (I'm manually deleting the jobs, but they still show up).

@striveforbest
Author

@marianstefi20 thanks for the update. I will upgrade and re-test. Bummer it still doesn't report the finished jobs correctly.

@g3rd g3rd added the bug label Jan 8, 2020