Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

landing_job: fix ordering of jobs in queue (bug 1878666) #384

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 3 additions & 5 deletions landoapi/models/landing_job.py
Original file line number Diff line number Diff line change
Expand Up @@ -248,11 +248,7 @@ def job_queue_query(
grace_cutoff = now - datetime.timedelta(seconds=grace_seconds)
q = q.filter(cls.created_at < grace_cutoff)

# Any `LandingJobStatus.IN_PROGRESS` job is first and there should
# be a maximum of one (per repository). For
# `LandingJobStatus.SUBMITTED` jobs, higher priority items come first
# and then we order by creation time (older first).
q = q.order_by(cls.status.desc(), cls.priority.desc(), cls.created_at)
q = q.order_by(cls.priority.desc(), cls.status.desc(), cls.created_at)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this change? We already get the highest priority items with the highest priority statuses, right?


return q

Expand Down Expand Up @@ -376,6 +372,8 @@ def transition_status(

if action in (LandingJobAction.FAIL, LandingJobAction.DEFER):
self.error = kwargs["message"]
if action == LandingJobAction.DEFER:
self.priority -= 1
Comment on lines +375 to +376
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written now, any job which has its priority lowered will always be at the back of the queue, meaning the patch effectively needs to wait until the queue is empty before it will be considered for landing. This could cause a situation where someone queues their revision for landing, it doesn't land the first time due to some transient issue, then their landing is stuck in limbo because it has a lower priority than newer revisions that are added to the queue. If the queue is empty often enough it's probably not a big deal, but worth noting here.

As an alternative we could note the last attempt time in the landing job, and filter out revisions that have been deferred for a certain duration before attempting to land again. For example we defer a landing job, and then in the job query we filter out all deferred revisions with a last attempted time within the last 15-20 minutes.


if action == LandingJobAction.LAND:
self.landed_commit_id = kwargs["commit_id"]
Expand Down
1 change: 1 addition & 0 deletions tests/test_landings.py
Original file line number Diff line number Diff line change
Expand Up @@ -399,6 +399,7 @@ def test_lose_push_race(

assert not worker.run_job(job, repo, hgrepo, treestatus)
assert job.status == LandingJobStatus.DEFERRED
assert job.priority == -1


def test_failed_landing_job_notification(
Expand Down