background_jobs crashes makes other instances not receive federation updates #1820
Comments
The only thing I can think of is the arbiter the jobs are running in is dying for some reason. You could maybe get around this by running the jobs on their own arbiter:

    let arbiter = actix_web::rt::Arbiter::new();
    WorkerConfig::new(|| MyState {
        client: Client::default(),
    })
    .register::<SendActivityTask>()
    .start_in_arbiter(&arbiter, queue_handle.clone());
There's also a |
I've just published a new version of background-jobs-actix which should warn when workers drop regardless of cleanliness, so you can try updating to that version to catch when they drop
you should be able to pull it in automatically with |
Thx, I'll try to get new builds out for this today, including the arbiter fix. I'm not sure what's causing the crash, but unfortunately there's nothing in the logs. It crashed sometime last night, and new actions aren't ending up in the activity queue.
@asonix background-jobs and background-jobs-actix are still at 0.10.0: https://crates.io/crates/background-jobs/versions Let me know when you get those updated, or if I can go to a git version. |
nm, I downgraded to |
Okay I deployed |
yeah 0.10 switched log out for tracing so i figured until y'all move your logging to tracing as well I'll release fixes for 0.9 |
The arbiter seems to do it, I'll re-open if we encounter this again. |
This bug cropped up again after 3 weeks of running fine. I didn't run the correct log check to make sure it's good, but it's :
I'll make sure to check that next time we have another issue. |
There's a new version of background-jobs which we should upgrade to. |
Can you please take a look again? Looks like my instance fapsi.be doesn't get any updates from lemmy.ml at the moment. |
Seems like it crashed yesterday, with nothing in the logs again:
These messages are pretty common, but it's the last one:
I've saved the entire log now, and restarted lemmy. Okay I've searched the log for a lot of different terms, and unfortunately it seems that background_jobs crashes without an error message. @asonix |
Sweet, sorry about this again. |
@dessalines are you sure there's no background jobs log about the Ticker stopping? if there's not, then this doesn't look like a crash, it just looks like it stops doing anything |
No |
@dessalines does it look like there could be jobs that started but didn't finish? Probably not since the last log line involves a job finishing but |
Is it possible that this bug hit lemmy.ml again? |
I wonder if this is related:
I think maybe a bit more logging is in order to confirm where the problem lies, though |
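One way to tell "crashed" apart from "silently stopped doing anything" is a heartbeat that the worker updates after each job, checked by a separate watchdog that logs when it goes quiet. The sketch below is a minimal illustration using only the standard library; `Heartbeat`, `beat`, and `is_stalled` are hypothetical names and are not part of background-jobs or Lemmy.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical helper: records when a worker last made progress.
pub struct Heartbeat {
    last_beat: Mutex<Instant>,
}

impl Heartbeat {
    /// Starts the clock at creation time.
    pub fn new() -> Self {
        Heartbeat {
            last_beat: Mutex::new(Instant::now()),
        }
    }

    /// Called by the worker whenever a job finishes.
    pub fn beat(&self) {
        *self.last_beat.lock().unwrap() = Instant::now();
    }

    /// Returns true if no beat has been recorded for longer than `max_silence`.
    /// A watchdog could call this periodically and log a warning.
    pub fn is_stalled(&self, max_silence: Duration) -> bool {
        self.last_beat.lock().unwrap().elapsed() > max_silence
    }
}
```

A watchdog thread or task could share the `Heartbeat` through an `Arc` and write a log line whenever `is_stalled` returns true, which would at least timestamp the moment the queue went silent.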
@kromonos k I just did a lemmy restart, it temporarily fixed it. I'll re-add restarts to our nightly cron job until we can figure out why this keeps happening. |
I think this problem was caused by a worker count that was too low and an incorrect implementation, which meant that failed activity sends were not retried. At least I haven't heard of any similar problems since fixing those. |
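To illustrate what "failed sends get retried" can look like in general, here is a minimal sketch of retry with exponential backoff using only the standard library. It is not the actual Lemmy or background-jobs fix; `send_with_retry` and its parameters are hypothetical names chosen for this example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: calls `send` until it succeeds or `max_attempts`
/// is reached, doubling the delay after each failure.
/// Returns the number of attempts used on success, or the last error.
pub fn send_with_retry<F, E>(
    mut send: F,
    max_attempts: u32,
    base_delay: Duration,
) -> Result<u32, E>
where
    F: FnMut() -> Result<(), E>,
{
    let mut attempt: u32 = 1;
    loop {
        match send() {
            Ok(()) => return Ok(attempt),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Backoff schedule: base, 2 * base, 4 * base, ...
                let factor = 2u32.saturating_pow(attempt - 1);
                sleep(base_delay.saturating_mul(factor));
                attempt += 1;
            }
        }
    }
}
```

In a real job queue the retry would usually be scheduled by the queue rather than by sleeping inside the worker, since a sleeping worker is unavailable for other jobs, which matters when the worker count is low.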
This seems to happen on lemmy intermittently, sometimes after a week or so of running fine. background_jobs will stop showing up in the logs.
cc @asonix @Nutomic
This issue happened again, twice in one day. A restart seemed to fix it again.
edit: I'm tailing the log right now: