
Delayed jobs not triggering every other launch #168

Closed
pavelserbajlo opened this issue Apr 2, 2020 · 16 comments

@pavelserbajlo

Let's say I'm scheduling a delayed job (10 seconds), which normally gets triggered thanks to the QueueScheduler listening. This works fine.

Now imagine I schedule the delayed job again, then quit the process and start it again so that it is running before the job is due. But it never triggers, even though I can see the job in queue.getDelayed().

I restart the app again and voilà! Now it triggers (presumably because it finds out the job should have been triggered already).

Am I missing some config or anything else important? Thanks for any help.
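
For reference, here is a minimal sketch of the setup I'm describing (queue name, connection details and payload are just placeholders):

import { Queue, QueueScheduler, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

async function main() {
  // The QueueScheduler moves delayed jobs back to the wait list once
  // their delay expires, so it must be running for this to work.
  new QueueScheduler('test-queue', { connection });

  new Worker(
    'test-queue',
    async (job) => {
      console.log('processing', job.id, job.data);
    },
    { connection },
  );

  const queue = new Queue('test-queue', { connection });

  // Add a job delayed by 10 seconds, then kill and restart the process
  // before it is due: the job stays visible in queue.getDelayed() but
  // is never processed until yet another restart.
  await queue.add('delayed-job', { foo: 'bar' }, { delay: 10000 });
}

main().catch(console.error);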

@manast
Contributor

manast commented Apr 4, 2020

That sounds strange. Could you provide a small test case that demonstrates this problem so that I can take a look into it?

@matus123

Hello,
I am having the same issue as the OP. If I stop and immediately restart the application, it behaves the same way as mentioned above, but if I stop the application, wait some time (30-60 seconds) and then start it again, everything works as expected.

I spent some time trying to figure out where the problem might be. When it stops working, this call in the readDelayedData function of queue-scheduler.ts

data = await client.xread('BLOCK', blockTime, 'STREAMS', key, streamLastId);

always returns null.

I am using these dependencies.

node: v13, v12
bullmq: 1.8.4
ioredis: 4.9.0

redis-docker: redis:5.0.7-alpine3.11

gist:
https://gist.github.com/matus123/824c8702efd2c75b02ca10c858bbc68f

@matus123

If I create additional QueueSchedulers for each queue, it works even on immediate restarts.

@manast Is it okay to create multiple QueueSchedulers for a queue? Or should I expect some weird behavior?

gist:
https://gist.github.com/matus123/7c9f7650d2dc707132e507de2f96dc32
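
Roughly what I mean, simplified from the gist (queue names are placeholders):

import { QueueScheduler } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

// Workaround: run more than one QueueScheduler per queue. Each
// instance independently moves delayed jobs back to the wait list,
// so if one instance's blocking XREAD misses the event, another
// instance still picks the job up.
const queueNames = ['queue-a', 'queue-b'];
const schedulers = queueNames.flatMap((name) => [
  new QueueScheduler(name, { connection }),
  new QueueScheduler(name, { connection }),
]);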

@manast
Contributor

manast commented Apr 23, 2020

It is possible to have more than one for redundancy, but it should work with just one too.

@gWOLF3

gWOLF3 commented May 5, 2020

@pavelserbajlo Were you able to figure out a good workaround for this?

I am considering just implementing the delay in memory (rough sketch below), but that is not preferable because it defeats the whole purpose of using a Redis-backed queue that can be persisted to storage.

This seems like a critical feature that is not working properly.
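
Something like this is the in-memory fallback I have in mind (obviously any pending timeout, and its job, is lost if the process dies, which is exactly the drawback):

import { Queue } from 'bullmq';

const queue = new Queue('test-queue', {
  connection: { host: 'localhost', port: 6379 },
});

// Hold the delay in the Node process and only add the job to Redis
// once the delay expires, instead of relying on BullMQ's delayed set.
function addDelayed(name: string, data: unknown, delayMs: number) {
  setTimeout(() => {
    queue.add(name, data).catch(console.error);
  }, delayMs);
}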

@pavelserbajlo
Author

I was evaluating whether bullmq is stable enough for my needs and reported my findings here. Since it's not there yet, I'm patiently using bull instead :)

@lricoy
Contributor

lricoy commented May 26, 2020

I can confirm this happens to me as well, and having more than one QueueScheduler appears to solve it. In my case it was not only delayed jobs: if I restarted the app while a long-running job was executing, the job was also not being marked as "stalled" but would stay in the "active" state.

@PaulGrimshaw

PaulGrimshaw commented May 26, 2020

I confirm I have the above behaviour (repeated jobs not starting). I have also seen problems with repeated jobs dropping out (no error, they just stop running; see the issue above).

@manast
Contributor

manast commented May 26, 2020

I am looking into this...

@manast manast closed this as completed in 0c5db83 May 28, 2020
manast pushed a commit that referenced this issue May 28, 2020
…-05-28)

### Bug Fixes

* **scheduler:** divide timestamp by 4096 in update set fixes [#168](#168) ([0c5db83](0c5db83))
@manast
Contributor

manast commented May 28, 2020

🎉 This issue has been resolved in version 1.8.10 🎉

The release is available on:

Your semantic-release bot 📦🚀

@ifokeev

ifokeev commented Jul 22, 2020

So, as I see from the issues and my own experience, there is a BIG problem with stalled jobs. It's hard to debug and understand what's going on and why it doesn't work.

Now I'm having an issue with repeatable jobs that go stalled right when they launch. I tried different options for QueueScheduler, Worker, etc. – no luck. Right now that's the main reason to move off bullmq.

@manast
Contributor

manast commented Jul 22, 2020

@ifokeev do you have some code snippet that reproduces your issue?

@ifokeev

ifokeev commented Jul 23, 2020

@manast No, because I don't really understand the issue. I have concurrency: 1 and about 6 queues with a repeatable job each (every 1 second, for example). It randomly freezes some of the queues, or all of them. I tried cleaning Redis and different queue options, but no luck. I only see "stalled job" errors, and maxStalledCount doesn't help. It looks like the scheduler or the event loop has some bugs inside.
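
The setup is roughly like this (simplified; queue names and the processor body are placeholders):

import { Queue, QueueScheduler, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

async function setup() {
  // About 6 queues, each with a repeatable job every second and a
  // worker with concurrency 1.
  for (const name of ['q1', 'q2', 'q3', 'q4', 'q5', 'q6']) {
    new QueueScheduler(name, { connection });

    const queue = new Queue(name, { connection });
    await queue.add('tick', {}, { repeat: { every: 1000 } });

    new Worker(
      name,
      async (job) => {
        console.log(name, job.id); // placeholder work
      },
      { connection, concurrency: 1 },
    );
  }
}

setup().catch(console.error);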

@manast
Contributor

manast commented Jul 23, 2020

What are your jobs doing? Stalled jobs happen when your processor is doing a long CPU-intensive task.
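
For example, a processor like this (illustrative numbers) blocks the Node event loop, so the worker cannot renew the job lock and the scheduler ends up marking the job as stalled:

import { Worker } from 'bullmq';

new Worker(
  'test-queue',
  async (job) => {
    // A long synchronous loop blocks the event loop, so the worker's
    // lock-renewal timer cannot fire while it runs. If it takes longer
    // than the lock duration (30 seconds by default), the job is
    // picked up as stalled.
    let sum = 0;
    for (let i = 0; i < 5_000_000_000; i++) {
      sum += i;
    }
    return sum;
  },
  { connection: { host: 'localhost', port: 6379 } },
);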

@ifokeev

ifokeev commented Jul 23, 2020

@manast They are marked as stalled before running the real task, so there are no CPU-intensive tasks involved.

@manast
Contributor

manast commented Jul 24, 2020

A job cannot be stalled until it is active; that is impossible. If you can provide some code we can look into it; I am afraid there must be some issue with your code.
