-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement cron-style, recurring tasks #104
Comments
I am excited about this project and would love to contribute. Before I get too far down the wrong path, I'd like to clarify a few aspects to ensure alignment with the project's vision. Proposed Format for Cron-like Tasks: I suggest the following YAML structure for defining recurring tasks: recurring_tasks:
- task: "DailyReport.generate"
schedule: "every day at noon"
- task: "DatabaseCleanup.run"
schedule: "every 10 minutes"
- task: "SendBroadcast.start"
schedule: "0 4 * * 0" This format aims to provide flexibility in scheduling any arbitrary class method. It supports both a human-readable syntax and traditional cron syntax for broader flexibility. Questions for Clarification: Dependency on Fugit: Is the project open to depending on external gems like Fugit for parsing the cron syntax? Or is there a preference for a simpler, in-house solution to avoid additional dependencies? Task Execution Mechanism: Are there any specific considerations or constraints regarding how these tasks should be executed within the project's existing architecture? It seems like spinning the idea would be to spin up a dispatcher for each scheduled task but maybe you have already thought through this and another suggestion. Your guidance on these points will help ensure that my contributions align well with the project's roadmap and existing design principles. Looking forward to your feedback and excited to contribute to this feature. |
Just 2¢ as a user of GoodJob looking to migrate to SolidQueue when the time comes, I really like GoodJob's cron-style support: https://github.com/bensheldon/good_job#cron-style-repeatingrecurring-jobs I most like:
I mainly like that it feels like a lower mental overhead in that only jobs get queued, whether now or on a schedule. But there are arguably some objective measurements around that, like I can customize my |
@ianterrell Really appreciate you sharing your views on GoodJob's cron-style support. It's awesome to get insights from someone familiar with GoodJob looking at the transition to SolidQueue. I've used GoodJob in the past but never for cron tasks. My Thoughts: Testability: I see the appeal in GoodJob's approach of only scheduling jobs. My idea of using class methods like class DailyReport
def self.generate
# ...
DailyReportJob.perform_later(Time.now, answer: 42)
end
end This keeps the YAML config straightforward while packing the logic neatly in the method. Simplicity: The YAML stays simple and intuitive: recurring_tasks:
- task: "DailyReport.generate"
schedule: "every day at noon" We maintain a clean setup, but with the power of a being able to put any complex task calling logic in the class method backing it. This also allows calling jobs with no arguments and doesn't require the user to add a class method if none is required. recurring_tasks:
- task: "DailyReport.perform_later"
schedule: "every day at noon" Safety with Input Validation: Your point about calling "arbitrary" code is a valid one. We could validate the input to ensure it's just a class name and a method. This would prevent weird errors or the risk of evaluating arbitrary code. While I totally get the straightforwardness of the GoodJob setup and putting all the configuration for scheduled jobs in one place, I think the class method approach could give us an just as much if not more flexibility and simplicity while making it possible to ensure those aspects of the job logic remain testable. |
I've been giving this more thought and realized there's another snag with the class method approach. Imagine if someone calls a method that unexpectedly bogs down the system or even grinds it to a halt. Given this, it seems like it's important that the scheduler-dispatcher only backgrounds jobs. Calling a class method wouldn't allow it to know if it's safe to call so... Thinking of a Fix: Maybe the key is to simplify things. How about we put some constraints in place? Like, the scheduler could always fire off a job with no arguments, as in recurring_tasks:
- job: "DailyReport"
schedule: "every day at noon" In this scenario, the scheduler automatically calls But Here's the Catch: This solution definitely makes things safer and simpler. But I'm wondering, are we straying from the original goal of this feature? Is this sacrificing too much flexibility for the sake of simplicity and safety? |
Thanks for your careful thought on this. A few notes that come to mind.
I am probably misunderstanding what you mean, but I think job testability is excellent in Rails already and I'm not sure what adding a layer on top would bring to that.
The use case I have for arguments is in re-use of jobs. Consider: recurring_tasks:
- job: ReportEmailer
kwargs:
report: daily-usage
to: me@company.com
schedule: "every day at 8am"
- job: ReportEmailer
kwargs:
report: weekly-usage
to:
- me@company.com
- myboss@company.com
schedule: "every Monday at 8am" We do this both within applications for some jobs that are very similar and just need parameterized across schedulings, as well as with library code that is written and just needs slightly customized per application. The args and kwargs are all purely optional. There's no additional mental overhead if you write jobs that do not need parameterized. Personally I think this hits safe, simple, and flexible. Thanks again for considering the issue! |
@ianterrell Thanks for your feedback and for bringing up some really valid points. Let me address them: On Testability: You're right that job testability in Rails is already top-notch. My thought with adding a class method layer, like Regarding Arguments and Keyword Arguments: Your use case for reusing jobs with different parameters makes a lot of sense. Being able to specify arguments or kwargs in the config does add a lot of flexibility, allowing the same job to be tailored for different scenarios. That's a nice feature and it would be really handy to see the jobs like that in one place. Striking a Balance: I think my initial reaction was to the cron configuration example in the GoodJob docs. It struck me as maybe a little too feature rich. Seeing procs and lambdas alongside both and args and kwargs in a yaml file struck me as a bit of a foot-gun. Perhaps we can have a system that defaults to queuing jobs without arguments for straightforward tasks, while still offering the ability to specify args/kwargs for more complex scheduling needs. recurring_tasks:
- job: SimpleJob
schedule: every day at 9am
- job: ReportEmailerJob
kwargs:
report: daily-usage
to: me@company.com
schedule: every day at 8am
- job: ReportEmailerJob
kwargs:
report: weekly-usage
to:
- me@company.com
- myboss@company.com
schedule: every Monday at 8am |
I like this, yes. This is high priority for me. (I think that's what GoodJob is aiming for for the average case. The args/kwargs are totally optional and the doc shows an example omitting them with The
I don't personally have a use case for the lambda-implemented cron schedule they show in their |
Has there been any progress on this? I was thinking I might at least try and spike the problem on the weekend if not |
Not from me. I had spiked on a concept but got busy with work and that hasn't let up. I also think there is something to the suggestion that the work on unique jobs may be required first for this to work properly. |
Interesting. In most setups I've seen where there has been CRON scheduling jobs, there has been just one instance doing that scheduling so there has been no need for uniqueness. It might be a nice additional assurance, but shouldn't be critical for introducing this feature? |
I've made an initial implementation (not code tested or ready for review) of this which is very similar to how GoodJob handles it https://github.com/calebowens/solid_queue/tree/add-cron-tasks. There are some potential flaws with how this works in terms of cases when a thread may unexpectedly die, so I'm going to move to a polling-based system which will be more reliable and fit nicer into the |
Hey all! Thanks a lot for all the input here. I'm on-call this week and I've been working on other stuff the previous couple of weeks, so haven't had time to make any progress on this one, but I plan to come back to this next week. I'll review your approach then, @Caleb-T-Owens 🙏 About uniqueness, we want to allow people to run more than one instance of the dispatcher/scheduler process that handles this, so there must be a way to guarantee that either there's a single process in charge of it, or that the jobs are themselves unique. |
Hey @Caleb-T-Owens, thanks so much for putting that together, I've reviewed it, and it's pretty much the general idea I had about this. It's quite similar to my own PoC that I made back in December, I put it on pause to focus on publishing the rest and it's now quite outdated. I think it could make sense to have the dispatcher take care of dispatching the recurring tasks 🤔 Since it's possible to configure multiple dispatchers with different settings, perhaps one of them could have the recurring tasks if someone doesn't want the same dispatcher to do everything. I think I would also keep the task definition a bit simpler, just Another thing that I still haven't decided how to do but I'd like to have is a way to track which jobs correspond to a given cron entry/recurring task, for debugging purposes mostly, and to show it in Mission Control 🤔 |
Finally rescued my last attempt and completed it on #155. Going to test this in production for a bit before merging. |
A dispatcher would handle these, and the configuration will be included in
solid_queue.yml
.The text was updated successfully, but these errors were encountered: