Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support for scheduled Workflows with CronWorkflow CRD #1758

Merged
merged 53 commits into from
Jan 7, 2020

Conversation

simster7
Copy link
Member

@simster7 simster7 commented Nov 12, 2019

Closes: #1229

Design

Implements a CronWorkflow CRD that defines a scheduled Workflow. The design is such that any Workflow can be simply converted to a CronWorkflow by replacing kind: Workflow with kind: CronWorkflow and adding some CronWorkflowOptions (in short: CronWorkflow = Workflow + CronWorkflowOptions).

Example:

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: test-cron-wf
spec:
  schedule: "* * * * *"
  concurrencyPolicy: "Replace"
  startingDeadlineSeconds: 0
  workflowSpec:
    entrypoint: whalesay
    templates:
    - name: whalesay
      container:
        image: alpine:3.6
        command: [sh, -c]
        args: ["date; sleep 90"]

Implementation

Implemented by creating a new controller that will be embedded within workflow-controller (i.e. the user will not need to install any separate controllers). The cron controller will monitor all CronWorkflow resources and will execute and terminate Workflows based on their respective CronWorkflowOptions.

Features/Vision:

  • Virtually no conversion to set up a new CronWorkflow
  • Works will all K8s CronJob options
  • Works with WorkflowTemplates

This is still a WIP. TODO:

  • Finish implementing K8s CronJob options:
    • schedule
    • startingDeadlineSeconds
    • concurrencyPolicy
    • suspend
    • successfulJobsHistoryLimit
    • failedJobsHistoryLimit
  • Support exhaustive logging and event recording under the CronWorkflow CRD (i.e. make kubectl describe cronwf [...] useful)
  • Write argo CLI integration
  • Add CronWorkflow Validation
  • Testing

@simster7 simster7 requested a review from alexec January 3, 2020 19:36
@simster7
Copy link
Member Author

simster7 commented Jan 3, 2020

@alexec Added tests and enhanced the e2e suite, please review

@alexec
Copy link
Contributor

alexec commented Jan 4, 2020

Test failure:

workflow/cron/operator.go:84:10: Errorf format %w has unknown verb w

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional approval - can you fix the broken test?

test/e2e/cron/cron_test.go Outdated Show resolved Hide resolved
CronWorkflow("@testdata/basic-forbid.yaml").
When().
CreateCronWorkflow().
Wait(2 * time.Minute).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway to reduce this time to <30s?

Copy link
Member Author

@simster7 simster7 Jan 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I don't think so. I need at least two cron cycles for this test, the least significant of which is a minute. I could mock the cron framework to run -- say -- every 15 seconds, but since we are dealing with k8s objects I worry that the short time might cause these tests to be flakey. To mediate the effects of this I decided to run these particular tests in parallel, so the entire cron test suite should take at most 3 minutes.

Let me know what you think.

test/e2e/cron/cron_test.go Show resolved Hide resolved
test/e2e/cron/cron_test.go Show resolved Hide resolved
test/e2e/cron/cron_test.go Show resolved Hide resolved
test/e2e/cron/cron_test.go Show resolved Hide resolved
test/e2e/cron/cron_test.go Outdated Show resolved Hide resolved
test/e2e/fixtures/when.go Show resolved Hide resolved
@@ -62,6 +85,14 @@ func (w *When) WaitForWorkflow(timeout time.Duration) *When {
}
}

func (w *When) Wait(timeout time.Duration) *When {
logCtx := log.WithFields(log.Fields{"test": w.t.Name(), "cron workflow": w.cronWorkflowName})
logCtx.Infof("Waiting for %s", humanize.Duration(timeout))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be possible for this to wait until a condition is met and the exit (unless it timesout) - that way test might be faster

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intended this to be a general wait function, I think it could potentially be useful as is in the future.

W.r.t. the cron tests, see #1758 (comment)

workflow/common/util.go Outdated Show resolved Hide resolved
@simster7
Copy link
Member Author

simster7 commented Jan 6, 2020

Test failure:
workflow/cron/operator.go:84:10: Errorf format %w has unknown verb w

Will fix.

Seems like E2E tests are not excluded when running make test, will fix this

@codecov

This comment was marked as resolved.

@alexec
Copy link
Contributor

alexec commented Jan 6, 2020

I don't actually see cron test running on CI. I think you need a block similar to this:

https://github.com/argoproj/argo/blob/83ae2df4130468a95b720ce33c9b9e27e7005b17/test/e2e/smoke_test.go#L42

@simster7 simster7 merged commit 7e9b2b5 into argoproj:master Jan 7, 2020
@rdigiorgio
Copy link
Contributor

So, I guess timezone configuration is, like for CronJob spec, not going to be supported ? :(

@evaneaston
Copy link

evaneaston commented Jan 10, 2020

@evaneaston There is currently no plan to provide time zone support. Do you have a use case where you think it would be useful?

I though I had replied, but clearly I never submitted it. And since there was another comment about timezone support added today by @rdigiorgio, I'll include what I wanted to say earlier:

Use Case: Being able to schedule something periodically during, say business hours M-F using simple cron ranges when your business hours stradle UTC midnight. The patterns would be much simpler if you could evaluate them in the context of a specific timezone

That said, it's pretty straightforward schedule a workflow to run every period, but start with a step workflow whose sole job can checks if the time is in a local timezone window and whose result is a conditional for continuing the workflow.

It's not as efficient as being able to evaluate cron expressions in the context of a timezone, and results in a lot of noop workflows, but it works, as a workaround.

@rdigiorgio
Copy link
Contributor

rdigiorgio commented Jan 10, 2020

Indeed, we can find plenty of workarounds, but it seems to be cumbersome and moreover, playing with infrastructure that can scale from 0 to X, it is sad to say "Ok let's run the workflow, wait for a node to span (GKE autoscale feature), the first step says that we should not go forward and abort the workflow, wait for the cluster to scale down" = pay extra bucks because CRON does not support timezones.

By the way, I am nowhere near to know how "hard" it would be to implement on CronWorkflows, but I really thought this new feature would come bundled with timezones support and I would be able to drop all those K8S CronJobs

@evaneaston
Copy link

evaneaston commented Jan 10, 2020

@rdigiorgio Agreed.

I was hoping that since argo-events supported timezone in the calendar event source, argo CronWorkflows might too.

It looks like robfig/cron v3 now supports timezones. So perhaps implementing support for timezones isn't too hard to add...not that I have plans to take up go and implement the change.

@simster7
Copy link
Member Author

No reason timezone support can't be added in the future :) Opened to #1931 track this

@LucaSoato
Copy link

Sorry if the question seems stupid, but how does the scheduler work? Is there a single point of failure, or is it stable enough to keep running workflows even if a node goes down?

@simster7
Copy link
Member Author

The scheduler will be running as long as the workflow-controller is running. I.e., if you can run workflows, then the scheduler should be running. If the workflow-controller goes down at some point, once it restarts the scheduler will run any missed CronWorkflows as specified per activeDeadlineSeconds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Execute Workflow as CRON
10 participants