drop step.group in favour of depends_on to create a DAG #1860

6543 · 2023-06-13T13:06:50Z

Support parallelism inside a pipeline via a DAG. This requires a way to define step dependencies which can then allow the runner to build said DAG. Edit: I see group provides something similar for static fan-out, but certainly not as flexible as defining dependencies directly.

Originally posted by @silverwind in #567 (comment)

The text was updated successfully, but these errors were encountered:

6543 · 2023-06-13T13:08:18Z

what about depends_on we already use it for workflow dep.-DAG ?

lafriks · 2023-06-13T19:48:18Z

Yes, same naming as for pipelines would be the best way to go imho

6543 · 2023-06-13T22:58:01Z

one big downside! and what esp. makes it breaking!!!:

now we assume that steps need to be run in sequence if not told otherwise (because it is what you normally want)
if we change this ... it would be the exact other way around

so what I think is really important: by default it should run sequential (that's just good UX-Design)

so what about a workflow config: parallel: false that if enabled would create a step dag ?!?

6543 · 2023-06-13T23:02:26Z

so I propose:

pipeline/backend types (docker/k8s/...) -> always use a DAG
pipeline/frontend types (yaml) -> use a dag if specified and if not it creates a simpjle dag
-> (step1->step2->(group:step3 & step 4)-> step5...) witch could still encounter the groups feature...

that way it also would be non breaking

6543 · 2023-06-13T23:09:59Z

we should explicite write that down as one of our core values:

by default the config should work intuetive for beginners and not for advanced users - KISS &
the default should be the so that the normal pipelines dont need to configure almost anything

silverwind · 2023-06-14T17:17:31Z

In drone, it is also sequential until you add the first depends_on to a step, at which point it becomes a DAG. So in a sense, it's compatible.

6543 · 2023-06-14T20:58:34Z

That sounds nice as long as you dont have a workflow with steps that all can run paralel

silverwind · 2023-06-14T21:01:03Z

Right, how about depends_on: [] on all steps to indicate they have no dependencies and can run immediately? Could also name the property dependencies, which I favor slightly because they are called the same in the React world.

6543 · 2023-06-14T21:56:04Z

I would call that not easy to lint, parse and sanitize ...

Also it would increase yaml config more ...

The more i think and talk about this with others the more i think we should have a workflow flag "parallel: true" and depends_on ... for the DAG

lumpi101 · 2023-06-17T09:57:31Z

One more idea: to execute steps in parallel or not can be decided on whether the steps in the yaml are defined as a list or a dictionary.

A list is sequential by nature and thus means that all listed steps are executed sequentially in that order.
A dictionary has no ordering (also not in the yaml-specs afaik) and thus means that all steps are run in parallel, constrained by the "depends_on" markers.

This is of course not backwards compatible und would require some versioning of the pipeline syntax.
Also, having a dictionary, step names cannot be omitted which makes some kind of sense because autonaming steps with some enumeration could pretend some ordering which is not the case then.

A modified example from the docs...

parallel execution with DAG:

 version: 2
 steps:
   backend:  # <-- step names are mandatory
     image: golang
     commands:
       - go build
       - go test
   frontend:
     image: node
     commands:
       - npm install
       - npm run test
       - npm run build
   publish:
     image: plugins/docker
     depends_on: [backend, frontend]  # <-- if we would omit this, all 3 steps would be executed in parallel
     repo: octocat/hello-world

sequential execution:

 steps:
   - name: backend  # <-- step names are optional (afaik)
     image: golang
     commands:
       - go build
       - go test
   - name: frontend
     image: node
     commands:
       - npm install
       - npm run test
       - npm run build
   - name: publish
     image: plugins/docker
     repo: octocat/hello-world

anbraten · 2023-06-19T17:28:54Z

In drone, it is also sequential until you add the first depends_on to a step, at which point it becomes a DAG. So in a sense, it's compatible.

This sounds like the best option to me atm.

ghost91- · 2023-06-20T00:38:52Z

In drone, it is also sequential until you add the first depends_on to a step, at which point it becomes a DAG. So in a sense, it's compatible.

This sounds like the best option to me atm.

That’s also more or less, how it works in GitLab CI: By default, everything is sequential, and jobs in the same stage are executed in parallel (that basically matches woodpecker's group). But as soon as you define the needs property for a job, the DAG handling kicks in, and the job is executed outside of the default order, as soon as all jobs listed in needs have completed. In particular, if you specify needs: [], the job starts immediately, regardless of its position.

See https://docs.gitlab.com/ee/ci/yaml/index.html#needs.

So I think this would be a pretty good pattern, since both users of GitLab CI and Drone are already familiar with it.

6543 · 2023-06-20T05:20:17Z

I personally think needs: [] is bad to de-serialize ... from yaml perspective

see: https://go.dev/play/p/JziUknAwKvY

and yes I know there is a golang trick to achive it ... -> https://go.dev/play/p/M1XlwhSUZn_X

but yaml do exist outside of golang to and each parser do behave bit different if it belongs to this edgecases

ghost91- · 2023-06-20T06:41:49Z

I personally think needs: [] is bad to de-serialize ... from yaml perspective

see: https://go.dev/play/p/JziUknAwKvY

and yes I know there is a golang trick to achive it ... -> https://go.dev/play/p/M1XlwhSUZn_X

but yaml do exist outside of golang to and each parser do behave bit different if it belongs to this edgecases

To be honest, I think it’s weird that parsers would not implement this in a way so that combining deserialisation and serialisation results in the identity. There is obviously a difference between a property being empty, and it not existing at all in yaml, so parsers should respect that, imo.

Doing it the other way round, as you suggest, and making the "empty array" case and the "not set" case behave equally, is extremely surprising from a user perspective: Users could still write depends_on: [], but I suspect that hardly anybody would interpret this as "has the default behavior, i.e., actually depends on all previously defined steps", in particular because it has a different meaning in other (but similar) contexts (Drone, GitLab CI).

By the way, GitHub actions circumvents this problem by executing jobs in parallel, by default, instead of sequentially:

A workflow run is made up of one or more jobs, which run in parallel by default. To run jobs sequentially, you can define dependencies on other jobs using the jobs.<job_id>.needs keyword.

(https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobs)

So the empty array case and and the property missing case do have the same meaning semantically. But not sure if it would be worth a breaking change to change that default in woodpecker…

6543 added pipeline-config related to pipeline config files (.yaml) breaking will break existing installations if no manual action happens labels Jun 13, 2023

6543 mentioned this issue Jun 13, 2023

RFC: Pipeline config #567

Closed

7 tasks

6543 mentioned this issue Jun 13, 2023

Add Changelog for v1.0.0 RC1 #1854

Merged

6543 changed the title ~~drop step.group in favour of need~~ drop step.group in favour of depends_on to create a DAG Jun 14, 2023

6543 added this to the 1.1.0 milestone Jun 19, 2023

6543 mentioned this issue Jun 23, 2023

Document key-priorieties for development #1884

Closed

6543 mentioned this issue Aug 9, 2023

WIP: Represent step dependency as dag in backend #2184

Closed

pat-s modified the milestones: 2.0.0, 2.x.x Oct 13, 2023

anbraten mentioned this issue Nov 6, 2023

Add depends_on support for steps #2771

Merged

6 tasks

anbraten closed this as completed in #2771 Dec 24, 2023

qwerty287 modified the milestones: 3.x.x, 2.1.0 Dec 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

drop step.group in favour of depends_on to create a DAG #1860

drop step.group in favour of depends_on to create a DAG #1860

6543 commented Jun 13, 2023

6543 commented Jun 13, 2023

lafriks commented Jun 13, 2023

6543 commented Jun 13, 2023

6543 commented Jun 13, 2023 •

edited

Loading

6543 commented Jun 13, 2023

silverwind commented Jun 14, 2023 •

edited

Loading

6543 commented Jun 14, 2023

silverwind commented Jun 14, 2023 •

edited

Loading

6543 commented Jun 14, 2023

lumpi101 commented Jun 17, 2023

anbraten commented Jun 19, 2023

ghost91- commented Jun 20, 2023 •

edited

Loading

6543 commented Jun 20, 2023

ghost91- commented Jun 20, 2023 •

edited

Loading

drop step.group in favour of depends_on to create a DAG #1860

drop step.group in favour of depends_on to create a DAG #1860

Comments

6543 commented Jun 13, 2023

6543 commented Jun 13, 2023

lafriks commented Jun 13, 2023

6543 commented Jun 13, 2023

6543 commented Jun 13, 2023 • edited Loading

6543 commented Jun 13, 2023

silverwind commented Jun 14, 2023 • edited Loading

6543 commented Jun 14, 2023

silverwind commented Jun 14, 2023 • edited Loading

6543 commented Jun 14, 2023

lumpi101 commented Jun 17, 2023

anbraten commented Jun 19, 2023

ghost91- commented Jun 20, 2023 • edited Loading

6543 commented Jun 20, 2023

ghost91- commented Jun 20, 2023 • edited Loading

6543 commented Jun 13, 2023 •

edited

Loading

silverwind commented Jun 14, 2023 •

edited

Loading

silverwind commented Jun 14, 2023 •

edited

Loading

ghost91- commented Jun 20, 2023 •

edited

Loading

ghost91- commented Jun 20, 2023 •

edited

Loading