Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: across step #29

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
199 changes: 199 additions & 0 deletions 029-across-step/proposal.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@
# `across` step

This proposal introduces a mechanism by which a build plan can be executed
multiple times with different `((vars))`, in service of build matxies, pipeline
matrixes, and - building on [`((var))` sources][var-sources-rfc] - dynamic
variations of both.


## Motivation

* Support dynamic multi-branch workflows:
[concourse/concourse#1172][multi-branch-issue]
* Support build matrixes and pipeline matrixes.
* Replace a common, and very flawed, usage pattern of `version: every`, which
has proven to be very complicated to support.


## Proposal

The `across` step modifier is given a list containing vars and associated
values to set when executing the step. The step will be executed across all
combinations of var values.

### Static var values

The set of values can be static, as below:

```yaml
task: unit
vars: {go_version: ((.:go_version))}
vito marked this conversation as resolved.
Show resolved Hide resolved
across:
- var: go_version
values: [1.12, 1.13]
```

This will run the `unit` task twice: once with `go_version` set to `1.12`, and
again with `1.13`.

### Dynamic var values from a var source

Rather than static values, var values may be pulled from a var source
dynamically:

```yaml
var_sources:
- name: booklit-prs
type: github-prs
config:
repository: vito/booklit
access_token: # ...

plan:
- set_pipeline: pr
instance_vars: {pr_number: ((.:pr.number))}
across:
- var: pr
source: booklit-prs
Copy link

@aoldershaw aoldershaw Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you thought about allowing the across step to iterate over objects from an arbitrary prototype (i.e. not just constrained to var_source prototypes and their list message)? That could make the object streams returned by prototypes more useful outside of the special prototype interfaces that Concourse will interact with natively (resources, var_sources, notifications, etc.)

In this comment concourse/concourse#5936 (comment), you suggested that the run step could possibly support a set_var field to store the returned object as a local variable. Perhaps an alternative to set_var would be to use the across step in its place, e.g.

- across:
  - var: some-var
    run: eval
    type: gval
    params: {fields: {some-field: hello}}
  run: echo
  type: debug
  params: {message: ((.:some-var.some-field))}

Kind of weird in this case since we're iterating over a single value always - but I guess for set_var to work in general, it would also have to set the var to a list of objects (even if a single object is emitted)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also makes me realize that the across step can be used for conditional portions of build plans, by having the list of values be either 0 or 1 element long 🤔

Copy link

@aoldershaw aoldershaw Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also making me wonder about dynamic build plans in the context of the across step (i.e. the list of values to iterate over can be determined at runtime) - what are your thoughts on that? I think it could greatly extend the use cases that across can support, but it could possibly make pipelines difficult to understand since you could introduce arbitrarily complicated control flow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have, and your comments are pretty much the same thought process I went thought. 😄 It seems like a really cool idea, but yeah, the difficulty is that it seems to require some sort of dynamic build plan construct.

I'm running into another area that would benefit from dynamic build plans: the way file: and image_resource: interact on a task step. I'm working on a refactor that pulls image fetching out into explicit check and get steps in the build plan, but the existence of file: means I can't know whether they're needed (Darwin and Windows have no images, for example), or what image to even check and get, until the build is running. In this case I can probably just preemptively configure steps which may just do no-ops, and they can probably get their config in a similar way to how the Get plan uses VersionFrom to get some config from a prior step, but it all feels kind of held together with duct tape. (There's also a bit of complexity in that we may need to check + get a custom type that the image_resource: uses, but I think have a way to get around that.)

I still don't see a super obvious way to implement dynamic build plans though. 🤔 But I haven't really tried, either, I guess.

Copy link

@aoldershaw aoldershaw Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thought I had was for there to be some sort of build event that says "inject this build plan under this plan ID" or "replace this plan ID with this build plan". So, the plan would initially look something like:

{
  "schema": "exec.v2",
  "plan": {
    "id": "across-id",
    "across": {}
  }
}

and there could be a build event like:

{
  "event":"dynamic-build-plan",
  "version":"1.0",
  "data": {
    "id": "across-id",
    "across": {
      "some": "data"
    }
  }
}

which the UI would interpret to construct a final build plan of:

{
  "schema": "exec.v2",
  "plan": {
    "id": "across-id",
    "across": {
      "some": "data"
    }
  }
}

I'm sure there are some technical challenges I haven't considered, though (/it may not work at all!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems close to what I'm proposing - whatever the step results in gets used as the value. I think it'd just look like this with the syntax I had before:

var: foo
do:
- get: data
  trigger: true
- task: filter-list
  file: ci/tasks/filter-list.yml
  input_mapping: {input: data}
  output_mapping: {output: filtered}
- load_var: filtered
  file: filtered/data.json

...but this is just a different syntax. values_from is probably clearer. I think we'd still need to handle values: ((.:foo)), but that could probably be handled by the across step too instead of a separate values: step.

In either case, I'm not sure how the across step would access the result in the case of a do. 🤔 Currently results are stored under step IDs stored in the RunState, but this is almost like a return value from an expression. With a single step it's obvious what step ID to use, but with a do it's not so easy. It's almost like we need (exec.Step).Run to be able to return a value on its own, and do would return the last value. But that changes the interface.

Maybe this whole storing-results-in-RunState pattern could/should change? @clarafu @chenbh Has your recent experience in this area led to a similar line of thinking?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, also re: get_vars vs list_vars, yeah depending on what the resulting value is list_vars might be more accurate. I think either get_vars would actually fetch all the var values individually, using a list call under the hood for the scheduling/triggering, or we would have list_vars instead and allow the user to run a separate get_var step for each var value if that's desired. Don't know which; haven't put much thought into it yet.

Copy link

@aoldershaw aoldershaw Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was proposing is a bit more complicated than just a multi-step values_from (where the value comes from the last step), but I didn't have the proper framing. Now, I realize it the proposal was actually a combination of two ideas:

  1. A new way to communicate between jobs by emitting/receiving data
  2. Anonymous jobs (run nested in other jobs)

I'll flesh out my original example a bit more:

jobs:
- name: parent
  plan:
  - get: ci
  - across:
    - var: foo
      values_from:
        do:  # <----------------------------- A
        - get: data
          trigger: true # <------------------ B
        - task: filter-list
          file: ci/tasks/filter-list.yml
          input_mapping: {input: data}
          output_mapping: {output: filtered}
        - load_var: filtered # <------------- C
          file: filtered/data.json
    ...

Here, A is an "anonymous job". When the trigger on B gets a new version, the parent job doesn't trigger - only the anonymous job A. The load_var C implicitly emits the value it loads, and parent implicitly receives and triggers on changes to this value.

The implicit emitting and receiving is very non-obvious, though, and the anonymous job idea is a bit odd (either we make it fully self-contained and need to duplicate the get: ci, or it magically can borrow artifacts from the parent job). What if instead we made it explicit and gave the filtering it's own job:

jobs:
- name: filter-list
  plan:
  - get: ci
  - get: data
    trigger: true
  - task: filter-list
    file: ci/tasks/filter-list.yml
    input_mapping: {input: data}
    output_mapping: {output: filtered}
  - load_var: filtered
    file: filtered/data.json
    emit: data # <--------------------- A

- name: parent
  plan:
  - get: ci
  - receive: data # <------------------ B
    job: filter-list
    trigger: true
  - across:
    - var: foo
      values: ((.:data))
    ...

A emits a new value and stores it as data. B triggers on changes to that emitted value and stores the result in a local variable.

This mechanism could even be used for {get_var: ..., trigger: true} - the builds we run periodically to fetch the var could emit the value to the DB, and the job would receive that changed value from the build.

EDIT:

This could even possibly be used for {get: ..., trigger: true} - the periodic check builds could emit the latest version, and the generated plan could effectively be:

- receive: version # from the check build, somehow
  trigger: true
- get: version

Wouldn't work with version: every or pinned/disabled versions, though, so doesn't make much sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In either case, I'm not sure how the across step would access the result in the case of a do. 🤔 Currently results are stored under step IDs stored in the RunState, but this is almost like a return value from an expression. With a single step it's obvious what step ID to use, but with a do it's not so easy. It's almost like we need (exec.Step).Run to be able to return a value on its own, and do would return the last value. But that changes the interface.

I thought of it like every type of step would have a single output (e.g. check is version, get is image spec, load_var and get_var are var), in do's case maybe we just need to figure out the contract for what it should return (is it time to introduce a result step/modifier)?

It would then be up to across to figure out what to do with this info (which might be wild cause that implies we can run across a list of versions from a check).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if I am adding anything useful to the conversation, but when I saw the the 7.0.0 release notes I immediately saw a good use case for one of our pipelines and I tried to build a working prototype using pipeline instances and the across step.

- name: set-pipeline-instances
  serial: true
  plan:
    - in_parallel:
      - get: ci
      - get: src-branches
        trigger: true
    ...
    - load_var: branches
      file: branches-json/branches.json
    - set_pipeline: {{ (datasource "config").name }}
      file: ci/ci/build.yml
      instance_vars: {branch: ((.:b))}
      across:
        - var: b
          values: ((.:branches))

I am keen for the normal resource type to trigger the job and then use the load_var to load a variable that is a list and then just use that for for the argument to values.

```

The above example will run the `set_pipeline` step across the set of all GitHub
PRs, returned through a `list` operation on the `booklit-prs` var source.
vito marked this conversation as resolved.
Show resolved Hide resolved

### Running across a matrix of var values

Multiple vars may be listed to form a matrix:

```yaml
set_pipeline: pr-go
instance_vars:
pr_number: ((.:pr.number))
go_version: ((.:go_version))
across:
- var: pr
source: booklit-prs
- var: go_version
values: [1.12, 1.13]
```

This will run 2 * (# of PRs) `set_pipeline` steps, setting two pipelines per
PR: one for Go 1.12, and one for Go 1.13.

### Controlling parallelism with `max_in_flight`
vito marked this conversation as resolved.
Show resolved Hide resolved

By default, the steps are executed serially to prevent surprising load
increases from a dynamic var source suddenly returning a ton of values.

To run steps in parallel, a `max_in_flight` must be specified as either `all`
or a number - its default is `1`. Note: this value is specified on each `var`,
rather than the entire step.

With `max_in_flight: all`, no limit on parallelism will be enforced. This would
be typical for when a small, static set of values is specified, and it would be
annoying to keep the number in sync with the set:

```yaml
task: unit
vars: {go-version: ((.:go-version))}
across:
- var: go-version
values: [1.12, 1.13]
max_in_flight: all
```

With `max_in_flight: 3`, a maximum of 3 var values would be executed in
parallel. This would be typically set for values coming from a var source,
which may change at any time, or especially large static values.

```yaml
set_pipeline: pr
instance_vars: {pr_number: ((.:pr.number))}
across:
- var: pr
source: booklit-prs
max_in_flight: 3
```

When multiple `max_in_flight` values are configured, they are multiplicative,
building on the concurrency of previously listed vars:

```yaml
set_pipeline: pr
instance_vars:
pr_number: ((.:pr.number))
go_version: ((.:go_version))
across:
- var: pr
source: booklit-prs
max_in_flight: 3
- var: go_version
values: [1.12, 1.13]
max_in_flight: all
```

This will run 6 `set_pipeline` steps at a time, focusing on 3 PRs and setting
Go 1.12 and Go 1.13 pipelines for each in parallel.

Note that setting a `max_in_flight` on a single `var` while leaving the rest as
their default (`1`) effectively sets an overall max-in-flight.

### Triggering on changes

With `trigger: true` configured on a var, the build plan will run on any change
to the set of vars - i.e. when a var value is added, removed, or changes.

```yaml
var_sources:
- name: booklit-prs
type: github-prs
config:
repository: vito/booklit
access_token: # ...

plan:
- set_pipeline: pr
instance_vars: {pr_number: ((.:pr.number))}
across:
- var: pr
source: booklit-prs
trigger: true
```

Note that this can be applied to either static `values:` or dynamic vars from
`source:` - both cases just boil down to a comparison against the previous
build's set of values.

## Open Questions

* n/a
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohannesRudolph Sorry, completely forgot to address your point! (Raising this as a review comment so the replies can be threaded.)

Would it be sufficient to allow a static list of combinations to skip to be provided?

across:
- var: go
  values: [1.16, 1.15, 1.8]
- var: platform
  values: [darwin, linux, windows]
except:
- platform: windows
  go: 1.18

This would work similarly to excluding jobs in Travis CI, i.e. a subset of vars may be specified, and all variations with those vars will be skipped.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related thought: it would be kind of cool if we could except combinations based on a relationship between variables. e.g. if you want to test various upgrade paths, you only want to test combinations of from and to where to > from. I have no idea how this would work - it might just be a use case for var sources rather than except

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that didn't really come up in the proposal is how we handle steps that emit artifacts (e.g. get, task.outputs, prototype outputs). Currently, the implementation of the static across step runs each iteration in its own local artifact scope. This means that any artifacts go out of scope past the across step and cannot be used.

It'd be cool if there was a way to "fan-in" on artifacts emitted within an across step. For instance, suppose there was a prototype (or just a task) for compiling a Go binary that outputs an artifact called binary. You might want to compile it for multiple platforms/architectures, which could make use of the across step:

across:
- var: platform
  values: [linux, darwin, windows]
- var: arch
  values: [amd64, arm64]
run: compile
type: go
params:
  package: repo/cmd/my-program
  platform: ((.:platform))
  arch: ((.:arch))
outputs: [binary]

(side-note: it might make more sense from a performance PoV for a go prototype to support compiling multiple platforms/architectures in a single run step where possible, but bear with me for the example)

The trouble is - under the current implementation, there's no way to access binary outside of the across step. Even if we used a shared artifact scope for each iteration, it'd just be "last write wins" (so, with no parallelism, binary would be the result for (windows, arm64) only) - since they all share a name, the results would clobber each other. This could be avoided by encoding more information in the output name via an output_mapping (e.g. name the artifacts binary-linux-amd64, binary-linux-arm64, ...) - but that only works for simple values like strings/numbers.

So, what if we let artifacts in an across step share a name, but be uniquely identified by the set of vars that were used to produce that artifact (in the same way that instanced pipelines share a name and are uniquely identified by their instance_vars). A possible syntax for referencing the binary created for (linux, amd64) could be binary{platform: linux, arch: amd64}

Here, binary really refers to a matrix of outputs (that mirrors the across matrix). What if we also provided a way to filter down that matrix to get at a subset of the outputs. e.g. binary{platform: windows} would give the (2) outputs built for windows, and binary{arch: arm64} would give the (3) outputs built for arm64.

So, suppose you wanted to upload a github release containing all 6 binaries produced in the matrix. You could do something like:

put: release
params:
  globs: [binary/*] # or maybe [binary{*}/*] to make it clear it's a matrix

Suppose you wanted to bundle the different architectures separately - then, you could do:

across:
- var: arch
  values: [amd64, arm64]
put: ((.:arch))-bundle
params:
  globs: [binary{arch: ((.:arch))}/*]

This may warrant a separate proposal but figured I'd bring it up here since it's definitely a limitation of the across step as it stands

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea, maybe we could re-use the concept of output_mappings. Similar to how a task defined in a file can have its outputs mapped by an output_mappings configuration in the pipeline referencing that task, an across step could have its outputs made available via a similar mapping.

As an example:

across:
- var: arch
  values: [amd64, arm64]
- var: version
  values: [1.0, 1.1]

task: create-artifact
output_mapping:
  output: task-artifact

across_output_mapping:
- vars: # match vars from an execution of the task
    arch: amd64
    version: 1.0
  outputs: # map an output from the modified step to an output of the across step
    task-artifact: amd64-bundle-v1.0

This would be a fairly verbose if you wanted to map every output from the matrix of steps:

# [...]
across_output_mapping:
- vars: {arch: amd64, version: 1.0}
  outputs: {task-artifact: amd64-bundle-v1.0}
- vars: {arch: arm64, version: 1.0}
  outputs: {task-artifact: arm64-bundle-v1.0}
- vars: {arch: amd64, version: 1.1}
  outputs: {task-artifact: amd64-bundle-v1.1}
- vars: {arch: arm64, version: 1.1}
  outputs: {task-artifact: arm64-bundle-v1.1}

Though it would possibly simpler to implement, without relying on prototypes or other Concourse functionality.

When, or if, this limitation on using vars in get: and put: steps (eg. get: bundle-((.:arch))-((.:version))) is lifted, the across_output_mapping could be updated to allow for the use of vars as well, simplifying the configuration to something like...

# [...]
across_output_mapping:
- outputs:
    task-artifact: ((.:arch))-bundle-v((.:version))



## New Implications

* Using `across` with var sources implies the addition of a `list` action for
listing the vars from a var source. We could build on this to show the list
of available vars in the UI, which would really help with troubleshooting
credential manager access and knowing what vars are available.

Obviously we wouldn't want to show credential values, so `list` should only
include safe things like credential paths.

* Combining the [`set_pipeline` step][set-pipeline-rfc], [`((var))`
sources][var-sources-rfc], [instanced pipelines][instanced-pipelines-rfc],
[pipeline archiving][pipeline-archiving-rfc], and the `across` step can be
used for end-to-end automation of pipelines for branches and pull requests:

```yaml
set_pipeline: pr
instance_vars: {pr_number: ((.:pr.number))}
across:
- var: pr
source: prs
trigger: true
```

[set-pipeline-rfc]: https://github.com/concourse/rfcs/pull/31
[instanced-pipelines-rfc]: https://github.com/concourse/rfcs/pull/34
[pipeline-archiving-rfc]: https://github.com/concourse/rfcs/pull/33
[var-sources-rfc]: https://github.com/concourse/rfcs/pull/39
[multi-branch-issue]: https://github.com/concourse/concourse/issues/1172