Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trusted publishing: Support for GitHub reusable workflows #11096

Open
woodruffw opened this issue Apr 1, 2022 · 23 comments
Open

Trusted publishing: Support for GitHub reusable workflows #11096

woodruffw opened this issue Apr 1, 2022 · 23 comments
Assignees
Labels
feature request security Security-related issues and pull requests trusted-publishing

Comments

@woodruffw
Copy link
Member

Our MVP implementation (#10753) assumes that the workflow is in the same repository, which is not necessarily true.

We should support reusable workflows, specifically via the job_workflow_ref claim.

@woodruffw
Copy link
Member Author

Just as a note: this will have to interoperate with #11263.

@GergelyKalmar
Copy link

Any chance to expedite this issue? It seems there's many who would be eager to move over to trusted publishing now that it is the recommended approach, but not being able to do the flow from a reusable workflow is a pretty major blocker. This issue also does not seem to be documented anywhere so we only find out about it once we changed everything to use the new approach.

@woodruffw
Copy link
Member Author

Any chance to expedite this issue?

I can't personally promise a timeline here, but it is on my backlog of things to do.

This issue also does not seem to be documented anywhere so we only find out about it once we changed everything to use the new approach.

Thank you for pointing this out; I'll make some changes to the docs to emphasize that reusable workflows are not currently supported.

@woodruffw
Copy link
Member Author

Thank you for pointing this out; I'll make some changes to the docs to emphasize that reusable workflows are not currently supported.

I've opened #13592 for this. Thanks again for bringing it to our attention!

@di di changed the title Support for OIDC publishing from reusable workflows Trusted publishing: Support for GitHub reusable workflows Jun 7, 2023
mhils added a commit to mitmproxy/mitmproxy_rs that referenced this issue Jun 13, 2023
mhils added a commit to mhils/workflows that referenced this issue Jun 13, 2023
h3mmy added a commit to tyg3rr/Epi_Helper that referenced this issue Jun 25, 2023
h3mmy added a commit to tyg3rr/Epi_Helper that referenced this issue Jun 25, 2023
Switch to using API Token pending pypi/warehouse#11096
@webknjaz
Copy link
Member

@woodruffw I just stumbled upon an example of a reusable workflow actually working with OIDC here jorisroovers/gitlint#486 — could you confirm this is because of secrets: inherit at https://github.com/jorisroovers/gitlint/blob/5247839/.github/workflows/ci.yml#L225 ?

@GergelyKalmar
Copy link

GergelyKalmar commented Jun 29, 2023

That workflow is in the same repository though, I think the issue is with reusable workflows that are in a different repository (e.g. in a separate public repo).

@woodruffw
Copy link
Member Author

Yeah, I think that's because it's in the same repo, not because of secrets: inherit.

In other words: reusable workflows do work with the current implementation, just not reusable workflows that are external to the repository that the trusted publisher was configured with.

@webknjaz
Copy link
Member

That workflow is in the same repository though, I think the issue is with reusable workflows that are in a different repository (e.g. in a separate public repo).

Yes, you can't use inherit across orgs. The issue was that the OIDC context has a reusable workflow reference and that's currently not checked IIUC. So my concern is that it might be possible to fool the verification logic in PyPI with this...

@GergelyKalmar
Copy link

You can get around the inherit issue between organizations via forking, which is what we would usually do. So same org, different repo would be fine. The point of reusable workflows in this case would be that you store them in a central repository and you can re-use them in other repositories.

anoto-moniz added a commit to CitrineInformatics/citrine-python that referenced this issue May 15, 2024
The current version of the PyPI GitHub Action deploy workflow assumes
the calling workflow is in the repo to be deployed when using Trusted
Publishing: pypi/warehouse#11096

As such, putting the deploy workflow in a shared location is not
supported, and leads to errors.
anoto-moniz added a commit to CitrineInformatics/citrine-python that referenced this issue May 15, 2024
The current version of the PyPI GitHub Action deploy workflow assumes
the calling workflow is in the repo to be deployed when using Trusted
Publishing: pypi/warehouse#11096

As such, putting the deploy workflow in a shared location is not
supported, and leads to errors.
anoto-moniz added a commit to CitrineInformatics/citrine-python that referenced this issue May 15, 2024
The current version of the PyPI GitHub Action deploy workflow assumes
the calling workflow is in the repo to be deployed when using Trusted
Publishing: pypi/warehouse#11096

As such, putting the deploy workflow in a shared location is not
supported, and leads to errors.
anoto-moniz added a commit to CitrineInformatics/common-gh-actions that referenced this issue May 15, 2024
The deploy workflow cannot be shared because the PyPI GitHub Action
deploy workflow cannot be shared when using Trusted Publishers.

See pypi/warehouse#11096
@facutuesca
Copy link
Contributor

facutuesca commented Jul 31, 2024

I've been looking into how we can implement this. I've written up the relevant facts about how GitHub and PyPI currently handle OIDC claims, and at the end I specify what adding support for reusable workflows would look like in light of those facts.

GitHub OIDC claims with (non-)reusable workflows

  1. Reusable workflows are workflows that can be called from another workflow
  2. There can be up to 4 levels of connected workflows, i.e: top-level-workflow.yml -> reusable-workflow-1.yml -> reusable-workflow-2.yml -> reusable-workflow-3.yml.
  3. These reusable workflows can be located in external repositories.
  4. When GitHub generates an OIDC token, the token will contain claims about the workflow that requested the token.
  5. When it's a reusable workflow, the claims will only include information about top-level-workflow.yml and reusable-workflow-3.yml. That is, only about the top-level and bottom-level workflows.
  6. The claims that contain the ref paths for those two workflows are workflow_ref and job_workflow_ref (top-level and bottom-level respectively). For example:
    {
        "workflow_ref": "org/repo/.github/workflows/top-level-workflow.yml@refs/heads/main",
        "job_workflow_ref": "org2/repo2/.github/workflows/reusable-workflow-3.yml@v1",
    }
  7. When the token is requested by a non-reusable workflow (a single workflow that doesn't call other workflows) those two claims will be equal, containing the path ref of that single workflow.
  8. There are two more claims for the commit hash of the workflows in the previous claims:
    {
        "workflow_sha": "c2698398906ab70572c7878a11debb44ff1d2241",
        "job_workflow_sha": "10040c56a8c0253d69db7c1f26a0d227275512e2",
    }
  9. There is one claim that contains the ref (e.g: refs/heads/main) corresponding to the top-level workflow, but not for the bottom-level workflow:
    {
        "ref": "refs/heads/main",
    }

PyPI verification of OIDC claims for GitHub workflows

  1. The user creates a Trusted Publisher by specifying a repository owner, a repository name, and a workflow filename (and optionally an environment, which we'll ignore for this discussion since it's not relevant).

  2. When PyPI receives an OIDC token, it checks its job_workflow_ref claim against the fields mentioned above. For example:

    {
        "job_workflow_ref": "org/repo/.github/workflows/release.yml@refs/heads/main",
    }

    will pass validation for a Trusted Publisher configured with:

    repository_owner = "org"
    repository_name = "repo"
    workflow_filename = "release.yml"
  3. PyPI will also check that the ref in jobs_workflow_ref (in the example above, refs/heads/main) matches the ref claim, which we saw in (9) corresponds to the top-level workflow.

Note how our checks in (11) and (12) mix claims about the bottom-level workflow (jobs_workflow_ref) with claims about the top-level workflow (ref). This is fine for non-reusable workflows, since the claims refer all to the same, single workflow. However, it's problematic when starting to consider reusable workflows.

Note also that we accidentally support a subset of reusable workflows: when the top-level and bottom-level workflow are located in the same repo, and the bottom-level workflow is called in the same branch/tag/sha as the top-level one, then the OIDC claims check will pass. Concretely, if the claims look like this:

        "workflow_ref": "org/repo/.github/workflows/top-level-workflow.yml@refs/heads/main",
        "job_workflow_ref": "org/repo/.github/workflows/reusable-workflow-3.yml@refs/heads/main",
        "ref": "refs/heads/main",

and the user configured the workflow filename as reusable-workflow-3.yml, then verification will pass.


Adding support for reusable workflows

So, what does all of this mean?

First, we want to change the existing OIDC claims verification so that it checks against the top-level workflow instead of the bottom-level one (from job_workflow_ref to workflow_ref). This should be transparent for most users, but it will break the Trusted Publishers that depend on the (accidental) support for reusable workflows mentioned above. In order to understand how many users this would impact, we will measure how many Trusted Publishers are currently being used with reusable workflows (see PR here: #16364).

Second, in order to fully support reusable workflows we need the user to input 3 new pieces of information: the owner and name of the repository where the reusable workflow lives, and the filename of said workflow. Once we have that, we can compare incoming OIDC tokens against the old information (owner, repo and filename of top-level workflow) and against the new information (owner, repo and filename of the bottom-level workflow).

Third, note how we can trivially migrate existing Trusted Publishers without user intervention, by just copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

And fourth, from the web UI perspective, the user should not need to fill the 3 new fields when creating a Trusted Publisher for a non-reusable workflow. Rather, those new fields can be hidden behind a "Reusable workflow" checkbox, and when left empty they will default to the same values as the old 3 fields (for the same reason as the point above).

cc @woodruffw

@mhils
Copy link
Contributor

mhils commented Jul 31, 2024

Thank you for the detailed write-up, @facutuesca! 😃 🍰

Is there a reason why we cannot match on workflow_ref (the top-level workflow) only? What type of attack are we mitigating by checking jobs_workflow_ref as well?

@facutuesca
Copy link
Contributor

facutuesca commented Jul 31, 2024

Is there a reason why we cannot match on workflow_ref (the top-level workflow) only? What type of attack are we mitigating by checking jobs_workflow_ref as well?

The scenario we're trying to avoid is having overscoped Trusted Publishers (which is why we suggest having a dedicated, publish-only workflow). In the case of only checking for top-level workflows, imagine you have a project with a top-level ci.yml workflow that calls a bunch of other workflows (in the same repo), one of which is publish.yml:

ci.yml -> tests.yml
       -> lint.yml
       -> deploy.yml
       -> build.yml
       -> publish.yml

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

@webknjaz
Copy link
Member

@facutuesca and the environment name matching remains the same, right?

@jaraco
Copy link
Contributor

jaraco commented Jul 31, 2024

Third, note how existing Trusted Publishers can be trivially migrated by copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

For my use-case, I've been holding off using trusted publishing in environments that have reusable workflows until this issue is solved, so I won't have any opinion on the migration path. If you think it would be valuable to employ trusted publishing on such a repo, I'd be open to exploring that and providing feedback on the migration path, but my preference would be to await a durable solution.

@facutuesca
Copy link
Contributor

facutuesca commented Jul 31, 2024

@facutuesca and the environment name matching remains the same, right?

@webknjaz Yes, the environment check will still be against the environment claim sent by GitHub


@jaraco

Third, note how existing Trusted Publishers can be trivially migrated by copying the existing 3 fields into the new ones, since for non-reusable workflows they should be equal.

For my use-case, I've been holding off using trusted publishing in environments that have reusable workflows until this issue is solved, so I won't have any opinion on the migration path. If you think it would be valuable to employ trusted publishing on such a repo, I'd be open to exploring that and providing feedback on the migration path, but my preference would be to await a durable solution.

Ah I think I wasn't super clear there. I was referring to the migration from the POV of PyPI: when adding the 3 new columns to the database, we can fill the values with a copy of the existing information (owner, repo, filename). This will be completely transparent to the users, they won't need to migrate or change anything.
I'll reword it to make that point explicit, thanks!

@mhils
Copy link
Contributor

mhils commented Jul 31, 2024

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

Thanks, that makes sense! I'm not fully convinced though:

  1. The problem you described applies just as much to non-reusable ci.yml workflows which may be just as overscoped, no? I don't see an inherent reason why reusable workflows would need additional checks.
  2. Adding an extra checkbox and three new fields adds complexity. Putting the PyPI implementation cost aside for a moment, this makes Trusted Publishing harder to set up for users.
  3. If the only goal is to enforce a narrower scope, why not nudge or require the use of environments for new configurations? That helps both non-reusable and reusable workflows. I feel this would be just as effective and would not require users to figure out which workflow path actually goes into which text input.

Is there anything terribly wrong with changing the check PyPI is currently doing from job_workflow_ref to workflow_ref? My understanding is that reusable workflows would then "just work" while being as secure as non-reusable ones1.

Footnotes

  1. The only thing that breaks would be the "accidental support for a subset of reusable workflows" mentioned in https://github.com/pypi/warehouse/issues/11096#issuecomment-2260686299. Depending on what the metrics show, this is either a non-issue or can be easily worked around with an additional check for this case.

@facutuesca
Copy link
Contributor

If we only match on workflow_ref (that is, ci.yml), that means all of the workflows there will be able to publish packages. By adding the extra constraint of job_workflow_ref, we can ensure it's only publish.yml.

Thanks, that makes sense! I'm not fully convinced though:

  1. The problem you described applies just as much to non-reusable ci.yml workflows which may be just as overscoped, no? I don't see an inherent reason why reusable workflows would need additional checks.

True, but I would argue that a parent-level workflow will be overscoped almost by definition, whereas a non-reusable workflow can be made minimally scoped. The argument is: if you only check the top-level ci.yml that means that all current and any future workflows called from ci.yml, both internal but also from external repos, will be allowed to upload packages.

I think it makes sense there to give the user the ability to narrow it down to a single reusable workflow.

  1. Adding an extra checkbox and three new fields adds complexity. Putting the PyPI implementation cost aside for a moment, this makes Trusted Publishing harder to set up for users.

True, but only for those who want to set up reusable workflows (the three new fields would be hidden until the user checks the checkbox). So the user experience for the currently supported use case (non-reusable workflows) should be the same, and the added complexity is for the users who want to use the new feature (reusable workflows)

  1. If the only goal is to enforce a narrower scope, why not nudge or require the use of environments for new configurations? That helps both non-reusable and reusable workflows. I feel this would be just as effective and would not require users to figure out which workflow path actually goes into which text input.

This might be an option. IME though, most projects don't use environments, so making it required would (arguably) be harder for users since they would have to set it up. I'm open to it though, since I might be wrong here.

Is there anything terribly wrong with changing the check PyPI is currently doing from job_workflow_ref to workflow_ref? My understanding is that reusable workflows would then "just work" while being as secure as non-reusable ones1.
The only thing that breaks would be the "accidental support for a subset of reusable workflows" mentioned in

As you mentioned, one issue is breaking currently working Trusted Publishers, which we'll see if it's actually a problem once we have those metrics. The other issue is the discussion we've been having above: if we only check for workflow_ref, repos that use reusable workflows will have a very overscoped Trusted Publisher (any workflows called by the top-level workflow, internal or external, current and future, will have permissions to upload).

@mhils
Copy link
Contributor

mhils commented Jul 31, 2024

True, but I would argue that a parent-level workflow will be overscoped almost by definition, whereas a non-reusable workflow can be made minimally scoped.

Reusable workflows can be made minimally scoped just as well. You may be right that folks are more likely to overscope here, but I don't have any data on this. 🙂

This might be an option. IME though, most projects don't use environments, so making it required would (arguably) be harder for users since they would have to set it up. I'm open to it though, since I might be wrong here.

FWIW I think security-wise environments would be the superior solution here. For example, we have (relatively speaking) lots of folks with write access to the main branch, but only a few selected individuals can approve deployments (and thus push to PyPI).

The one major downside I can think of is that environments are not available for free accounts in private repositories. So that makes it hard to enforce as a strict requirement. If that's a major concern, job_workflow_ref may be the better solution.

Thank you for pushing this entire thing forward! 🎉

@GergelyKalmar
Copy link

I'd definitely prefer the implementation where we can scope on the reusable workflow level, I think that the few optional fields in the setup UI would be totally fine and it's unlikely to confuse anyone.

Environments are great, but there are lots of organizations that already have an established CI/CD setup that does not use environments and have no need for it otherwise, so requiring it just to use this feature would be probably suboptimal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request security Security-related issues and pull requests trusted-publishing
Projects
None yet
Development

No branches or pull requests

8 participants