Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo package --workspace is not very useful #10948

Open
3 of 4 tasks
conradludgate opened this issue Aug 8, 2022 · 16 comments · Fixed by #13947
Open
3 of 4 tasks

cargo package --workspace is not very useful #10948

conradludgate opened this issue Aug 8, 2022 · 16 comments · Fixed by #13947
Labels
A-workspaces Area: workspaces C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review Z-package-workspace Nightly: package-workspace

Comments

@conradludgate
Copy link

conradludgate commented Aug 8, 2022

Tracking

Unstable flag: -Zpackage-workspace

Stabilizing this would also close

Implementation

Changes since original plan

  • Added--registry and --index flags to cargo package to know what registry will be used for generating the Cargo.lock file as if the internal dependencies were already published
  • cargo publish is not atomic but it does verify all before publish

Open questions

  • Are we ok with a slight compatibility breakage in cargo package? See cargo package --workspace is not very useful  #10948 (comment)
  • Are we ok stabilizing this and cargo publish multiple packages at once #1169 at the same time? Currently, they are behind the same flag
  • What is the desired behavior for the publish timeout? Publish workspace #14433 uploads the crates in batches (depending on the dependency graph), and we only timeout if nothing in the batch is available within the timeout, deferring the rest to the next wait-for-publish. So for example, if you have packages a, b, c then we'll wait up to 60 seconds and if only a and b were ready in that time, we'll then wait another 60 seconds for c.
  • What is the desired behavior when publishing some packages in a workspace that have publish = false? Publish workspace #14433 raises an error whenever any of the selected packages has publish = false, so it will error on cargo publish --workspace in a workspace with an unpublishable package. An alternative interface would implicitly exclude unpublishable packages in this case, but still error out if you explicitly select an unpublishable package with -p package-name (see -Zpackage-workspace is not smart about publish = false #14356). Publish workspace #14433 behavior is the most conservative one as it can change from an error to implicit excludes later.

Known issues

Problem

Let's say you have a workspace

workspace/
  package_a/
    Cargo.toml (name="a", version="0.1.0")
  package_b/
    Cargo.toml (name="a", version="0.1.0", [dependencies] a="0.1.0")
  Cargo.toml (members = ["package_a","package_b"])

If you have already published a to crates.io, then cargo package --workspace will complete successfully.

Now, you update a to 0.1.1, and update b to use that new minimum version. If you try cargo package --workspace again, it will no longer work. You will get an error that 0.1.1 was not found.

This happens because cargo package makes a dummy package for your verification and it strips out all workspace and path information. This means it tries to retrieve the a 0.1.1 from the registry, which it fails to do.

Proposed Solution

If you have a workspace where some package b depends on another package a, where a is both specified by a version AND a path, AND that path is within the workspace members, then the following will happen:

cargo package --workspace

will make the new dummy project to verify the crates, but it will leave in the path information for a within b.

Notes

In my experience, I've never had a workspace where all crates are independent. There's always at least one that depends on another crate. When managing private registries, it's not uncommon to cargo package and upload the .crate file manually.

Currently, it's necessary to package each workspace member individually and upload them in the order they depend.

Ideally, we can run a single

cargo package --workspace

after bumping all versions, then get all of the .crate files and upload them in a batch.

@conradludgate conradludgate added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Aug 8, 2022
@tomharmon
Copy link

This sounds very useful!

@tomharmon
Copy link

tomharmon commented Aug 11, 2022

I took a look. First time contributor to cargo so just trying to find the relevant parts of the codebase. It looks to me like build_lock() in src/cargo/ops/cargo_package.rs, then TomlManifest::prepare_for_publish() in src/cargo/util/toml/mod.rs might be where the logic of @conradludgate described may be:

This happens because cargo package makes a dummy package for your verification and it strips out all workspace and path information. This means it tries to retrieve the a 0.1.1 from the registry, which it fails to do.

So I believe that here we is where we would need to implement @conradludgate 's proposed solution, and to do so in the order the crates depend on another.

This is after a quick look at the code for the very first time ever, so I may be off base. I'm going to keep looking and try to confirm what I think. Also I'm on Zulip if people want to chat/mentor there

EDIT: After this Zulip discussion, due to the limited bandwidth of cargo team members new features should be accepted first before opening a PR. I am going to work on some issues raised in that thread that already have momentum to try my first contribution

@weihanglo
Copy link
Member

One background knowledge: cargo publish is roughly cargo package + call crates.io API. Therefore, some shared code path between cargo publish and cargo package might affect each other. The requirement of a dependency existence on a registry might come from what cargo publish needs: a .crate file is supposed to be distributable.

One thing a bit weird is that packaging a crate with all verifications but not publishing the path dependencies. If cargo permit that, it make me feel like we try to convince cargo that our build is verified but only locally. As aforementioned a .crate file is supposed to be distributable. This rule may not be true anymore if doing so. Maybe we need to think more on whether enabling packaging with unpublished path dependencies is on the right direction, but I'd say the issue you have does exist and people hit it.

At this time being, a good tool for releasing crates is cargo-release, created by sunny87 and epage. I haven't tried it but I believe it works. Perhaps @epage can give you more experiences on this topic :)

@weihanglo
Copy link
Member

BTW, this is somehow related to #9260 (comment). There are more discussions in the wild regarding a more sequential and smooth packaging/publishing process, though I cannot find them now.

@conradludgate
Copy link
Author

Cargo release is a good tool for workspaces. Unfortunately, we can't use it as-is since we don't have the ability to publish using cargo. However, its something we could fork and fix to work with our own registry

@epage epage added the S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review label May 6, 2024
@torhovland
Copy link
Contributor

torhovland commented May 9, 2024

@jneem and I are looking into this. Some of the aspects we are considering are:

  • It's not so much that cargo package makes a dummy package with paths stripped, but the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.
  • Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.
  • Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if cargo package could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?
  • Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to cargo publish where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (multi-package publishing).

If anybody has opinions on this, feel free to chime in.

@epage
Copy link
Contributor

epage commented May 10, 2024

@torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.

One way this could possibly be split up is

  • Implement this for --no-verify
  • Implement verification

It's not so much that cargo package makes a dummy package with paths stripped, but the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.

For verification, we should be verifying the generated .crate, rather than an intermediate to ensure maximum verification.

I wonder if we can inject patches rather than changing anything about the code. This can all be done in-memory, rather than writing it out to disk.

Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.

I have no preference between whether we package+verify one at a time or package all then verify all. We should likely do separate compilation per verify though. #5931 would speed up the compile times for this.

Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if cargo package could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?

There are times to depend on old versions of packages.

  • Dev-dependencies for comparing across versions (usually for major versions)
  • Semver hack (for major versions)
  • You can have access to a registry variant of a package through transitive dependencies, either semver compatible or not.

Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to cargo publish where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (#1169 (comment)).

This isn't relevant to this issue but the follow up one. Let's keep the conversations in each Issue focused and move this over to there.

@torhovland
Copy link
Contributor

Thanks for your input, it's been noted.

@torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.

Sure, we will show up there.

The rough idea we have so far is:

  1. Determine all intra-workspace path dependencies.
  2. Package each crate using the existing code (without trying to compile/build it, or otherwise resolve online dependencies).
  3. Unpack all packaged crates in a temporary directory, giving each one a random dir name.
  4. Go through all workspace dependencies from step 1 and fix them up in each crate. Can we do this in-memory?
  5. See if this builds.

@torhovland
Copy link
Contributor

Running into a chicken-and-egg problem with the lock files:

When packaging, Cargo strips away all path dependencies and generates lock files by expecting to find dependent packages online. So that elements like this can be put into the lock file:

[[package]]
name = "my-dep"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "780f1cebed1629e4753a1a38a3c72d30b97ec044f0aef68cb26650a3c5cf363c"

But when generating a lock file using just a local path dependency, we do not get any source or checksum. While we could add on a source and a checksum after the fact, we can't really assume that the dependency will end up on crates.io. But we also cannot publish a crate that doesn't specify where to find its dependencies.

So it seems that packaging and publishing are strongly coupled, and there's no way around packaging+publishing one crate at a time.

Unless we introduced something like source = same-as-this-one.

Any ideas?

@torhovland
Copy link
Contributor

A possible solution is to let cargo package have the same --index option that cargo publish has for indicating which registry to use. So you'll use that to indicate which registry you intend to publish to.

@torhovland
Copy link
Contributor

Assuming we could use the --index option suggested above, we still need to figure out how to modify the generated lock files to include a source and a checksum. Ideally, we would like to modify the new_resolve in build_lock() before it gets serialised. But a Resolve seems quite immutable.

We could of course try to modify the lock file after serialisation, but that doesn't seem very nice. The alternative is to modify the resolver code so it can treat a crate as if it was pulled from a registry.

@epage
Copy link
Contributor

epage commented May 15, 2024

We have two problem steps in cargo package --workspace

  • Generating the lockfile for each .crate: path dependencies won't be in the registry yet, so this will fail.
    • Above they talked about forcing path dependencies to be used during the lockfile generation and then patching up the lockfile afterwards with the correct source and checksum
    • In theory, we could do some kind of decoration of crates.io to allow this in-memory...
    • Regardless of the method we use to workaround that, to generate the source, it seems like we need to know where the .crate files are intended for so we'd need --index / --registry flags
  • Verify step for each .crate: the decompressed .crate files will need to depend on each other, rather than their registry, to build
    • We could do in-memory patches
    • Maybe there are alternatives that build on what we did for the lockfile?

@Eh2406 or @arlosi any thoughts or tips on this?

This was referenced May 17, 2024
bors added a commit that referenced this issue Jun 11, 2024
Add local registry overlays

This PR adds (private to cargo internals) support for local registry overlays, in which you can locally pretend to add packages to remote registries; the local packages will have the same source ids as the remote registry that you're overlaying.

There are two ways to set up these overlays: programmatically using `GlobalContext::local_overlays` and through the `__CARGO_TEST_PACKAGE_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS` environment variable. You can't set up these overlays with `.cargo/config`.

The motivation for this is [packaging workspaces](#10948). When we're packing a workspace, we'd like to be able to pretend (for lockfile generation and verification) that some workspace packages are already published even though they aren't.
@epage
Copy link
Contributor

epage commented Jul 26, 2024

#13947 adds support for this in nightly behind -Zpackage-workspace.

There is a compatibility issue to decide on for stabilization:

This PR introduces a case where cargo package will fail where it did not before: if you have a workspace containing two packages, main and dep@0.1.0, where main depends on dep@0.1.0 and dep@0.1.0 is already published in crates.io then attempting to package the whole workspace will fail. To be specific, it will package dep@0.1.0 successfully and then fail when trying to package main because it will see the two different packages for dep@0.1.0. Note that cargo publish will already fail in this scenario.

This shouldn't interfere with crates.io running cargo package for each package to diff the .crate files

  • This might interfere if someone tried to verify their "published" MSRV by running cargo package.

The failure could be avoided by changing the local overlay source to not error out if there's a duplicate package; see here. However, failing early has the advantage of catching errors early.

@bors bors closed this as completed in b5d44db Jul 26, 2024
@epage epage reopened this Jul 26, 2024
@epage
Copy link
Contributor

epage commented Jul 26, 2024

Re-opening to track stabilization

@weihanglo
Copy link
Member

This seems to be a relatively large feature in terms of lines of code. Should we have an unstable doc section under https://doc.rust-lang.org/nightly/cargo/reference/unstable.html for it?

@epage epage added the Z-package-workspace Nightly: package-workspace label Sep 5, 2024
bors added a commit that referenced this issue Sep 5, 2024
Document -Zpackage-workspace

Adds some unstable documentation on the `-Zpackage-workspace` feature, as requested in [#10948](#10948 (comment)). This documentation assumes that #14433 gets merged.
bors added a commit that referenced this issue Sep 6, 2024
Publish workspace

Adds support for simultaneously publishing multiple (possibly inter-dependent) packages in a workspace, gated by the `-Zpackage-workspace` flag.

Questions to be worked out through stabilization:
- Are we ok stabilizing this and #10948 at the same time?  Currently, they are behind the same flag
- What is the desired behavior for the publish timeout? This PR uploads the crates in batches (depending on the dependency graph), and we only timeout if nothing in the batch is available within the timeout, deferring the rest to the next wait-for-publish. So for example, if you have packages `a`, `b`, `c` then we'll wait up to 60 seconds and if only `a` and `b` were ready in that time, we'll then wait another 60 seconds for `c`.
- What is the desired behavior when some packages in a workspace have `publish = false`? This PR raises an error whenever any of the selected packages has `publish = false`, so it will error on `cargo publish --workspace` in a workspace with an unpublishable package. An alternative interface would implicitly exclude unpublishable packages in this case, but still error out if you explicitly select an unpublishable package with `-p package-name` (see #14356). This PR's behavior is the most conservative one as it can change from an error to implicit excludes later.

This is part of #1169
@epage
Copy link
Contributor

epage commented Oct 7, 2024

While implementing support for this in cargo-release, I realized I might need to still publish individual packages myself if

  • packages have multiple registries to publish to
  • packages have different platform targets to verify on

For --verify, I'm only doing it if all packages have it enabled (default).

Features are another messy case but no different than the rest of Cargo and I'm foisting that onto my users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-workspaces Area: workspaces C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review Z-package-workspace Nightly: package-workspace
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants