Alternative registries #4506
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @matklad (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
Thanks! I think this should be able to copy a lot of the `tests/registry.rs` implementation and perhaps parameterize some more of `tests/cargotest/support/registry` to get tests up and running as well?
src/cargo/core/source.rs
Outdated
@@ -87,6 +87,8 @@ enum Kind {
    Registry,
    /// represents a local filesystem-based registry
    LocalRegistry,
    /// represent an alternate registry
    AlternateRegistry(String),
I wonder, could this be folded into the `Registry` type above? We could then either store `Option<String>` or I'd be fine having a global notion of "default registry" which is defaulted to something like `crates-io`.
It prints out a slightly different message, indicating that it's an alternative registry (e.g. when you update the index). But that could also be covered by `Option<String>`.
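A minimal sketch of the folded representation discussed above. The `Kind` here is a simplified stand-in, not Cargo's actual type, and `registry_name` is a hypothetical helper added only to show how the differing message could still be derived from an `Option<String>`.

```rust
/// Simplified stand-in for Cargo's `Kind`; only the registry variants are modelled.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Kind {
    /// A remote registry. `None` means the default registry;
    /// `Some(key)` names an entry configured by the user.
    Registry(Option<String>),
    /// A local filesystem-based registry.
    LocalRegistry,
}

impl Kind {
    /// Hypothetical helper: the short name to print for this registry, if any.
    fn registry_name(&self) -> Option<&str> {
        match self {
            Kind::Registry(Some(key)) => Some(key.as_str()),
            // Assumption: the default registry is referred to as `crates-io`.
            Kind::Registry(None) => Some("crates-io"),
            Kind::LocalRegistry => None,
        }
    }
}
```

With this shape the derived `Hash` and `PartialEq` treat two registries with different keys as different kinds, which may or may not be what the rest of the code wants.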
src/cargo/core/source.rs
Outdated
@@ -223,9 +225,39 @@ impl SourceId {
        Ok(SourceId::for_registry(&url))
    }

    pub fn alt_registry(config: &Config, key: &str) -> CargoResult<SourceId> {
        let registries = config.get_table("registries")?;
Due to various bugs in the implementation today, using `get_table` will end up bypassing the environment variable fallbacks for fetching keys. Perhaps this could attempt `get_string` directly (or a similar method) to ensure that config-via-env-var works?
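To illustrate why a per-key lookup can honor environment variables where a whole-table lookup cannot, here is a rough sketch. The `Config` struct, its fields, and `env_key` are all hypothetical simplifications; the only thing borrowed from Cargo is the naming convention of uppercasing the key, replacing `.` and `-` with `_`, and prefixing `CARGO_`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified config: file values plus a snapshot of the environment.
struct Config {
    file: HashMap<String, String>,
    env: HashMap<String, String>,
}

impl Config {
    /// Turns `registries.foo.index` into `CARGO_REGISTRIES_FOO_INDEX`.
    fn env_key(key: &str) -> String {
        let upper: String = key
            .chars()
            .map(|c| match c {
                '.' | '-' => '_',
                other => other.to_ascii_uppercase(),
            })
            .collect();
        format!("CARGO_{}", upper)
    }

    /// Looks up one fully qualified key, preferring the environment over the file.
    fn get_string(&self, key: &str) -> Option<&str> {
        self.env
            .get(&Self::env_key(key))
            .or_else(|| self.file.get(key))
            .map(String::as_str)
    }
}
```

Because the full key is known at lookup time, the matching environment variable name can be computed; a table lookup would have to enumerate the environment to discover keys that exist only there.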
src/cargo/core/source.rs
Outdated
        match registries.as_ref().and_then(|registries| registries.val.get(key)) {
            Some(registry) => {
                let index = match *registry {
                    CV::String(ref s, _) => s,
It looks like this is supporting both:
[registries]
foo = "https://path/to/index"
and
[registries.foo]
index = "https://path/to/index"
Should we perhaps canonicalize on just one? (or was it a nice-to-have to support both?)
I did this because it's similar to how dependencies work (with the `version` key), in case we ever allow optional additional fields in registries.
Indeed yeah, although that was primarily done for ergonomics (as you're basically always doing that). In this case though, maybe we could just canonicalize on the latter to start out with and extend it in the future if necessary?
I don't have a strong opinion. :)
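A small sketch of what canonicalizing on the table form could look like. `ConfigValue` and `registry_index` are hypothetical and much simpler than Cargo's own config types; the error wording is invented for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified config value; Cargo's real `ConfigValue` carries more.
#[derive(Debug)]
enum ConfigValue {
    String(String),
    Table(HashMap<String, ConfigValue>),
}

/// Extracts the index URL for registry `key`, accepting only the
/// `[registries.<key>] index = "..."` form.
fn registry_index<'a>(key: &str, value: &'a ConfigValue) -> Result<&'a str, String> {
    match value {
        ConfigValue::Table(fields) => match fields.get("index") {
            Some(ConfigValue::String(url)) => Ok(url.as_str()),
            Some(_) => Err(format!("`registries.{}.index` must be a string", key)),
            None => Err(format!("registry `{}` has no `index` key", key)),
        },
        // The shorthand `foo = "https://..."` is rejected for now.
        ConfigValue::String(_) => Err(format!(
            "registry `{}` must be a table with an `index` key",
            key
        )),
    }
}
```

Starting with only the table form leaves room to accept the string shorthand later without breaking existing configs.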
9e2fd53 to 32e930c
Hm, should this also add some restrictions for `cargo publish`? We want to forbid dependencies from crates.io packages to packages from other registries :)
src/cargo/core/source.rs
Outdated
        if let Kind::AlternateRegistry(ref key) = self.inner.kind {
            format!("alternate registry `{}` ({})", self.url(), key)
        } else {
            format!("registry `{}`", self.url())
🚲🏠 : name "alternate registry" does not sound like the best one to present to the user. I can imagine that in a corporate environment "alternative" registry is thought of as a primary one.
Perhaps the messages could look like this?
Updating crates.io registry
Updating registry https://my-company.com/cargo-registry
Hm, now that I've written it down, perhaps we even should use symbolic names for registries for user-facing messages, and not URLs?
Yeah it seems reasonable to me to basically universally refer to registries through their "short identifier", e.g. `crates-io` or whatever's in `.cargo/config`.
I agree with identifying them using the symbolic name the user has given it, rather than the index URL.
Don't have a strong opinion about how/if to distinguish crates.io from other registries.
I'm going to go with:
registry `crates-io`
registry `foobar` (https://github.com/myco/foobar-index)
That is, hardcode non alt registry sources to crates-io & show the URL of other registries as well as the key.
However, currently we say:
registry `https://github.com/rust-lang/crates.io-index`
I think giving crates-io instead of the index URL is a nicer UX though.
Actually, we show the registry URL because it could be a local path. Gonna keep showing it in all cases, but call the main registry `crates-io`.
> I'm going to go with:

Looks great!

> I think giving crates-io instead of the index URL is a nicer UX though.

Yes!

> Actually, we show the registry URL because it could be a local path. Gonna keep showing it in all cases, but call the main registry crates-io.

That may just be an artifact of the tests, but I'm fine either way.
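For reference, the message format the thread converges on (symbolic name first, URL shown in all cases, default registry named `crates-io`) might be sketched like this. `describe_registry` is a hypothetical helper and the exact punctuation is an assumption, not what the PR necessarily emits.

```rust
/// Hypothetical helper: formats a registry for user-facing messages.
/// `key` is the name from `.cargo/config`; `None` stands for the default registry.
fn describe_registry(key: Option<&str>, url: &str) -> String {
    // Assumption: the default registry is always presented as `crates-io`.
    let name = key.unwrap_or("crates-io");
    format!("registry `{}` ({})", name, url)
}
```

Keeping the URL in the output still helps when the "URL" is really a local path, which is the case raised above.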
@withoutboats i wanted to try this out but it's not building :( about to try and fix... https://travis-ci.org/rust-lang/cargo/jobs/277089689#L772 does this need to use the new cargo unstable features stuff? we said we would in the RFC....
I think i fixed the build, we'll see :)
@carols10cents thanks I did a really lazy rebase
I'm able to use a crate from another registry!!!!!!! Here's my alternate registry (and also crate source): https://gitlab.com/carols10cents/my-alt-registry That project uses one crate from my alt registry and one crate from crates.io, that worked fine too :) Here are some things I ran into:
I added
Just pushed up some commits:
Still not addressed:
Also, this should probably be feature gated, right?
src/cargo/core/source/source_id.rs
Outdated
@@ -293,7 +275,7 @@ impl SourceId {

    pub fn is_default_registry(&self) -> bool {
        match self.inner.kind {
            Kind::Registry => {}
            Kind::Registry if self.inner.registry_key.is_none() => {}
Should this clause be removed? It seems like it may cause different behavior between:
a = "0.1"
b = { version = "0.1", registry = "crates-io" }
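One way to avoid the divergence between those two dependency forms is to normalize the key before comparing. This is only a sketch: `normalize_registry_key` is hypothetical, and treating the literal name `crates-io` as an alias for "no key" is an assumption drawn from the naming used elsewhere in this thread.

```rust
const DEFAULT_REGISTRY: &str = "crates-io";

/// Maps an optional `registry = "..."` key from a dependency to a canonical
/// form, so an explicit "crates-io" is the same as no key at all.
fn normalize_registry_key(key: Option<&str>) -> Option<&str> {
    match key {
        Some(DEFAULT_REGISTRY) | None => None,
        other => other,
    }
}

/// True for both `a = "0.1"` and `b = { version = "0.1", registry = "crates-io" }`.
fn is_default_registry(key: Option<&str>) -> bool {
    normalize_registry_key(key).is_none()
}
```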
src/cargo/core/source/source_id.rs
Outdated
    kind: Kind,
    // e.g. the exact git revision of the specified branch for a Git Source
    precise: Option<String>,
    registry_key: Option<String>,
This might be best represented as a `Registry(String)` below in the `Kind` type, perhaps?
This is how it was originally represented and I agree it's cleaner, but it required adding a custom impl of Hash and PartialEq for Kind, whereas this structure can piggyback the existing work to make two git refs normalize.
Possibly I'm just piling on the technical debt here (i.e. we should store these git-specific fields in the Kind also), and I could refactor the whole thing a bit instead of doing it more.
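The trade-off can be pictured with a sketch in which the key sits next to the URL and is skipped by hand-written `PartialEq` and `Hash` impls. `SourceIdInner` here is a simplified stand-in with only two fields, and the choice to ignore `registry_key` in comparisons is an assumption about the intent, not a description of Cargo's actual impls.

```rust
use std::hash::{Hash, Hasher};

/// Simplified stand-in for the inner data of a `SourceId`.
#[derive(Debug, Clone)]
struct SourceIdInner {
    url: String,
    /// Assumed to matter only for display, so it is left out of equality and hashing.
    registry_key: Option<String>,
}

impl PartialEq for SourceIdInner {
    fn eq(&self, other: &Self) -> bool {
        self.url == other.url
    }
}

impl Eq for SourceIdInner {}

impl Hash for SourceIdInner {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Must hash exactly the fields that `eq` compares.
        self.url.hash(state);
    }
}
```

If the key lived inside `Kind` instead, the derived impls on `Kind` would start distinguishing sources by key, which is why a custom impl would be needed there.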
@@ -205,7 +205,7 @@ pub fn registry(config: &Config,
        src.update().chain_err(|| {
            format!("failed to update {}", sid)
        })?;
        (src.config()?).unwrap().api
        (src.config()?).unwrap().api.unwrap()
Could this avoid the `unwrap` and return an error instead?
It's guaranteed not to fail (unless you've manually corrupted your index or something) because the src we're accessing is crates.io.
Ah true yeah, but this may help future-proof for custom registries?
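A sketch of the error-returning variant suggested here, using plain `Result<_, String>` rather than Cargo's `CargoResult`. `RegistryConfig` and `api_url` are hypothetical, and the error messages are invented.

```rust
/// Hypothetical, simplified registry configuration as read from an index.
struct RegistryConfig {
    /// API endpoint; a registry may not provide one.
    api: Option<String>,
}

/// Returns the API URL or a descriptive error instead of panicking.
fn api_url(config: Option<RegistryConfig>, registry: &str) -> Result<String, String> {
    let config = config.ok_or_else(|| format!("{} has no configuration", registry))?;
    config
        .api
        .ok_or_else(|| format!("{} does not support API commands", registry))
}
```

For crates.io both options are expected to be present, so the error paths would only be reached by a custom registry or a corrupted index.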
src/cargo/util/toml/mod.rs
Outdated
        let new_source_id = match (details.git.as_ref(), details.path.as_ref()) {
            (Some(git), maybe_path) => {
        let new_source_id = match (details.git.as_ref(), details.path.as_ref(), details.registry.as_ref()) {
            (Some(git), maybe_path, _) => {
I think we'll want to match this here as well, and without backwards compatibility concerns (like the `path` dependency) we should just return an error for both `git = "foo", registry = "bar"` in a dependency specification.
Yes I think this code is also the source of the bad error message @carols10cents mentioned above.
@@ -895,7 +896,7 @@ impl TomlDependency {
                let loc = git.to_url()?;
                SourceId::for_git(&loc, reference)?
            },
            (None, Some(path)) => {
            (None, Some(path), _) => {
Similar to above, I think we should return an error if the registry is specified but a path is also specified
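The two review comments on this match could be combined roughly as follows. `DepSource` and `dep_source` are hypothetical simplifications of the `TomlDependency` logic, and the error text is invented; the one behavior carried over from the diff is that `git` together with `path` is still tolerated.

```rust
/// Which kind of source a dependency resolves to (simplified).
#[derive(Debug, PartialEq)]
enum DepSource {
    Git(String),
    Path(String),
    Registry(Option<String>),
}

/// Picks a source from the `git`, `path` and `registry` keys of a dependency,
/// rejecting `registry` combined with either of the other two.
fn dep_source(
    name: &str,
    git: Option<&str>,
    path: Option<&str>,
    registry: Option<&str>,
) -> Result<DepSource, String> {
    match (git, path, registry) {
        (Some(_), _, Some(_)) => Err(format!(
            "dependency `{}` is ambiguous: only one of `git` or `registry` is allowed",
            name
        )),
        (_, Some(_), Some(_)) => Err(format!(
            "dependency `{}` is ambiguous: only one of `path` or `registry` is allowed",
            name
        )),
        // `git` plus `path` is tolerated, mirroring the existing match arm.
        (Some(git), _, None) => Ok(DepSource::Git(git.to_string())),
        (None, Some(path), None) => Ok(DepSource::Path(path.to_string())),
        (None, None, registry) => Ok(DepSource::Registry(registry.map(str::to_string))),
    }
}
```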
Looking good! Some other thoughts:
It is only used for pretty printing, and we can't extract it from the lock file (right now in the code I believe if we ever pretty print any source from the lockfile it will claim it's from crates-io...). Just doing a reverse lookup from the config presents the problem where someone has created two names for the same URL, and we don't know which to print (that's definitely an edge case though). The 'correct' solution would be to record the registry name for each package to the lockfile, but the lockfile's current syntax isn't really conducive to that.
Hm ok, then I think I'm in favor of "let's pass
Any thoughts around the From a quick scan of the current functionality
@stephanbuys This PR (and the connected RFC) doesn't include support for There's a good chance that changes eventually of course but this is an incremental step, and publish/yank etc are out of scope for this PR.
Ok - got it 😃 In a sense
@stephanbuys In a sense, yes, that would work. But in connection with this PR, we're also documenting/standardizing the index format. To provide the same level of support for publish/yank, we'd need to do the same for the crates.io API. (Obviously if you run a clone of crates.io it would Just Work, but to support people using a different implementation for whatever reason we would need a spec.)
Can confirm that the
☔ The latest upstream changes (presumably #4525) made this pull request unmergeable. Please resolve the merge conflicts.
447a891 to b865c72
In the current state of the PR this issue cited by @carols10cents is the only outstanding issue AFAIK:
Right now, I get that it couldn't find This seems like a hard problem to solve correctly because the fact that these come from alternative registries is not roundtripped to the lockfile. I believe an easy solution like gating on the URL not matching crates.io will break the behavior of people who depend on (pure) mirrors but publish to crates.io today. I am inclined to change the lockfile format to track alt-registry dependencies differently from registry dependencies, but so far I've avoided doing that. Thoughts @alexcrichton?
Alt registry crates with dependencies on crates.io are not working right now, still investigating. Once that works and there's a test for it I think it's ready to merge.
src/cargo/core/source/source_id.rs
Outdated
            }
            None => Err(format!("Required unknown registry source: `{}`", key).into())
        }
        if let Some(index) = config.get_string(&format!("registries.{}.index", key))? {
This may be simplified further with:
let index = match config.get_string() {
Some(i) => i,
None => bail!(...),
};
// ...
Sounds good to me! Could you also be sure to add a test where a package in an alternate registry depends on another package in an alternate registry? (deps within the same registry)
@withoutboats, what is the current status of this PR? Is the issue that you hit above difficult to solve, or you just haven't had time to sort it? If there is anything that I can do to move the PR along then let me know.
@cswindle Sorry for the delay - the issue with depending on crates.io from an alt registry is a bit tricky. Today, the resolver assumes that, for a crate in a registry, all of its dependencies will come from the same registry. How that assumption gets made is not super clear, and the resolver code is a bit complex, so I hadn't found it. You're welcome to take a stab at it if you want! @alexcrichton thought he knew where the issue was, and we were going to pair on it while we were both in Montreal last week but didn't get a chance to. We'll be in Columbus together next week and probably pair on this to get it ready to merge then (Tues or Weds), unless you decide to find the bug first. :-)
@withoutboats, I took a look into the issue and found that it was caused by the fact that in the registry we do not store the index to use, hence it is always resolving the dependency using the current registry. You can find the code changes that I made to fix this here: https://github.com/cswindle/cargo/tree/alt-regs-fixes I have not yet run through the other tests, but the test for the specific scenario passes. Let me know if you have any queries about my changes.
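The root cause described above can be pictured with a short sketch: if an index entry records which index a dependency lives in, the resolver can fall back to the current registry only when nothing is recorded. `IndexDep`, its `registry` field, and `dep_index` are hypothetical names used for illustration; the linked branch may structure this differently.

```rust
/// Simplified dependency entry as it might appear in a registry index.
/// The `registry` field is hypothetical here: the URL of the index the
/// dependency should be resolved against, if it differs from the current one.
struct IndexDep {
    name: String,
    registry: Option<String>,
}

/// Chooses the index URL used to resolve `dep` for a package that was
/// itself found in `current_index`.
fn dep_index<'a>(dep: &'a IndexDep, current_index: &'a str) -> &'a str {
    dep.registry.as_deref().unwrap_or(current_index)
}
```

Without the recorded value, every dependency would resolve against `current_index`, which matches the failure seen when an alt-registry crate depends on crates.io.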
Thanks @cswindle! I'm going to be flying tomorrow (and preparing to fly today) but on Tuesday I'll try to merge your change in and see if we can get this landed. 🎉
I have now added the extra test that @alexcrichton asked for and have verified that all tests pass, so hopefully come Tuesday you will be able to get this in without needing to do too much. Note that at the moment this does not include adding the registry when publishing, but I think that is fine as I can make those changes in #4568 as that relates to allowing publish.
@withoutboats, It looks like we both added the same tests so we should remove the duplication before merging.
@withoutboats, I have merged master into my branch and fixed up the tests.
Is that a genuine failure on the AppVeyor build? It seems odd that it would be specific to the code changes. @withoutboats, can you remove the macro_use in alt-registry.rs? I added it when I was playing around with fixing the build; it is unused and produces a warning.
Nope, see #4659
@bors: r+ Thanks so much @withoutboats and @cswindle! I think this should be able to land now w/ CI fixes and otherwise the code looks great!
📌 Commit 2445497 has been approved by
Alternative registries An implementation of alt registries. Still needs to get tested more thoroughly but seems to work.
☀️ Test successful - status-appveyor, status-travis
Add support for publish to optionally take the index that can be used This forms part of alternative-registries RFC-2141; it allows crates to optionally specify which registries the crate can be published to. @carols10cents, one thing that I am unsure about is if there is a plan for publish to still provide index, or for registry to be provided instead. I thought that your general view was that we should move away from the index file. If we do need to map allowed registries to the index then there will be a small amount of extra work required once #4506 is merged. @withoutboats, happy for this to be merged into your branch if you want, the main reason I did not base it on your branch was due to tests not working on there yet.