-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A visitor API for exploring a package's features and their dependencies #281
base: main
Are you sure you want to change the base?
Conversation
a2ded76
to
161d86c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'm unfortunately not a fan of this API. I find it very confusingly structured and documented, and it isn't clear to me that the flexibility a visitor offers compared to other approaches is valuable for many use cases.
I can imagine two modes where feature information is useful:
- Which things are directly enabled by which features? This is analogous to "what's written in the Cargo.toml next to each feature."
- What are all the things enabled by each feature? This is analogous to "if I enable feature X, what do I actually get."
Obviously, a visitor can cover both modes and everything in between. But it isn't clear to me that there is any use case that lies in between. So I can't justify the complexity of either maintaining this API, nor of users learning how to use it.
For my 2 cents, I'd prefer an API that offered functionality for each of the two modes, and was designed to be simple and easy to understand.
src/visit.rs
Outdated
/// Visits a missing dependency. | ||
/// | ||
/// Return `Ok(())` to continue the walk, or `Err(…)` to abort it. | ||
fn visit_missing_dependency(&mut self, dep_name: &str) -> Result<(), Self::Error>; | ||
|
||
/// Visits a missing package. | ||
/// | ||
/// Return `Ok(())` to continue the walk, or `Err(…)` to abort it. | ||
fn visit_missing_package(&mut self, pkg_name: &str) -> Result<(), Self::Error>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about the terminology of missing dependency
and missing package
. In what sense are they missing, given that we seem to know their names based on the function signatures?
While all the trait items are documented, it would be good to frame their documentation in terms of "what does the user need to understand to choose a good implementation here." Right now the documentation seems to be written from the perspective of "what does the code driving the visitor need to know to drive it properly", which isn't that helpful for end-users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation was mostly copied from the corresponding APIs on cargo
itself, but the names used in the implementation slightly drifted afterwards. I've cleaned things up and also added more information. 👍
The need for visit_missing_dependency(…)
and visit_missing_package(…)
is twofold:
-
The
Metadata
only containspackages
for crates that were part of the active crate graph, i.e. crates that either were workspace members, non-optional dependencies or optional dependencies that where enabled via corresponding features. So if you walk the feature graph from a feature that would enable a dependency but that dependency wasn't also activated viaMetadataCommand.features(…)
, then lookup of the correspondingPackage
onMetadata
will fail. But we need thePackage
to look up its features for the recursion. (I'd expectvisit_missing_dependency(…)
to be unnecessary for validMetadata
instances and haven't seen it fail myself so far.) -
The
Metadata
might have been deserialized from an unchecked JSON blob with packages/dependencies missing. I'll leave deciding on whether or not we want to gracefully handle such situations to @oli-obk, but wanted to avoid pestering the draft implementation with.unwrap()
s.
These failure modes are also partly what made me switch to a visitor API.
I had initially implemented a single method on Metadata
but the implementation very quickly became unwieldy and inflexible:
fn transitive_feature_dependencies<'a>(&'a self, package: &'a Package, feature_name: &str) -> Result<BTreeMap<&'a Package, BTreeSet<FeatureValue>>> {
// ...
}
(When I initially added said method on Metadata
itself I stumbled upon the issue of expecting it to return -> Result<…>
, which in the crate is an alias for -> Result<…, cargo_metadata::Error>
, with the latter being an error type that so far is 100% concerned with I/O-layer errors. Adding such new logic-layer errors to the same error type felt wrong to me for some reason. Mostly because I'd expect 99% of users to only want to read a manifest file. Those would now have to figure out how to properly deal with such additional, yet unreachable error cases. Returning a different error type from these new APIs however would be confusing as well, as Result<…>
implies that it covers all errors emitted by the crate. But maybe this just hints at the errors needing some work to be done as well, to make things fit well. But this feels to me like it needs more of a "broader vision" to get right and feels somewhat out of scope for this PR.)
But whether or not this method should fail on missing packages depends on the use-case, I think.
With a visitor this decision is totally up to you, without bloating the crate's API with options.
Similarly one may want to be able to pick between different collection modes: recursive (i.e. transitive) and non-recursive (i.e. immediate) feature dependencies. Again, a visitor API leaves this up to the user, without bloating the crate's API with options.
src/visit.rs
Outdated
/// Visits a feature that's enabling a dependency with `dep:dep_name` syntax. | ||
/// | ||
/// Return `Ok(<walk>)` where `<walk>` indicates whether or not to walk the feature's downstream dependencies, | ||
/// or `Err(…)` to abort the walk. | ||
fn visit_dep(&mut self, package: &Package, dep_name: &str) -> Result<bool, Self::Error> { | ||
let (..) = (package, dep_name); | ||
Ok(true) | ||
} | ||
|
||
/// Visits a feature that's enabling a feature on a dependency with `crate_name/feat_name` syntax. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these doc comments say "visits a feature ..." but they both seem to take package: &Package
and nothing resembling a feature. I'm confused about how this interface works =/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The three visitor methods correspond to the variants of the FeatureValue
type (which was copied from cargo
itself). Every time the walker passes such a value in the feature graph it calls the corresponding method on the visitor that it carries along.
The variants of said type unfortunately have rather confusing names already: Feature
, Dep
, DepFeature
, especially when considering that FeatureValue
is itself the value of a "feature dependency" (i.e. a feature depended on by another feature). Finding proper names for these was … a challenge. One that I decided to mostly gloss over for now. #bike-shedding.
Preferably the FeatureValue
enum would have new-type variants, rather than struct variants, so that each variant's type could be passed to the corresponding visitor method and also so that the corresponding type of feature value could be handled as a single value. But! By promoting each variant to individual types each of them would have to have their own Serialize
/Deserialize
/FromStr
implementations. An explosion of complexity and change that I wanted to avoid if possible. That said I do think that the end goal should be a new-type based enum.
I'd 100% agree if such a Visitor API would be the only API provided. Going with a visitor API doesn't have to mean "difficult to use". Instead it allows you to neatly compartmentalize the logic in a "walker" and avoid diluting the API of Also speaking from an implementation point of view: migrating to a visitor made the implementation of the feature graph traversal logic much simpler and cleaner. And by being driven from outside having a visitor also makes the implementation less likely to accumulate all kinds of intricate control logic (to covered different use cases, etc) in the future. |
Pulling your comment out of the thread for broader visibility:
I don't find this argument convincing, personally. Leaving more things up to the user is not always a good choice.
I think what's missing is a simple way to answer the most common questions:
Something a user could easily find, quickly understand, and immediately use in their own code. A visitor API in my book accomplishes none of that. I'm not the |
And I wouldn't expect you to. For the 80% I'd expect convenient methods to exist. The for other 20% a visitor API is preferable. One such use case being mine (the one you reminded me of, when you brought up your use case): I'd need to be able to selectively block certain downstream features from (it and its dependencies) being collected. With a method that only returns the immediate dependencies I'd be out of luck as I would have to write the recursive traversal logic myself, which is arguably more difficult (to do right) than implementing a visitor. And a one-off method for transitive feature dependencies wouldn't be of much use either. |
I also feel like the visitor is overkill, but until we get generators, I can also see how an iterator based version may be annoying to implement. I'd be fine adding this under an |
An exploration for how
Metadata
could provide a flexible API for processing a package's feature dependency graph.Resolves #279