-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: replace keyring dependency with generic pip-defined credential helper API #10389
Comments
Sounds like a good thing to me. Pip historically tend to expose too much dependencies' internals to the user, and they generally become a problem down the road (the most significant is at the moment our dependence on pkg_resources and distutils). However, keyring being not an essential part of the packaging mechanism, it is more difficult to gather maintainer interest for this (keyring support was contributed by its author). Would you be interested in working on this? I would be happy to help review the design and implementation. |
One constraint I would note is that we've traditionally been very explicit in not supporting people writing code that uses pip's internal API. From a brief scan of the proposal here, it seems to do a good job of limiting the interface, and I much prefer an approach like this than either the current "magical behaviour change" if keyring is installed, or the other proposals for command line options. So I'm basically in favour of this, as long as care is taken to make sure it doesn't set a precedent that people can write code against pip's internals. |
@uranusjr @pfmoore I've raised an initial draft PR with just the changes for auth.py. I haven't wired it through to actually pull the mapping from pip.conf yet. Been trying to piece together how that actually works. On that note, would something like this be reasonable? I'm not sure of the best way to do it given the constraints of the pip.conf format.
|
Would it make sense to use Entry Points as discovery mechanism for credential helpers ? This implies delegating the configuration mechanism to the helpers themselves. |
I think keeping the configuration of the credentials helper in pip is cleaner, since it allows arbitrary executables to be the helpers (i.e. allowing non-Python based implementations), doesn't require them to manage duplicated configuration stuff, and lets us the dispatch to different credential managers based on the domain. |
a small python wrapper could call non-python implementations ?
fair
credential managers could accept the domain as argument, and pip could try them all until one knows about the domain ? OTOH returning a user and password is only one possible authentication mechanism (people have been asking for ways to inject http headers etc). So the authentication and possibly in the future the http session mechanisms are bound to grow in complexity, which may or may not make them difficult to handle via pip's configuration. |
... on the other hand, if we use an entry point, someone could write a helper (in Python) that did all of that config management, delegation to arbitrary executables, dispatching based on domain, etc. That saves us having to get into the business of designing credential management schemes, and lets interested parties come up with their own mechanisms. I'd expect "standard" helpers to get developed and published, so this wouldn't require everyone to write their own. (Basically, I'm arguing that we have the bare minimum in pip, and let others do the work 😉) |
The problem (I think) with supporting alternative authentication means (e.g. adding headers) is we’ll basically need to expose the entire session interface, which is another internal thing we don’t really want to rely on. If we want this to be as pluggable as possible, pip would need to provide an abstraction layer over |
I don't entirely understand what an entry point is, but it kind of feels like it would be the same situation where it gets used just because it is on the system, and not because the user actually wants it. To me the advantage of having it be explicitly part of the pip config is that the user has full control over what pip does. I can't speak to the need for arbitrary HTTP headers since it isn't a part of my specific use case, but we could have the return value from the helper be a dict rather than just the username/password tuple directly. That would for example allow it to return a |
This sounds very inefficient, and a painful thing for users to debug when it's misbehaving.
While I agree with this principle in general, I think we shouldn't do that here. This has many similarities to PEP 516 vs PEP 517 IMO -- have-everyone-solve-the-same-problem for a simpler core, or a solve-the-problem-in-one-place for simplicity everywhere else. I prefer the latter, and that's also the route we took with PEP 517. :) The problem we have at hand is exactly the same as what docker has (minus the fact that we need to move away from keyring). The exact problem at hand has already been solved by the thing we're modelling against and, honestly, I don't think we're going to be making major usability or maintainability breakthroughs over what they have. This change would also give us a clear point to make this explicitly opt-in as well, which is nice. :) IMO, one of the nice things that we can do here is piggy back on the ecosystem of credential helpers that have been built for that -- instead of coming up with our own protocol here, I think we can basically reuse docker's protocol for this. Allowing downstream users to do something like |
Is this the same as the git credential manager protocol? Do you have a link to a spec? This is the basis of the git one. While I don't want to over-engineer too much here, if this is a general issue, should we go the extra step and standardise? After all, we try to make a point of saying that pip uses standards-based solutions so that we avoid the problem of having so much implementation-defined behaviour that nobody will ever stand a chance of writing a pip replacement. It could be as simple as "Use the docker (git) protocol [add a link to the spec here], using the name (BTW, if standardising is "too complicated" for something this simple, I'd argue that means our process is bad, not that this doesn't need to be standardised!! In which case let's fix our process...) |
Based on a quick reading, it looks like the protocols are different. I think what we should use for the prefix is python if we’re gonna go through standardisation. @pfmoore If I write the PEP for this, would you be willing to sponsor it? |
No, it seems docker and git are slightly different. git sends a bunch of key-value pairs on stdin, and expects a bunch of key-value pairs on stdout in response. docker provides only the hostname on stdin, and expects a JSON object on stdout in response. See https://docs.docker.com/engine/reference/commandline/login/#credential-helper-protocol (specifically the part about the
I feel "python" might be too vague/generic for this. To me this has to do with private PyPI (or at least PyPI-like) repos specfically. In any case, if the prefix is just "python", where would the configuration file mapping hostnames to credential helpers (i.e., the equivalent of ~/.docker/config.json) go? I imagine we wouldn't use pip.conf for that. |
Yep, certainly 🙂 |
Thanks for the link. I agree, this protocol looks nice and simple, and suitable for our needs. One small thought - is it stable? If we tie ourselves to the Docker protocol, could tracking changes to it become a burden?
I'm not sure why we even need a prefix. If we simply have an entry in In fact, if we only allow a single credential helper to be configured, and state that it follows the Docker protocol, we may not even need a PyPA standard here. We'd only need a standard if we wanted to configure the name of the helper globally, and doing that now is likely premature (unless we know of other projects that would want access to the information - poetry? pipenv?) Edit: Of course, we might want to make a PyPA policy that this is how projects handle credential management. But the PyPA doesn't really include policy-setting in its remit, so 🤷 |
Technically you do not. Docker does it more as a convention thing. For example, if you configure it to use "foobar" as the credential helper, it will actually run "docker-credential-foobar". But there is no reason I can think of that pip could not choose to instead say that you have to specify the whole command name and there won't be any prefixing "magic".
Obviously I cannot speak on behalf of those project maintainers, but as a consumer it would certainly be preferable that all the PyPI stuff work consistently. (That's actually one of the main disadvantages of keyring today - as far as I can tell pipenv doesn't use it so we end up needing various one off solutions.) You could certainly allow only one helper, which in turn maybe outsources to other helpers based on the specific hostname. At worst, you could always change pip to allow for the user to directly configure helpers per hostname in the future, with this singleton helper serving as a catch-all. (This is basically the concept of docker's credential store vs. the credential helpers.) |
I've taken a swing at a mechanism for an override credential provider: 0205e2e |
As pointed out in the ticket description,
This has lead to issue #11721 being opened which I tried to address through two different PRs: #12473 and #12496 (I was not aware of this issue at the time). In the end, the discussion on #12496 came to the conclusion that the way pip currently manages credentials is overly complicated and based on heuristics that are broken for some users and that an overhaul was needed to properly solve it. This proposal of delegating credentials management to an external helper could in my understanding also solve #11721. However, the discussion in this issue was focused on the technical aspect so far and didn't touch the question of the backward compatibility / deprecation cycle. The way I'm imagining it is that the current Unfortunately I cannot really commit on any timeframe for trying to implement such a |
Would there be any consideration for using an existing credential helper specification, rather than rolling a custom one? The one I've used most is EngFlow's as it's used by Bazel (design doc), which itself was inspired by the docker credential helpers mentioned in the OP. |
Currently pip directly imports keyring in order to use it for authentication. Furthermore, to avoid a performance hit in the majority case, it will only consult keyring if the original request fails with 401. This has caused a few issues:
I would like to propose that the direct dependency on keyring be removed, and replaced with a more generic concept, akin to docker's credential helper concept. https://docs.docker.com/engine/reference/commandline/login/#credentials-store
The basic idea is that the pip configuration be able to specify a credential helper library for a given PyPI hostname. If there is no entry for a hostname, then requests are sent without any authentication. This allows the majority case of talking to public PyPI to continue to work as it does today with no performance hit. If a hostname does have an entry, then it is expected to map to a library that exposes a
get_pip_credentials(hostname)
function, which takes the download hostname and returns the username and password to use. The user just needs to arrange for this library to be importable, so for example they could write a custom library and add it to theirPYTHONPATH
.For keyring (excluding macos), a very simple adapter around the keyring API could be created, and ideally could even be part of keyring itself. For macos, I can make my own shim to pull the keyring username from some environment variable of my choosing. For CI/CD, a simple credential helper could be made to pull the username and password from pre-canned environment variables. (Since this logic lives entirely outside pip, the issue of multiple index urls mentioned here is not relevant.)
If a user doesn't want to use keyring, they simply would not specify it in their pip configuration. That means that by default, keyring would never be used at all. But it also means that users that do want to use it are not required to pass some flag to pip all the time.
If the credential helper API is standardized, then it can be shared with other dependency management tools such as pipenv, instead of the situation today where the two have completely different approaches. (For reference, most docker-like tools, such as skopeo and buildkit, will abide by your credential helpers, thus providing a consistent experience.)
The text was updated successfully, but these errors were encountered: