cmd/go: allow proxies to supply only some modules #26334
Comments
Update after slack discussion with @myitcv and @zeebo:
Another concern that was brought up is that with |
The proxy can redirect vgo to another URL as long as that URL implements the download protocol. That's because the net/http client handles redirect status codes internally (up to 10 times by default). The big question here is: can the GOPROXY tell vgo (during a build) to go use a VCS source instead of the proxy itself? That is a little different from a simple redirect to another URL that implements the download protocol. |
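To illustrate the redirect point above, here is a minimal sketch of the default net/http client following redirects and giving up on a long chain. The hopTransport type, the proxy.example host, and the /hop/N paths are hypothetical stand-ins for a proxy; they are not part of the download protocol.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
)

// hopTransport is a hypothetical in-memory stand-in for a proxy.
// A request for /hop/N is redirected to /hop/N-1; /hop/0 answers 200.
type hopTransport struct{}

func (hopTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	n, err := strconv.Atoi(strings.TrimPrefix(req.URL.Path, "/hop/"))
	if err != nil {
		return nil, fmt.Errorf("unexpected path %q", req.URL.Path)
	}
	resp := &http.Response{
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("v1.0.0\n")),
		Request:    req,
	}
	if n > 0 {
		resp.StatusCode = http.StatusFound
		resp.Header.Set("Location", "/hop/"+strconv.Itoa(n-1))
	}
	return resp, nil
}

// fetch starts at /hop/<hops> and lets the client's default redirect
// policy decide how far to follow. It returns the final status code.
func fetch(hops int) (int, error) {
	client := &http.Client{Transport: hopTransport{}}
	resp, err := client.Get("http://proxy.example/hop/" + strconv.Itoa(hops))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	fmt.Println(fetch(3))  // short chain: followed transparently
	fmt.Println(fetch(20)) // long chain: the default policy gives up with an error
}
```

The caller never sees the intermediate 302 responses, which is why a redirect to another download-protocol URL needs no support in vgo itself.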
@rsc How do you envision the flow of the Proxy telling Vgo to switch to VCS? I'm trying to follow the vgo code and it seems that on an initial build, the initial contact with the proxy's download protocol can be any of these endpoints: This means that the proxy would need to return a consistent code (maybe a 404) on each of these endpoints to signal vgo to reconstruct the Vgo can potentially always hit Another solution, is that the Download Protocol could implement a For efficiency, vgo can potentially send the entire list of modules it wants to probe to the proxy. I haven't fully understood the last 2 days worth of changes to vgo so apologies if I'm a bit off. |
I don't think it makes sense for a proxy to tell the go command "go to this VCS instead". We're trying to migrate to proxy by default and while VCS will probably always be with us, I'd rather not mix the two. I do think it would probably be OK to let GOPROXY be a preference list and to also allow some setting like GOPROXY=direct as an explicit name for what the default behavior is. So you could say GOPROXY=https://myproxy/,direct and just let myproxy return a 404 for the things it doesn't know about. Then the proxy isn't in charge of the actual redirect; it's only in charge of "it's not me". |
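A rough sketch of how a value such as GOPROXY=https://myproxy/,direct could be read as an ordered preference list. The proxyEntry type and parseProxyList function are names invented for this illustration; they are not taken from cmd/go.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyEntry is one element of a comma-separated GOPROXY preference list.
// (Hypothetical type, for illustration only.)
type proxyEntry struct {
	URL    string // proxy base URL; empty when Direct is true
	Direct bool   // true for the keyword "direct", meaning fetch from VCS
}

// parseProxyList splits a GOPROXY value into entries, preserving order.
// Empty fields are skipped.
func parseProxyList(goproxy string) []proxyEntry {
	var entries []proxyEntry
	for _, field := range strings.Split(goproxy, ",") {
		field = strings.TrimSpace(field)
		switch field {
		case "":
			continue
		case "direct":
			entries = append(entries, proxyEntry{Direct: true})
		default:
			entries = append(entries, proxyEntry{URL: strings.TrimSuffix(field, "/")})
		}
	}
	return entries
}

func main() {
	for i, e := range parseProxyList("https://myproxy/,direct") {
		fmt.Printf("%d: %+v\n", i, e)
	}
}
```

The order of the returned slice is the order in which sources would be tried; each proxy only has to answer "it's not me" with a 404.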
@rsc that's about what I had in mind. The proxy shouldn't say "go to this vcs", it can just say "I don't have this module" Does that mean the I've altered the Go code if you'd like to look at a reference of what I'm suggesting: marwan-at-work@3767be8 |
@marwan-at-work I'd rather not have newProxyRepo make any network calls. It turns out to be important to delay those as long as possible. I'd rather have the existing GET paths return 404s and then have the methods be able to return some kind of recognizable "not found error" (maybe satisfying os.IsNotExist is enough) and then something at a higher level will try the next repo method down the list. I think we should wait until Go 1.12 regardless. |
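One way the "recognizable not found error" idea might look, as a sketch only: a 404 is turned into an error that satisfies os.IsNotExist, while every other non-200 status stays an ordinary error. The statusToError and isNotFound helpers are hypothetical names, not functions from cmd/go.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// statusToError maps an HTTP status from a proxy GET to an error.
// A 404 becomes an error that os.IsNotExist recognizes, so a caller
// higher up can decide to try the next source. (Hypothetical helper.)
func statusToError(url string, status int) error {
	switch status {
	case http.StatusOK:
		return nil
	case http.StatusNotFound:
		return &os.PathError{Op: "GET", Path: url, Err: os.ErrNotExist}
	default:
		return fmt.Errorf("GET %s: unexpected status %d", url, status)
	}
}

// isNotFound reports whether err means "this proxy does not have it".
func isNotFound(err error) bool { return os.IsNotExist(err) }

func main() {
	const url = "https://myproxy/example.com/a/@v/list"
	for _, status := range []int{200, 404, 500} {
		err := statusToError(url, status)
		fmt.Printf("%d: err=%v notFound=%v\n", status, err, isNotFound(err))
	}
}
```

Because the classification happens only when a response arrives, nothing here forces a network call at construction time.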
@rsc That makes sense since cmd/go checks the cache before making network calls. I'm happy to take on this task if you'd like me to as I'll try to make it work for Athens in the near future. Either way, I'm happy to know the Proxy can dynamically delegate modules fetching back to Go. Thanks! |
@rsc I have another pass at making this work. This time, we won't hit the network until necessary, and we won't need a The idea is that a *proxyRepo can take an alternative Repo interface that it can switch to in case of 404 (or other future codes). Similar to how *cachingRepo works. Feel free to take a look if you get the chance marwan-at-work@5117c8c I see other ways of doing this, such as having a top level Repo interface that accepts a slice of Repos and just tries one at a time in order: (cache, proxy, vcs, etc). So my solution above may still not be how you'd like to solve this problem, but I would love to hear your thoughts. Thanks :) |
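The "slice of Repos tried in order" alternative can be sketched roughly as follows. The Repo interface here is reduced to a single method for brevity and is an assumption of this example; mapRepo, brokenRepo, and chainRepo are likewise invented names.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// Repo is a deliberately reduced, hypothetical module source.
type Repo interface {
	Versions(modulePath string) ([]string, error)
}

// mapRepo is an in-memory Repo used only for this illustration.
type mapRepo map[string][]string

func (m mapRepo) Versions(modulePath string) ([]string, error) {
	if vs, ok := m[modulePath]; ok {
		return vs, nil
	}
	return nil, fmt.Errorf("%s: %w", modulePath, fs.ErrNotExist)
}

// brokenRepo simulates a source failing for a reason other than "not found".
type brokenRepo struct{}

func (brokenRepo) Versions(string) ([]string, error) {
	return nil, errors.New("proxy unavailable")
}

// chainRepo tries each Repo in order. It moves on only when a Repo
// reports "not found"; any other error stops the lookup immediately.
type chainRepo []Repo

func (c chainRepo) Versions(modulePath string) ([]string, error) {
	var err error = fs.ErrNotExist // result for an empty chain
	for _, r := range c {
		var vs []string
		vs, err = r.Versions(modulePath)
		if err == nil {
			return vs, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return nil, err
		}
	}
	return nil, err
}

// isNotExist reports whether err is a "not found" result.
func isNotExist(err error) bool { return errors.Is(err, fs.ErrNotExist) }

func main() {
	proxy := mapRepo{"example.com/a": {"v1.0.0"}}
	vcs := mapRepo{"example.com/b": {"v1.0.0", "v1.1.0"}}
	chain := chainRepo{proxy, vcs}
	fmt.Println(chain.Versions("example.com/b")) // found in the second source
	fmt.Println(chain.Versions("example.com/c")) // not found anywhere
	fmt.Println(chainRepo{brokenRepo{}, vcs}.Versions("example.com/b"))
}
```

Stopping on any error other than "not found" mirrors the concern raised later in the thread about blindly falling through to the next source.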
Since this issue discusses a change to GOPROXY to allow the list of proxies or direct method - A slightly different case I am thinking of is when the proxy server I use by default is temporarily unreachable or unavailable (possibly in the middle of fetching all dependencies) and even we can't get But @bcmills raised a concern about leaking private package paths in the event of the private proxy failure if we just fall back to the next (direct) blindly. |
@hyangah I'm concerned with blindly falling back to git for two reasons:
|
This change makes it so that a Go Modules proxy can return a 404 http status code which would signal cmd/go to end up fetching the module from VCS instead. Note, any other status besides 404 and 200 will still fail. Fixes golang#26334 Change-Id: Idcb0409d7fa52c8dc2601aec7f8a4274550afe3a
@marwan-at-work code freeze date is today. do you plan to mail in the change for review as described in the contribution guideline? https://golang.org/doc/contribute.html#sending_a_change_github |
Change https://golang.org/cl/147177 mentions this issue: |
Speaking with @hyangah yesterday, I'm concerned about the approach discussed here. The scenario we discussed offline yesterday was when a company wants to provide a proxy that serves private modules. In this scenario the company should run an internal (intermediary) proxy that contains the information about the private modules. This proxy should be configured to have an upstream proxy that it uses for public requests. If the company wants to have high availability it can run multiple internal proxies. The client should only be configured to use the internal proxy(ies). The company can also provide a whitelist / blacklist for the internal proxy for what upstream packages are allowed/disallowed. |
Thanks! |
I am a little confused about all this discussion. I thought we were going to do, for GOPROXY=proxy1,proxy2,proxy3:
It's an ordered list, not a parallel lookup. By saying GOPROXY=proxy1,proxy2,proxy3 you are directing the go command to send every import path to proxy1. If you should be splitting half your traffic to proxy1 and half to proxy2 and can't send the proxy2 paths to proxy1, then yes, you need a new proxy0 to split the traffic. But that is (1) fine and (2) not the envisioned use case. The envisioned use case is some company has their own modules on an internal static file server that can pretend to be a Go proxy (because we made static file servers able to do that), and people use GOPROXY=,proxy1. Or another use case is people preferring GoCenter, but that's an incomplete module mirror, so it needs to be backed by a fallback, like "direct" or a more complete mirror. Again, if you care about not leaking paths for proxy2 to proxy1, you wouldn't do this. But there are other cases where you would. Does anyone object to implementing the above semantics for GOPROXY=proxy1,proxy2,proxy3? If so, please explain why. Thanks. |
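As a sketch of the "static file server that can pretend to be a Go proxy" use case: a plain file server answers 200 for files it has and 404 for everything else, which is exactly the signal an ordered GOPROXY list would need in order to move on. The module path corp.example/lib and the in-memory file tree are assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// newStaticProxy serves a tiny in-memory file tree laid out like the
// download protocol paths. A real deployment would serve a directory
// on disk instead. (Hypothetical module path and contents.)
func newStaticProxy() http.Handler {
	files := fstest.MapFS{
		"corp.example/lib/@v/list": {Data: []byte("v1.0.0\n")},
	}
	return http.FileServer(http.FS(files))
}

// status performs an in-process GET against the handler and returns
// the HTTP status code, without opening a network socket.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	proxy := newStaticProxy()
	fmt.Println(status(proxy, "/corp.example/lib/@v/list"))      // module the server has
	fmt.Println(status(proxy, "/github.com/pkg/errors/@v/list")) // module it does not have
}
```

No proxy-specific logic is needed on the server side; the missing file is the "it's not me" answer.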
@rsc I'm not sure the above discussions were about splitting traffic or concurrently pinging the "proxy list". AFAIK, it was about whether we want to have GOPROXY be able to provide multiple URLs or provide only one highly available URL that takes care of proxying to other proxies if it needs to. I'm happy either way, but the CL from above does what you suggested |
What would be the best work around at the moment? |
@RohitRox one work around is to have that proxy |
@marwan-at-work That will be a chore for developers :| |
I've found this to be a big problem today when trying to set up a GOPROXY at my company. There are many instances where there may be a mix of public and private dependencies. The problem we have is that not all of the repos on our internal GitHub instance can be made publicly accessible. If a project has any dependency that is "private", we cannot use GOPROXY at all. |
Tools like Athens and JFrog Artifactory can be used to store private Go modules and in addition, GoCenter can be used to fetch public Go modules. |
@jorng, for Go 1.13 we expect to add a GONOPROXY environment variable that will let you set GOPROXY to a public proxy but avoid the proxy for modules matching a given pattern. |
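A simplified sketch of how pattern-based proxy bypass could work, assuming the patterns are a comma-separated list of globs (path.Match syntax) that are matched against leading elements of the module path. The skipProxy function and the example patterns are hypothetical and are not the go command's implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// skipProxy reports whether modulePath matches any glob in patterns.
// Each glob is compared against the module path prefix that has the
// same number of path elements as the glob. (Hypothetical helper.)
func skipProxy(patterns, modulePath string) bool {
	elems := strings.Split(modulePath, "/")
	for _, glob := range strings.Split(patterns, ",") {
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1 // number of elements in the glob
		if len(elems) < n {
			continue // module path is shorter than the pattern
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, err := path.Match(glob, prefix); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	const patterns = "*.corp.example.com,rsc.io/private"
	for _, m := range []string{
		"git.corp.example.com/team/lib",
		"rsc.io/private/quux",
		"rsc.io/public",
		"github.com/pkg/errors",
	} {
		fmt.Printf("%-32s skip proxy: %v\n", m, skipProxy(patterns, m))
	}
}
```

With a setting like this, private module paths would never be sent to the public proxy, which addresses the path-leak concern for mixed public and private dependencies.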
@rsc: That will be very helpful, at least to work around the issue. I think the GOAUTH stuff may be the best option, once implemented. I’m imagining setting up a custom proxy that can handle authentication (perhaps using our internal SSO) and gate access appropriately. |
Change https://golang.org/cl/173441 mentions this issue: |
@jorng Do you have any help with gos? https://github.com/storyicon/gos |
Change https://golang.org/cl/183845 mentions this issue: |
…non-404/410 response for @latest The @latest proxy endpoint is optional. If a proxy returns a 404 for it, and returns an @v/list with no matching versions, then we should allow module lookup to try other module paths. However, if the proxy returns some other error (say, a 403 or 505), then the result of the lookup is ambiguous, and we should report the actual error rather than "no matching versions for query". (This fix was prompted by discussion with Dmitri on CL 183619.) Updates #32715 Updates #26334 Change-Id: I6d510a5ac24d48d9bc5037c3c747ac50695c663f Reviewed-on: https://go-review.googlesource.com/c/go/+/183845 Run-TryBot: Bryan C. Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com>
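The distinction drawn in this commit message can be sketched as a small classification function. This is an illustration under simplifying assumptions, not the code from CL 183845: latestFallbackError and errNoMatchingVersions are invented names, and the function assumes it is called only after @v/list has yielded no matching versions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNoMatchingVersions signals that this module path has nothing to
// offer, so the lookup may go on to try other module paths.
var errNoMatchingVersions = errors.New("no matching versions for query")

// latestFallbackError classifies the status of the optional @latest
// endpoint. Precondition (assumed): @v/list had no matching versions.
func latestFallbackError(latestStatus int) error {
	switch latestStatus {
	case http.StatusOK:
		return nil // @latest answered; use its result
	case http.StatusNotFound, http.StatusGone:
		return errNoMatchingVersions // endpoint absent: not an ambiguous failure
	default:
		// Anything else is ambiguous, so surface the actual error.
		return fmt.Errorf("@latest: unexpected status %d", latestStatus)
	}
}

func main() {
	for _, status := range []int{200, 404, 410, 403} {
		fmt.Printf("%d -> %v\n", status, latestFallbackError(status))
	}
}
```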
Any chance https://golang.org/cl/173441 can be backported into 1.12.x? |
@mikecook The change to the You can get the new behavior by updating to Go 1.13. |
No: there were a lot of interrelated changes in the fetch paths. Besides, we don't generally backport features (only critical bug-fixes, which this was not). |
This is a proposal for extending the vgo download API to add a mechanism to allow proxies to redirect vgo to a VCS. This functionality would be useful if a proxy has only a subset of packages in a go.mod file.
Currently, if someone adds two modules "a" v1.0.0 and "b" v1.0.0 to their go.mod file and they then run GOPROXY=myprox.com vgo install, everything works as expected if the proxy has both modules at the given versions. If not, the command fails.
It would be helpful to allow the proxy to tell vgo to fetch one or both of the modules from the VCS if it doesn't have them in its cache. This would be useful for proxy implementations where the proxy will not/cannot cache the module in its own storage or can but doesn't have that module/version in its cache. The Athens project is a present day use case for the latter - it fills its caches asynchronously.
One implementation possibility is adding a $GOPROXY/a/@v/v1.0.0?go-get=1 network request that expects the same output as already-existing ?go-get=1 requests define. This request could be made before starting the download protocol as it currently exists. The new mechanism would allow the proxy to choose to do one of the following for the given module and version identifier: