This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

gps: fine grained source transitions #1250

Merged
merged 2 commits into from
Feb 24, 2018

Conversation

Collaborator

jmank88 commented Oct 9, 2017

What does this do / why do we need it?

This PR contains a set of changes primarily motivated by making sourceGateway state transitions more fine grained and special cases explicit, so that the persistent cache can be fully leveraged when integrated. From the changelog:

Reduce network access by trusting local source information and only pulling from upstream when necessary

What should your reviewer look out for in this PR?

Should a note be added to the CHANGELOG? Did this.

Which issue(s) does this PR fix?

Toward #431

Fixes #415

return 0, err
}
}
//TODO(jmank88) broadcast sg.src.upstreamURL() changes here
Collaborator Author

Perhaps now is the time to address this, but I don't have a simple solution. Remapping the URLs would catch many cases, but it seems like there is always a worst case scenario where two of (or sets of) sources/gateways/caches have to be merged after both being in use. However, I may be considering a more general case than is necessary. This partially depends on the possible URLs available from maybeSources, and whether a source's upstreamURL is always from this set (meaning we could map the possibilities ahead of time).

Member

i think our best solution is likely to be in changing the scopes of responsibility, here. something like the following: we create some new indirection layer that basically acts like a persistent sourceGateway factory: it takes a maybeSources and occupies all the corresponding URL slots in the sourceCoordinator's URL lookup map, and spits out a sourceGateway on request.

now, the sourceCoordinator is only responsible for mapping to families of related URLs, and the sourceGatewayFactory (or whatever) is doing the final layer of mapping work. it's likely that we'll need to rework some of sourceCoordinator.getSourceGatewayFor() to accommodate this, possibly including the way that sourceCoordinator.srcs and sourceCoordinator.nameToURL actually work.

that expands the scope of this PR significantly, though, and my gut currently says we'll be able to defer it until a later PR. for now, we'll probably be able to get away with just having a hard failure here where we might otherwise be able to recover by going back to other, possible upstream URLs.
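
A minimal sketch of the indirection layer described above, assuming a family of candidate URLs can be enumerated up front. The names `gateway`, `gatewayFactory`, and `gatewayFor` are hypothetical and are not part of gps. The sketch does not handle the harder case raised earlier in this thread, where two gateways that are already in use turn out to belong to the same family and would have to be merged.

```go
package main

import "sync"

// gateway is a hypothetical stand-in for gps's sourceGateway.
type gateway struct {
	urls []string // every candidate upstream URL claimed by this family
}

// gatewayFactory is a hypothetical indirection layer: it claims every
// candidate URL of a family and hands back one shared gateway for all of them.
type gatewayFactory struct {
	mu    sync.Mutex
	byURL map[string]*gateway
}

func newGatewayFactory() *gatewayFactory {
	return &gatewayFactory{byURL: make(map[string]*gateway)}
}

// gatewayFor returns the gateway that already owns any of the candidate
// URLs, or creates a new one. Candidates not yet claimed are registered
// under the returned gateway.
func (f *gatewayFactory) gatewayFor(candidates []string) *gateway {
	f.mu.Lock()
	defer f.mu.Unlock()

	var g *gateway
	for _, u := range candidates {
		if found, ok := f.byURL[u]; ok {
			g = found
			break
		}
	}
	if g == nil {
		g = &gateway{}
	}
	for _, u := range candidates {
		if _, ok := f.byURL[u]; !ok {
			f.byURL[u] = g
			g.urls = append(g.urls, u)
		}
	}
	return g
}
```

With this shape, a lookup by the ssh URL and a later lookup by the https URL of the same project resolve to the same gateway, so a change of upstream URL within the family no longer needs a remap.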

Collaborator Author

for now, we'll probably be able to get away with just having a hard failure here where we might otherwise be able to recover by going back to other, possible upstream URLs.

FWIW I've been running with a panic on any url change and it hasn't happened yet (mostly ran tests and cockroachdb). A regular error is probably sufficient though.

Collaborator Author

it takes a maybeSources and occupies all the corresponding URL slots in the sourceCoordinator's URL lookup map, and spits out a sourceGateway on request.

Can we be certain about all of the possible URLs ahead of time? e.g. gopkgin code currently returns one url from maybeSource, but a different aliasURL from gopkginSource. That one would be trivial to change, and the others look pretty standard and static. For some reason I had it in my head that it would be possible to ping upstream and be forwarded to another remote or something. If that is not the case, then we may be able to get away with the existing data structures (with more mapping) and with some mechanism for merging additional maybeSources into a gateway.

Collaborator Author

FWIW I've been running with a panic on any url change and it hasn't happened yet (mostly ran tests and cockroachdb).

I take that back. Now I'm seeing test failures on CI like: upstream source url changed from "ssh://git@github.com/sdboyer/gpkt" to "https://github.com/sdboyer/gpkt" which seems like something we need to handle now.

Collaborator Author

This has been fixed.

existsCallsListVersions() bool
// listVersionsRequiresLocal returns true if calling listVersions first
// requires the source to exist locally.
listVersionsRequiresLocal() bool
Collaborator Author

These comments became out of sync. Any better name/doc suggestions before fixing?

Member

these seem fine for now. there's a refactoring in store for these, anyway, i think.

Collaborator Author

jmank88 commented Oct 10, 2017

@sdboyer This is green now. Let me know what you think of the gateway mapping change, and whether I should squash any or all of the commits.

@jmank88 jmank88 changed the title gps: fine grained source transitions [WIP] gps: fine grained source transitions Oct 11, 2017
Collaborator Author

jmank88 commented Oct 11, 2017

Don't merge yet.

Tagging as WIP since I'm getting real-world failures, even though tests are passing.

Collaborator Author

jmank88 commented Oct 12, 2017

After refactoring PackageTree caching, this is stable again and should be green soon.

@jmank88 jmank88 closed this Oct 15, 2017
@jmank88 jmank88 reopened this Oct 15, 2017
Collaborator Author

jmank88 commented Oct 15, 2017

Rebased.

@jmank88 jmank88 changed the title [WIP] gps: fine grained source transitions gps: fine grained source transitions Oct 17, 2017
Member

sdboyer left a comment

holy crap, there's a lot in here 🎉

by and large, this looks like great progress. i've got a couple of more specific questions, but i'd ideally like to get this in ASAP so that it can percolate as long as possible before next release.

@@ -0,0 +1,34 @@
// Copyright 2017 The Go Authors. All rights reserved.
Member

i'm a little uneasy about giving this its own file. i appreciate that it doesn't really fit where it was, but...well, we already have a bunch of antipatterns with errors, and i think it's becoming increasingly harmful.

no change to make here, just making a note of it.

@@ -432,8 +476,20 @@ func TestSourceCreationCounts(t *testing.T) {
ProjectIdentifier{ProjectRoot: ProjectRoot("github.com/sdboyer/gpkt"), Source: "https://github.com/sdboyeR/gpkt"},
mkPI("github.com/sdboyeR/gpkt"),
},
namecount: 6,
srccount: 1,
names: []string{
Member

sometimes i feel like you just follow my code around, tidying up icky tests ❤️

Collaborator Author

Heh, well this one had to change. Actually the name should probably be changed.

- } else if srcg == srcg4 {
- t.Error("explicit http source should create a new src")
+ } else if srcg != srcg4 {
+ t.Error("explicit http source should reuse autodetected https source")
Member

hang on, this feels like a significant departure from existing logic. is this suggesting that http should fold in with https?

what about doing this is worth breaking the simpler rules of the existing model?

Collaborator Author

What exactly are those rules? Since the map was keyed on both name and url, I thought it was in line with existing behavior to fold them all together. Was sharing a map only valid since the url was static?

Member

so the "rule" here - no, not expressed anywhere, because i have, as we know, not been great at articulating these really low-level invariants anywhere, 💯 my fault - is that if an explicit request is made for e.g. an http protocol access pattern for a URL, then we have to respect it, and that's how we access. it's only in the default inference case, where no protocol is provided (a bare import path) where we can "pick" from the various possibilities, and where the specific one used may vary from one run to the next (but, once selected, is fixed within the duration of any given SourceMgr's lifetime).

this might seem to push us in a direction that i was considering (and i think we discussed somewhere?), where the sourceGateway effectively becomes more of a factory/intermediate layer that hangs on to the entire maybeSources set and multiplexes to children that are responsible for each individual URL, depending on which particular input pattern comes in.

(if we're thinking about that, one crucial reminder about the maybeSources model is that e.g. bitbucket abstracts maybeSources to allow for a project to be either git or hg - so we're encompassing in there the idea that ).
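
The rule stated above (an explicit scheme must be respected, while a bare import path may be resolved to any of several candidates) can be sketched as follows. This is an illustration only: `candidateURLs` and `defaultSchemes` are hypothetical names, the scheme order is an assumption, and real deduction also handles details such as the `git@` user in ssh URLs.

```go
package main

import "strings"

// defaultSchemes is an assumed preference order, for illustration only.
var defaultSchemes = []string{"https", "ssh", "git", "http"}

// candidateURLs returns the upstream URLs that may be tried for input.
// An input with an explicit scheme yields exactly that URL; a bare
// import path yields one candidate per default scheme.
func candidateURLs(input string) []string {
	if strings.Contains(input, "://") {
		// Explicit request: the caller's access pattern must be respected.
		return []string{input}
	}
	urls := make([]string, 0, len(defaultSchemes))
	for _, scheme := range defaultSchemes {
		urls = append(urls, scheme+"://"+input)
	}
	return urls
}
```

Under this sketch, `http://github.com/sdboyer/gpkt` never falls back to https, whereas `github.com/sdboyer/gpkt` may end up on any of the listed schemes.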

@@ -268,43 +258,38 @@ func newSourceGateway(maybe maybeSource, superv *supervisor, cachedir string) *s
return sg
}

func (sg *sourceGateway) addMaybeSources(mb maybeSources) {
sg.maybeMu.Lock()
sg.maybe = append(sg.maybe, mb...)
Member

so we're conjoining gateways and caches for multiple maybeSources, then - only one ever active per SourceMgr? interesting. seems like that can work.

Member

the big question i have (without having checked for myself) - is there a scenario where the user can get stuck if a repo exists on local disk for one scheme, but the credentials they'd previously configured to access that scheme no longer work/are missing? in such a case, we should ideally continue moving through the maybeSource options - is that where this PR moves us?

Collaborator Author

They should not get stuck. In that case, whenever sourceExistsUpstream is required and fails, another source would be chosen.
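
A rough sketch of that fallback behavior, under the assumption that each candidate exposes some upstream check. The `candidate` type, its `exists` field, `pickSource`, and `errNoCredentials` are hypothetical and only stand in for the real maybeSource machinery.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoCredentials is a hypothetical failure used for illustration.
var errNoCredentials = errors.New("credentials rejected")

// candidate is a hypothetical stand-in for one maybeSource.
type candidate struct {
	url    string
	exists func() error // assumed upstream check; nil error means reachable
}

// pickSource walks the candidates in order and returns the first URL whose
// upstream check succeeds. If none succeeds, it returns every error seen.
func pickSource(cands []candidate) (string, []error) {
	var errs []error
	for _, c := range cands {
		if err := c.exists(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", c.url, err))
			continue
		}
		return c.url, nil
	}
	return "", errs
}
```

If the ssh candidate fails because credentials are missing, the loop simply moves on to the next candidate instead of leaving the user stuck.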

}

- func (sg *sourceGateway) existsUpstream(ctx context.Context) bool {
+ func (sg *sourceGateway) existsUpstream(ctx context.Context) error {
Member

hmm...i think i may want us to more broadly refactor these. i think (not quite sure yet) that this PR is taking us into territory where local can diverge from upstream. hand-in-hand with that is being a lot more careful about being able to inquire about current state without having the SourceMgr change it.

Collaborator Author

jmank88 Oct 25, 2017

Well we have to allow divergence from upstream at some level to benefit from caching, but I hope I haven't done that unintentionally. Much of the language unfortunately becomes fuzzy. For example, when does 'exists' mean now and when does it mean within the cache window? I tried to be careful about these, and to document the sourceStates in a helpful way.

Member

cool, that sounds reasonable for now, at least. but we're going to need to articulate this very, very clearly (i think i need to sketch a fuller FSM) if we have any hope of keeping track of this in a clear, understandable way.

// otherwise it returns a slice of all errors. If force is true, then each
// maybeSource may make a second attempt after removing the cache directory. testFn
// is an optional check which may access src and return additional sourceState.
func (sg *sourceGateway) setUpSource(ctx context.Context, force bool, testFn func(context.Context) (sourceState, error)) (sourceState, errorSlice) {
Member

if we're going to re-expose these methods, we need to be diligent about not calling them from the outside. there was a rather huge mess back before i refactored to use the FSM, and much of it arose from mixing locks with intra-source object calls.

in fact, i'd probably be more comfortable if we separated these out by hiding the ones we expect SourceMgr itself to call (which should be mutex-protected) directly into an interface, and leaving the remainder unexposed.

ImportRoot: string(pr),
Packages: make(map[string]pkgtree.PackageOrErr),
}
// Return a copy, with full import paths.
Member

Use PackageTree.Copy() here instead of re-rolling what's already there - even if it means another iteration pass over the packages.

Or, implement a new CopyWithRoot() func that does the same as here, then reimplement Copy() in terms of that.
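
One possible shape for that second suggestion, sketched with simplified stand-in types rather than the real pkgtree ones. It assumes map keys are import paths prefixed by `ImportRoot`; `copyWithRoot` and `clone` are hypothetical names, and errors are shared rather than copied here.

```go
package main

import "strings"

// Simplified stand-ins for the pkgtree types; field sets are assumptions.
type pkg struct {
	ImportPath string
	Imports    []string
}

type packageOrErr struct {
	P   pkg
	Err error
}

type packageTree struct {
	ImportRoot string
	Packages   map[string]packageOrErr
}

// copyWithRoot returns a deep copy of t whose import paths are re-rooted
// under root. Err values are carried over as-is.
func (t packageTree) copyWithRoot(root string) packageTree {
	out := packageTree{
		ImportRoot: root,
		Packages:   make(map[string]packageOrErr, len(t.Packages)),
	}
	for path, poe := range t.Packages {
		newPath := root + strings.TrimPrefix(path, t.ImportRoot)
		cp := packageOrErr{Err: poe.Err}
		if poe.Err == nil {
			cp.P = pkg{
				ImportPath: newPath,
				Imports:    append([]string(nil), poe.P.Imports...),
			}
		}
		out.Packages[newPath] = cp
	}
	return out
}

// clone is the plain copy, expressed in terms of copyWithRoot.
func (t packageTree) clone() packageTree {
	return t.copyWithRoot(t.ImportRoot)
}
```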


// Run test twice so that we cover both the existing and non-existing case;
// only difference in results is the initial setup state.
t.Run("empty", do(sourceIsSetUp|sourceExistsUpstream|sourceHasLatestVersionList))
Member

haven't fully grokked changes to tests here yet, but i do look at the loss of these and worry that we're losing some of the meager set of state-changing cases that we do manage to cover now. no?

Collaborator Author

Looking back at this, it is still valuable to cover existing and non-existing cases (though the states will be the same). I'll restore that. Is there something else that is lost?

Member

i don't think so, that's all that i was introducing with the second pass.

@jmank88 jmank88 changed the title gps: fine grained source transitions WIP: gps: fine grained source transitions Nov 8, 2017
@jmank88 jmank88 changed the title WIP: gps: fine grained source transitions gps: fine grained source transitions Dec 31, 2017
Collaborator Author

jmank88 commented Dec 31, 2017

Rebased and ready for review again.

Collaborator Author

jmank88 commented Jan 24, 2018

Rebased and infinitely queued on OSX build.

gps: source coord: set-up sources before returning gateways

gps: source cache: improve PackageTree ProjectRoot handling
@jmank88 jmank88 closed this Jan 31, 2018
@jmank88 jmank88 reopened this Jan 31, 2018
Collaborator Author

jmank88 commented Jan 31, 2018

Green again.

strcount = strcount + len(poe.P.Imports) + len(poe.P.TestImports)
}
pool := make([]string, strcount)

- for path, poe := range t.Packages {
+ for path, poe := range p {
var poe2 PackageOrErr

if poe.Err != nil {
Member

Why are we dropping all this? It's necessary to safely copy the various possible Err types.

Collaborator Author

It was broken. The new TestPackageTreeCopy fails with the old logic - a build.NoGoError's Dir field is not copied.

Why do we need to copy them? Wouldn't it be a mistake to mutate an error anyways?

Collaborator Author

Alternatively, we could call .Error() and make a new error from the string. This is how the cache persists them.
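
A small sketch of that alternative, assuming only the message needs to survive the copy. `copyErr` is a hypothetical helper, and `noGoError` is a local stand-in for an error type with a mutable field such as build.NoGoError's Dir. The concrete type is lost, so type assertions on the copy will no longer match.

```go
package main

import "errors"

// noGoError is a local stand-in for an error type with a mutable field.
type noGoError struct {
	Dir string
}

func (e *noGoError) Error() string {
	return "no buildable Go source files in " + e.Dir
}

// copyErr returns an independent error carrying only err's message.
// The original concrete type is intentionally discarded.
func copyErr(err error) error {
	if err == nil {
		return nil
	}
	return errors.New(err.Error())
}
```

The copy is detached from the original: later changes to the original's fields do not alter the copied message.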

gps/source.go Outdated
url = toFold(url)
src, st, err := m.try(ctx, sc.cachedir, sc.supervisor)
if err == nil {
srcGate = newSourceGateway(st, src, sc.supervisor, sc.cachedir)
Collaborator Author

We no longer defer trying the source. Instead we try up front, and the gateway gets a single fixed source.

There is more work while holding sc.srcmut, but I haven't noticed a slowdown.

@@ -104,7 +104,7 @@ type pathDeducer interface {
// So the return of the above string would be
// "github.com/some-user/some-package"
deduceRoot(string) (string, error)
- deduceSource(string, *url.URL) (maybeSource, error)
+ deduceSource(string, *url.URL) (maybeSources, error)
Collaborator Author

Using concrete maybeSources everywhere was more significant in a previous iteration, but is still an overall simplifying change - barring some test boilerplate.

@@ -21,34 +22,50 @@ import (
type sourceState int32

const (
sourceIsSetUp sourceState = 1 << iota
sourceExistsUpstream
// sourceExistsUpstream means the chosen source was verified upstream, during this execution.
Collaborator Author

sourceIsSetUp goes away, since it's implied by the existence of the gateway.
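
For readers unfamiliar with the pattern, here is a minimal sketch of a bit-flag state type like the one in this hunk, with the set-up flag dropped. `sourceExistsUpstream` and `sourceHasLatestVersionList` appear in this PR; `sourceExistsLocally` and the `has` helper are assumptions added for illustration, and the actual set and order of flags may differ.

```go
package main

// sourceState is a set of bit flags describing what is known about a source.
type sourceState int32

const (
	// sourceExistsUpstream means the chosen source was verified upstream,
	// during this execution.
	sourceExistsUpstream sourceState = 1 << iota
	// sourceExistsLocally is an assumed flag name, for illustration only.
	sourceExistsLocally
	// sourceHasLatestVersionList is taken from the tests in this PR.
	sourceHasLatestVersionList
)

// has reports whether every flag in want is set in s (hypothetical helper).
func (s sourceState) has(want sourceState) bool {
	return s&want == want
}
```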

// treat it as a new not-yet-cloned repo.

// TODO(marwan-at-work): warn/give progress of the above comment.
os.RemoveAll(path)
Collaborator Author

This was essentially inlined into each maybeSource, which allows them to directly create their respective vcs.Repos, rather than switching on the type, at the cost of a bit of duplication. IMHO it is clearer, especially when trying to unravel the chain of similarly named types we create.

panic(fmt.Sprintf("Unrecognized format: %v", s))
}
// ensureCleaner is an optional extension of ctxRepo.
type ensureCleaner interface {
Collaborator Author

Generalized from git specific, and used in baseVCSSource.updateLocal()
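
The optional-extension pattern mentioned here can be sketched with a type assertion. Everything below is simplified and hypothetical: the method sets, the absence of a context parameter, and the `recordingRepo` and `gitLikeRepo` types are assumptions, not the actual ctxRepo or baseVCSSource code.

```go
package main

// repo is a simplified stand-in for ctxRepo.
type repo interface {
	update() error
}

// ensureCleaner is an optional extension of repo; the method signature is assumed.
type ensureCleaner interface {
	ensureClean() error
}

// updateLocal cleans the working copy first when the repo supports it,
// then updates. Repos without ensureClean skip straight to the update.
func updateLocal(r repo) error {
	if ec, ok := r.(ensureCleaner); ok {
		if err := ec.ensureClean(); err != nil {
			return err
		}
	}
	return r.update()
}

// recordingRepo records calls so the call order can be inspected.
type recordingRepo struct {
	calls []string
}

func (r *recordingRepo) update() error {
	r.calls = append(r.calls, "update")
	return nil
}

// gitLikeRepo additionally implements the optional ensureCleaner extension.
type gitLikeRepo struct {
	recordingRepo
}

func (r *gitLikeRepo) ensureClean() error {
	r.calls = append(r.calls, "ensureClean")
	return nil
}
```

Because the check is a type assertion on the interface value, no switch on concrete VCS types is needed, and any repo type can opt in later by adding the method.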

@@ -52,73 +48,6 @@ func TestErrs(t *testing.T) {
}
}

func TestNewCtxRepoHappyPath(t *testing.T) {
Collaborator Author

Perhaps these tests should be revived and adapted to the maybeSources.

Collaborator Author

jmank88 commented Feb 21, 2018

primarily motivated by making sourceGateway state transitions more fine grained and special cases explicit, so that the persistent cache can be fully leveraged when integrated.

I wasn't very clear in the initial post, but I said it better in the changelog:

Reduce network access by trusting local source information and only pulling from upstream when necessary

Basically, by adding the persistent cache, what was previously an optimization by going to the network sooner might undermine the cache entirely in situations where it's no longer necessary to eventually go to the network.

Collaborator Author

jmank88 commented Feb 22, 2018

@sdboyer Updated with a version of the changes discussed last night. Restoring those ctxRepo tests adapted for a maybeSource was going to be messy with the extra work that the try() methods were doing, so rather than adding the pings to each try, I generalized and moved the repeated checkLocal/ping logic into newSourceGateway. This separates some responsibilities more (try doesn't need a supervisor and becomes simpler; sourceGateway handles its own sourceState), and it allows us to leverage the proper sourceGateway.sourceExistsUpstream() method which handles the 'ls-remote instead of ping' style optimizations.

Collaborator Author

jmank88 commented Feb 22, 2018

Here is a clean preview of the next step: https://github.com/jmank88/dep/compare/source_opt...jmank88:opt_cache

Member

sdboyer left a comment

This looks good. Onward and upward!

@sdboyer sdboyer merged commit 338675d into golang:master Feb 24, 2018
Development

Successfully merging this pull request may close these issues.

Fix performance regression - network calls too early
4 participants