Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Centralize upstream sync infrastructure to the go-infra respository #9

Merged
merged 2 commits into from
Nov 4, 2021

Conversation

dagood
Copy link
Member

@dagood dagood commented Nov 2, 2021

This PR:

  • Moves code from microsoft/go to here:
  • Changes the code to reuse existing code from the Docker auto-update infrastructure.
    • Note that auto-update adds new commits to an existing PR, if it exists. Upstream sync uses a force push. (This is not necessarily a problem: these processes have different requirements.) This causes some of the remaining duplication.
  • Creates eng/sync-config.json to control the sync process.
  • Adds multi-repo capabilities, so a single schedule and build machine can sync both forks.
  • Adds the ability to push the PR branch either to the target repo or to the user's fork of the repo, depending on configuration. (Previously it assumed a fork.) This makes it work better with CI that expects the branch to be pushed to the microsoft GitHub org.

The first commit copies over the cmd/sync/sync.go file. The second commit does everything else. I recommend reviewing the second commit to see what changed with the switch to a centralized script that reuses existing code.

Example build: https://dev.azure.com/dnceng/internal/_build/results?buildId=1449982. It submitted PRs that looked fine and passed CI:

On the Docker auto-update side, I ran it on my repo multiple times to make sure it still behaves the same and updates existing PRs properly:

@dagood dagood force-pushed the dev/dagood/central-sync branch 2 times, most recently from 81f39b7 to 62117f4 Compare November 2, 2021 16:10
@dagood dagood marked this pull request as ready for review November 2, 2021 16:17
@dagood dagood requested a review from a team November 2, 2021 16:30
@dagood
Copy link
Member Author

dagood commented Nov 2, 2021

@microsoft/golang-compiler PTAL, this will start us syncing the Docker repo from upstream automatically.

buildmodel/commands.go Outdated Show resolved Hide resolved
@dagood dagood force-pushed the dev/dagood/central-sync branch from 62117f4 to d1758c9 Compare November 2, 2021 19:19
Copy link
Contributor

@qmuntal qmuntal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • An empty eng/go.mod has been added to the repo, is it necessary? If it is, add at least the module name to it so it becomes a valid mod file.
  • I would replace raw panic calls with log.Panic so the error message is logged in the cmd output. Using log.Fatal is also tempting, but deferred functions won't be called as it exists the program directly so the temporary directories won't be cleaned up.

// sortable path-friendly name to use as the name. This allows a command to run multiple times in
// sequence without overwriting or deleting the old data, for diagnostic purposes. This is intended
// as a dev convenience, not to enable running in parallel.
func GetTempPathInDir(rootDir string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functions could be simplified by using os.MkdirTemp.

func GetTempPathInDir(rootDir string) (string, error) {
  pathDate := time.Now().Format("2006-01-02_15-04-05.000")
  return os.MkdirTemp(rootDir, fmt.Sprintf("%s_*", pathDate))
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes it safer overall, too, since it actually guarantees it won't pick the same directory. Can also remove some of the timestamp's precision since it's only useful for manually poking through these directories at that point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think to test this change in a clean environment, and it ended up failing in the pipeline:

https://dev.azure.com/dnceng/internal/_build/results?buildId=1455046&view=logs&j=4a0cec82-aaa7-5c07-9c1e-30cc88945a51&t=d35104de-7d40-5aa7-a628-491fd52b294b&l=13

stat /home/vsts/work/1/s/eng/artifacts/sync-upstream-temp-repo: no such file or directory

I think it's because MkdirTemp actually tries to create the directory, but at this point, the parent directory doesn't exist yet. It used to be that git init {temp path} was the first place the result of this func gets used, and git handles creating parent directories if necessary. MkdirTemp doesn't seem to.

My initial plan to fix it is to try to find a simple way to create any necessary chain of parent directories in Go. Working on it....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd/sync/sync.go Outdated Show resolved Hide resolved
buildmodel/commands.go Outdated Show resolved Hide resolved
@dagood
Copy link
Member Author

dagood commented Nov 3, 2021

  • An empty eng/go.mod has been added to the repo, is it necessary? If it is, add at least the module name to it so it becomes a valid mod file.

Yeah, sorry, the reasoning is in the commit message:

Place empty go.mod in eng to ignore artifacts dir. The empty go.mod file makes "go build ./..." and similar commands ignore files in eng. This can be useful while developing changes to the "sync" utility, because it clones a temporary Go repository into eng/artifacts.

Maybe adding a comment would work for discoverability--upstream just recommends the empty file so I went with that:
golang/go#30058 (comment)
(Linked by a golang member, FWIW: golang/go#30058 (comment))

  • I would replace raw panic calls with log.Panic so the error message is logged in the cmd output.

Interesting, I'll have to try that. The errors have been showing up in the output for me, so will be interesting to see how this is different. I guess that if the panic is recovered, this makes the error show up anyway?

@qmuntal
Copy link
Contributor

qmuntal commented Nov 3, 2021

Yeah, sorry, the reasoning is in the commit message:

Place empty go.mod in eng to ignore artifacts dir. The empty go.mod file makes "go build ./..." and similar commands ignore files in eng. This can be useful while developing changes to the "sync" utility, because it clones a temporary Go repository into eng/artifacts.

Maybe adding a comment would work for discoverability--upstream just recommends the empty file so I went with that:
golang/go#30058 (comment)
(Linked by a golang member, FWIW: golang/go#30058 (comment))

Yep, a comment explaining the reasoning would be great.

Interesting, I'll have to try that. The errors have been showing up in the output for me, so will be interesting to see how this is different. I guess that if the panic is recovered, this makes the error show up anyway?

You are right, the error is printed before throwing a panic. In addition, the error message will be logged to the output specified by log.SetOutput -defaults to stderr- and any flag and prefix configured will also by applied. Probably not relevant for this repo, but a good practice anyway 😄.

Add template for microsoft-golang-bot "git config" and Go toolset init. Update to 1.17.2.

Place empty go.mod in eng to ignore artifacts dir.

Use "os.MkdirTemp" for "GetWorkPathInDir" func.

Use "log.Panic" instead of "panic".
@dagood dagood force-pushed the dev/dagood/central-sync branch from d1758c9 to 10543eb Compare November 3, 2021 22:46
@dagood dagood requested a review from qmuntal November 3, 2021 22:46
Copy link
Contributor

@qmuntal qmuntal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing!

@dagood dagood merged commit 68061e8 into microsoft:main Nov 4, 2021
@dagood dagood deleted the dev/dagood/central-sync branch November 4, 2021 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants