Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let the publishing robot publish k8s.io/apimachinery and k8s.io/client-go #1784

Conversation

caesarxuchao
Copy link
Member

@caesarxuchao caesarxuchao commented Feb 3, 2017

TODO (follow-up):

  1. create labels for publish failure issues.
  2. Let the publishing robot publish k8s.io/apimachinery and k8s.io/client-go #1784 (comment)

The robot first publishes k8s.io/apimachinery, then publishes k8s.io/client-go. In the meantime, the publishing robot lets client-go vendor the just published k8s.io/apimachinery.

We need to merge kubernetes/kubernetes#40909 first.

Sample output:
The commits after "CHAO: starting commit of the test" what the robot published. The robot took k8s.io/kubernetes/staging as the source of truth.

apimachinery:
https://github.com/caesarxuchao/apimachinery/commits/master

client-go master:
https://github.com/caesarxuchao/client-go/commits/master

client-go release-2.0:
https://github.com/caesarxuchao/client-go/commits/release-2.0

@lavalamp @sttts @deads2k @mml

I haven't figured out how to write an effective test for the robot.

previousKubeSHA=$(cat kubernetes-sha)
previousBranchSHA=$(cat filter-branch-sha)

# hack...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this? Were the commits still wrong somehow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment why this hack is here and when it can go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove those before merging. Because my test environment was setup at that point, so the hack it needed for my local experiment.

@@ -0,0 +1,90 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there no reasonable way to get the sync script from the repo? In openshift we ran two repos for a long time (still do for some things), but it does make it harder to describe where people can find, modify, and test changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get the script from the repo, but I think it's better to centralize all the publishing logic in test-infra. If a developer is going to change the publishing logic, probably he'll need to update the scripts and other parts of the robot at the same time. The workflow is much easier if all the pieces are in the same repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get the script from the repo, but I think it's better to centralize all the publishing logic in test-infra. If a developer is going to change the publishing logic, probably he'll need to update the scripts and other parts of the robot at the same time. The workflow is much easier if all the pieces are in the same repo.

Thing is, no one but a googler can actually check on this script. That really limits the pool of people who can contribute.

We're slipping pretty far out of date (need a sync pretty badly) so I wouldn't block on it, but I think this will make it harder to make further improvements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the argument. I think what would impede contribution is that the cluster is running in a google provided cluster, not the location of the scripts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree w/ Chao

@@ -46,6 +49,10 @@ func (c coordinate) String() string {
type repoRules struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there someone more familiar with what these pieces do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mml is familiar with this file. @mml could you help? Thanks.

@deads2k
Copy link
Contributor

deads2k commented Feb 3, 2017

I think @sttts moved the last of the genericapiserver packages last night, but we'll have to publish the chain to allow the dependent projects to godep them. Will it mess things up to manually publish again? Do you think this is close enough we can just wait?

git config --global user.email "k8s-publish-robot@users.noreply.github.com"
git config --global user.name "Kubernetes Publisher"

dir=$(mktemp -d "${TMPDIR:-/tmp/}$(basename 0).XXXXXXXXXXXX")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basename $0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

git fetch upstream-kube

currBranch=$(git rev-parse --abbrev-ref HEAD)
previousKubeSHA=$(cat kubernetes-sha)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice annotate the client-go commits with the kube counterpart commit. Not sure this is easy to do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X-Kubernetes-Commit: 9483205802394850293452134234

There is git filter-branch --msg-filter .... for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll integrate this suggestion later. I don't want to spend too much time on this, sorry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow-up is fine.

fi

git branch -D kube-sync || true
git checkout upstream-kube/master -b kube-sync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about releases? 1.6, 1.7....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about releases? 1.6, 1.7....

Env var I'd think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Releases can be set up in the repoRules in publisher.go and can be passed to this script.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Feb 3, 2017
Automatic merge from submit-queue (batch tested with PRs 40862, 40909)

Remove apimachinery from staging client-go/Godeps/Godeps.json

The publishing robot will add the latest version of apimachinery to Godeps.json.

This is part of the effort to allow update staging apimachinery and staging client-go in a same PR.

The robot change is here: kubernetes/test-infra#1784

@deads2k @stts @lavalamp
@deads2k
Copy link
Contributor

deads2k commented Feb 3, 2017

@lavalamp were you familiar with this before? neither sttts nor myself can actually see previous runs or actually merge this.

@sttts
Copy link
Contributor

sttts commented Feb 3, 2017

Was chatting with @caesarxuchao earlier today about seeing the logs. The bot pod runs on some google internal GKE instance. To make the logs visible we could use some GCE cloud storage. Maybe it would be much more elegant though to integrate with Github instead and create an issue if the merge breaks (appending new comments, if it breaks again the next day). @caesarxuchao wants to look into this tomorrow or on monday.

ENV PATH="/usr/local/go/bin:${PATH}"

ENV GOPATH=/
RUN go get github.com/tools/godep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pin to version, maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Pinned to v75.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have v79, v75, v74 pinned "somewhere".

@caesarxuchao
Copy link
Member Author

@sttts, the last commit added the ability to create an issue if error occurs during a publisher run. Here are a few things I want to discuss:

  • Shall the issue be created in k8s.io/kubernetes or k8s.io/test-infra, currently it's the former.
  • I couldn't find an easy way to extract the log starting from the last run. Currently I just print the last 15,000 bytes from the log file. Do you have any suggestion?

@foxish could you help review the last commit? I think you are familiar with the IssueCacher and IssueSyncer. Thanks.

glog.Flush()
// maxLogLength is the estimated number of characters of the log created
// in each run of the publisher
var maxLogLength = int64(15000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could have some marker line and search for that. Or is there some log rotation in glog?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought of the marker as well, I'll try it.

glog has rotation, when the log file size exceeds 1800MB. But there's no public API to manually trigger it.

@sttts
Copy link
Contributor

sttts commented Feb 6, 2017

@caesarxuchao both kubernetes or test-infra would be fine. I tend to the later.

Can you pass a list of @xyz like github accounts which are cc'ed to the issue?

@caesarxuchao
Copy link
Member Author

Can you pass a list of @xyz like github accounts which are cc'ed to the issue?

Yeah, we can. Whom to include? Starts with you, me, deads2k, lavalamp?

@sttts
Copy link
Contributor

sttts commented Feb 7, 2017

@caesarxuchao Make a github group out of and add the 4 of us there. @kubernetes/kubernetes-staging-publish-cops

@caesarxuchao
Copy link
Member Author

@foxish, the issue-cache is never synced. I think I missed some pieces. Could you help take a look at the third commit? Thanks.

@k8s-reviewable
Copy link

This change is Reviewable

@foxish
Copy link
Contributor

foxish commented Feb 8, 2017

@caesarxuchao, if you're trying to create comments, you may want to reuse the pattern that the approval-munger uses, such as here. Adding @grodrigues3 and @apelisse who wrote a lot of the new stuff with regard to adding/deleting comments.

@caesarxuchao
Copy link
Member Author

@foxish No, it's not about creating comments. I need the robot to create an issue or update the issue if it's not closed. I think issue-syncer and issue-cacher are the right module to use.

@caesarxuchao
Copy link
Member Author

update: need a little more time tmr to fix the third commit.

@caesarxuchao
Copy link
Member Author

@sttts I met some problems when trying to export the log file created by glog. Although I passed --log_dir to set the default location of the log file, the log still ends up in a random file in the /tmp dir. This is because flags are parsed in main(), but the first invocation of glog is in init(), and upon its first invocation, glog creates the log file. I'm looking for a workaround.

@sttts
Copy link
Contributor

sttts commented Feb 9, 2017

If nothing helps, move flags parsing into the init func. Not nice, but might work.

@caesarxuchao caesarxuchao force-pushed the publish-apimachinery-and-client-go branch from f1789b2 to d325d58 Compare February 9, 2017 14:22
@@ -70,6 +70,7 @@ func (p *IssueCacher) RequiredFeatures() []string { return []string{} }

// Initialize will initialize the munger
func (p *IssueCacher) Initialize(config *github.Config, features *features.Features) error {
// TODO: this need to be changed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments like this are more useful if they state what about it needs to be changed :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to remove this comment I left during debugging.

@@ -43,6 +44,16 @@ if git diff --cached --exit-code &>/dev/null; then
exit 0
fi
git commit -m "${MESSAGE}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment describing why the next section is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

if git diff --cached --exit-code &>/dev/null; then
echo "dependency has not changed!"
else
git commit -m "update dependency, should only contain changes in k8s.io/apimachinery"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Pick up new dependencies"?

I don't recommend including references to specific packages when it looks like the above could have updated lots of stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The staging area should have the latest dependencies, except for the k8s repos, like apimachinery.

I'll rephrase to "Pick up new dependencies on other k8s repos".

@lavalamp
Copy link
Member

lavalamp commented Feb 9, 2017

Sorry for delay in paying attention to this, I was in all day meetings last two days.

So, I have to admit I'm a little lost in the layers of automation here. I think I'd like us to do this:

  1. Turn off the current publishing bot, since I think it's an auto-breakage bot right now.
  2. Run this script, but instead of publishing directly, publish to Chao's fork, and make a PR.
  3. The travis instance I turned on should test the PR, we can merge when it passes.
  4. Hm, we should copy the travis.yml file to other branches, too.
  5. Repeat the publishing process (steps 2-3) for every branch.

At that point, we should have a functioning client library, and I want to pause and regroup. Before we turn the automation back on, I want it to be able to test that the thing it's going to push actually works and will not break users.

I've started a client strategy doc here so we can all get on the same page. Anyone in the api machinery sig mailing list should have access: https://docs.google.com/document/d/1h_IBGYPMa8FS0oih4NbVkAMAzM7YTHr76VBcKy1qFbg/edit

@caesarxuchao
Copy link
Member Author

Run this script, but instead of publishing directly, publish to Chao's fork, and make a PR.

If we want to do this, I'll need to submit a PR to apimachinery as well, it sounds complicated.

Alternatively, @deads2k @sttts if you can manually sync the apimachinery repo soon, I can manually fix client-go's master branch. How's that?

For client-go release-2.0 branch, the robot is doing the right thing. I'll wait for it to pick up my latest cherrypicks to the kubernetes release-1.5 branch, then disable the bot.

For release-1.4 and release-1.5, because they are not tracking any kubernetes branch, only manual fixes are possible.

I want it to be able to test that the thing it's going to push actually works and will not break users.

+100. How about letting the bot compile the client-go and run the unit tests before publishing?

@caesarxuchao
Copy link
Member Author

caesarxuchao commented Feb 13, 2017

I created the kubernetes-staging-publish-cops team and included @lavalamp @deads2k, @sttts and myself. This team will be notified if the robot fails to publish the staging folder to repos.

@caesarxuchao
Copy link
Member Author

caesarxuchao commented Feb 13, 2017

I'm verifying the code generated by the robot in kubernetes/client-go#103 and kubernetes/client-go#104. (The signal is if the travis test passes)

update: both travis tests have passed.

@caesarxuchao
Copy link
Member Author

@lavalamp, the robot is generating sane code. I pushed the last commit which let the robot run go build and go testbefore publishing client-go. Please let me if there are other fundamental issues with this PR. I'll address the rest comments tomorrow.

@caesarxuchao caesarxuchao force-pushed the publish-apimachinery-and-client-go branch 2 times, most recently from 7ec546b to dd4b80a Compare February 14, 2017 07:21
@caesarxuchao
Copy link
Member Author

Comments addressed. PTAL. Thanks.

Copy link
Member

@lavalamp lavalamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really sorry it took so long for me to find time to look at this.

I wonder if we can do this in three steps:

  1. We get the shell scripts set up and runable by a human.
  2. We get the publisher running them automatically
  3. We add issue filing.

I wonder if we can make the scripts publish PRs instead of pushing directly. That would be much safer?

curDir, err := os.Getwd()
if err != nil {
glog.Infof("Getwd failed")
p.plog.Infof("Getwd failed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should actually print out the error in all of these, so people will know what to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

limitations under the License.
*/

// Changing glog output directory via --log_dir doesn't work, because the flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just record to a bytes.Buffer or some such, print to the logs when we're done? I'm not sure we need this whole concept.

return nil
}

func (p *publisherIssueTracker) FileIssue(failure publisherFailure) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put filing issues into a separate PR?

NETRCDIR="${2}"

# set up github token
echo "machine github.com login ${TOKEN}" > "${NETRCDIR}"/.netrc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a directory for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.


# set up github token
echo "machine github.com login ${TOKEN}" > "${NETRCDIR}"/.netrc
rm -f ~/.netrc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if they need this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if [ ! $# -eq 5 ]; then
echo "usage: publish.sh destination_dir destination_branch token netrc_dir commit_message. destination_dir and netrc_dir are expected to be absolute paths."
if [ ! $# -eq 6 ]; then
echo "usage: publish.sh destination_dir destination_branch token netrc_dir commit_message gopath. destination_dir and netrc_dir are expected to be absolute paths."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why take a gopath arg? Can we make a tmp dir instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if not, why not expect $GOPATH to just be set correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter now, no GOPATH needed.

@@ -43,6 +44,23 @@ if git diff --cached --exit-code &>/dev/null; then
exit 0
fi
git commit -m "${MESSAGE}"

# Run "godep restore" to restore dependencies. Because entries for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix with "client-go's /vendor directory in staging doesn't include the other k8s.io/... dependencies, specifically apimachinery. Therefore, we do a restore/save cycle to fix this before publishing to the client repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we have to do this here and can't just fix the vendor directory in the staging directory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vendor/ in staging directory is fixed now, so i'm going to rewrite this part of code to simply replace the SHA of k8s.io/apimachinery in Godeps.json, but not update vendor/. Then the script doesn't need to go through the godep save/restore.

Copy link
Member Author

@caesarxuchao caesarxuchao Feb 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this now. Depends on kubernetes/kubernetes#42084.

@@ -23,6 +23,8 @@ spec:
- --repo-dir=$(REPO_DIR)
- --netrc-dir=$(NETRC_DIR)
- --alsologtostderr
- --publisher-log-dir=$(PUBLISHER_LOG_DIR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend storing the log in memory instead of needing a separate file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1,5 +1,5 @@
/*
Copyright 2016 The Kubernetes Authors.
Copyright 2017 The Kubernetes Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change years.

// EachLoop is called at the start of every munge loop
func (p *PublisherMunger) EachLoop() error {
// initialize the issueTracker in EachLoop, in case there is a new issue created in last EachLoop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really super nervous about having this much untested code.

@sttts
Copy link
Contributor

sttts commented Feb 22, 2017

What is the status here? How can we help to finish this?

@caesarxuchao
Copy link
Member Author

I'm working on addressing the comments.

@caesarxuchao
Copy link
Member Author

Status update:
I'm refactoring client-go to use the filter-branch magic, this will save the commit history, and largely clean up the go code. After this is done, I'll open a PR with only the publish_client_go.sh and publish_apimachinery, and verify the scripts are good.

Chao Xu added 4 commits February 27, 2017 13:51
k8s.io/client-go. In the meantime, the publishing robot lets client-go
vendor the just published k8s.io/apimachinery.

add the ability to create an issue if publish fails

don't delete .travis.yml

go build && go test before publish client-go
@caesarxuchao caesarxuchao force-pushed the publish-apimachinery-and-client-go branch from d2b06d3 to 833de84 Compare February 28, 2017 23:23
@k8s-ci-robot
Copy link
Contributor

@caesarxuchao: The following test(s) failed:

Test name Commit Details Rerun command
Bazel test 833de84 link @k8s-bot bazel test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants