Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return errors when http/grpc servers exit. #92

Merged
merged 3 commits into from
Jul 27, 2018

Conversation

gouthamve
Copy link
Contributor

@gouthamve gouthamve commented Apr 2, 2018

There was a case where the HTTP server died but the process exited. We
need to exit when the servers exit.

Quirky that there are unused imports before, what am I missing?


This change is Reviewable

@gouthamve
Copy link
Contributor Author

Didn't run dep ensure and go-imports automatically removed the imports. Explicitly added the alias so that that doesn't happen again.

server/server.go Outdated
"time"

"github.com/gorilla/mux"
"github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc"
"github.com/mwitkow/go-grpc-middleware"
grpc_middleware "github.com/mwitkow/go-grpc-middleware"

This comment was marked as abuse.

server/server.go Outdated
go func() {
if err := s.httpServer.Serve(s.httpListener); err != http.ErrServerClosed {
log.Errorf("error serving http: %v", err)
os.Exit(1)

This comment was marked as abuse.

@bboreham
Copy link
Collaborator

Can you provide a bit more context on what went wrong?

@bboreham
Copy link
Collaborator

Maybe change it so that Run() returns with an error if one of the serving loops does?

@bboreham
Copy link
Collaborator

@gouthamve are you likely to come back to this?

@gouthamve
Copy link
Contributor Author

gouthamve commented Jun 21, 2018 via email

@gouthamve gouthamve changed the title Log errors and quit when http/grpc servers exit. Return errors when http/grpc servers exit. Jul 9, 2018
@gouthamve
Copy link
Contributor Author

Extremely sorry for the delay. So what I observed was that the http server shutdown but the service was still running. After exec'ing in, I saw that the port was not bound essentially saying the http server exited.

I think we should catch those errors and not just ignore them.

@bboreham
Copy link
Collaborator

bboreham commented Jul 9, 2018

Looking much better. Any chance of a test?

@gouthamve gouthamve force-pushed the log-errors branch 2 times, most recently from c0cc6be to 32a63b3 Compare July 10, 2018 06:55
@gouthamve
Copy link
Contributor Author

@bboreham Done. Please take a look.

server/server.go Outdated
}
}
}()
defer s.httpServer.Close()

This comment was marked as abuse.

This comment was marked as abuse.

@tomwilkie
Copy link
Contributor

Just seen this again (HTTP server randomly stopping serving traffic).

Copy link
Collaborator

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is supposed to happen on http.ErrServerClosed or grpc.ErrServerStopped - looks like function doesn't return?

server/server.go Outdated
@@ -161,17 +161,38 @@ func RegisterInstrumentation(router *mux.Router) {
}

// Run the server; blocks until SIGTERM is received.
func (s *Server) Run() {
go s.httpServer.Serve(s.httpListener)
func (s *Server) Run() error {

This comment was marked as abuse.

@gouthamve gouthamve force-pushed the log-errors branch 2 times, most recently from 1990210 to 53a750b Compare July 19, 2018 12:32
@gouthamve
Copy link
Contributor Author

gouthamve commented Jul 19, 2018

You're right. Nice, catch. I've made sure the function now returns.

I noticed a race-condition now.... So if we send a SIGTERM, or an error is returned, right after calling Run(), the function never exits. This is the reason there is a Sleep in the tests, I now see that it can cause some problems.

For example, if the HTTP server has a closed connection, it'll immediately exit, but by that time, we might not have evaluated return <-errChan, hence the error will fall into the default section. And Run() will never exit.

Seems unpractically rare (tests failing w/o sleep points otherwise), but curious about your thoughts.

@bboreham
Copy link
Collaborator

I can't see how an error return would not result in exiting the function.
SIGTERM could arrive before we started listening for it, and if we're pid 1 would do nothing. So move the signal listen up to the top of the function to reduce the window.

@gouthamve
Copy link
Contributor Author

I can't see how an error return would not result in exiting the function.

It's extremely counterintuitive, but removing this sleep, would make the test fail: https://github.com/weaveworks/common/pull/92/files#diff-d5666a18e51cbff4c4f482faa6f9f449R114

In that case, the error falls into the default: case here: https://github.com/weaveworks/common/pull/92/files#diff-91bbeda7eb98a7adc57b9e47e2cf5c2bR190

I don't have a good explanation as to why that is happening, but I can only guess that the return <-errChan has not been evaluated yet, before the go-routine has begun to run.

The side effect to this, is if the server exits immediately, there is a chance that the server exits immediately but Run() doesn't quit.

There was a case where the HTTP server died but the process exited. We
need to exit when the servers exit.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
@bboreham
Copy link
Collaborator

I still have to look into your race condition, but there is a merge conflict you might like to fix in the meantime.

@gouthamve
Copy link
Contributor Author

@bboreham Done.

@bboreham
Copy link
Collaborator

I can't run TestRunReturnsError in my standard config because both sub-tests run in parallel and they both use the same ports.

@bboreham
Copy link
Collaborator

Running them one at a time, I can't get it to fail by taking the sleep out.

Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
@gouthamve
Copy link
Contributor Author

@bboreham Done. And made sure the test doesn't fail when run.

Copy link
Collaborator

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Good improvement to the functionality, and thanks for taking the extra time over the tests.

@bboreham bboreham merged commit b645195 into weaveworks:master Jul 27, 2018
tomwilkie pushed a commit to tomwilkie/weaveworks-common that referenced this pull request Sep 13, 2018
35679ee Merge pull request weaveworks#110 from weaveworks/parallel-push-errors
3ae41b6 Remove unneeded if block
51ff31a Exit on first error
0faad9f Check for errors when pushing images in parallel
74dc626 Merge pull request weaveworks#108 from weaveworks/disable-apt-daily
b4f1d91 Merge pull request weaveworks#107 from weaveworks/docker-17-update
7436aa1 Override apt daily job to not run immediately on boot
7980f15 Merge pull request weaveworks#106 from weaveworks/document-docker-install-role
f741e53 Bump to Docker 17.06 from CE repo
61796a1 Update Docker CE Debian repo details
0d86f5e Allow for Docker package to be named docker-ce
065c68d Document selection of Docker installation role.
3809053 Just --porcelain; it defaults to v1
11400ea Merge pull request weaveworks#105 from weaveworks/remove-weaveplugin-remnants
b8b4d64 remove weaveplugin remnants
35099c9 Merge pull request weaveworks#104 from weaveworks/pull-docker-py
cdd48fc Pull docker-py to speed tests/builds up.
e1c6c24 Merge pull request weaveworks#103 from weaveworks/test-build-tags
d5d71e0 Add -tags option so callers can pass in build tags
8949b2b Merge pull request weaveworks#98 from weaveworks/git-status-tag
ac30687 Merge pull request weaveworks#100 from weaveworks/python_linting
4b125b5 Pin yapf & flake8 versions
7efb485 Lint python linting function
444755b Swap diff direction to reflect changes required
c5b2434 Install flake8 & yapf
5600eac Lint python in build-tools repo
0b02ca9 Add python linting
c011c0d Merge pull request weaveworks#79 from kinvolk/schu/python-shebang
6577d07 Merge pull request weaveworks#99 from weaveworks/shfmt-version
00ce0dc Use git status instead of diff to add 'WIP' tag
411fd13 Use shfmt v1.3.0 instead of latest from master.
0d6d4da Run shfmt 1.3 on the code.
5cdba32 Add sudo
c322ca8 circle.yml: Install shfmt binary.
e59c225 Install shfmt 1.3 binary.
30706e6 Install pyhcl in the build container.
960d222 Merge pull request weaveworks#97 from kinvolk/alban/update-shfmt-3
1d535c7 shellcheck: fix escaping issue
5542498 Merge pull request weaveworks#96 from kinvolk/alban/update-shfmt-2
32f7cc5 shfmt: fix coding style
09f72af lint: print the diff in case of error
571c7d7 Merge pull request weaveworks#95 from kinvolk/alban/update-shfmt
bead6ed Update for latest shfmt
b08dc4d Update for latest shfmt (weaveworks#94)
2ed8aaa Add no-race argument to test script (weaveworks#92)
80dd78e Merge pull request weaveworks#91 from weaveworks/upgrade-go-1.8.1
08dcd0d Please ./lint as shfmt changed its rules between 1.0.0 and 1.3.0.
a8bc9ab Upgrade default Go version to 1.8.1.
31d069d Change Python shebang to `#!/usr/bin/env python`

git-subtree-dir: tools
git-subtree-split: 35679ee5ff17c4edf864b7c43dc70a40337fcd80
bboreham added a commit that referenced this pull request May 15, 2019
d6cc704 Fix comment
7139116 Revert "Push comments to the left so they don't appear in scripts"
e47e58f Push comments to the left so they don't appear in scripts
3945fce Remove nonexistent env var GIT_TAG
cd62992 Merge pull request #156 from weaveworks/drop-quay
af0eb51 Merge pull request #157 from weaveworks/fix-image-tag-prefix-length
0b9aee4 Fix image-tag object name prefix length to 8 chars.
813c28f Move from CircleCI 1.0 to 2.0
425cf4e Move from quay.io to Dockerhub
87ccf4f Merge pull request #155 from weaveworks/go-1-12
c31bc28 Update lint script to work with Go 1.12
ed8e380 Update to Go 1.12.1
ec369f5 Merge pull request #153 from dholbach/drop-email
ef7418d weave-users mailing list is closed: https://groups.google.com/a/weave.works/forum/#!topic/weave-users/0QXWGOPdBfY
6954a57 Merge pull request #144 from weaveworks/golang-1.11.1
9649eed Upgrade build image from golang:1.10.0-strech to 1.11.1-strech
59263a7 Merge pull request #141 from weaveworks/update-context
e235c9b Merge pull request #143 from weaveworks/gc-wks-test-vms
c865b4c scheduler: please lint/flake8
da61568 scheduler: please lint/yapf
ce9d78e scheduler: do not cache discovery doc
e4b7873 scheduler: add comment about GCP projects' IAM roles needed to list/delete instances and firewall rules
ff7ec8e scheduler: add comment about CircleCI projects' access via the API
2477d98 scheduler: deploy command now sets the current datetime as the version
5fcd880 scheduler: pass CircleCI API token in for private projects
6b8c323 scheduler: more details in case of failure to get running builds from CircleCI
0871aff scheduler: downgrade google-api-python-client from 1.7.4 to 1.6.7
b631e7f scheduler: add GC of WKS test VMs and firewall rules
a923a32 scheduler: document setup and deployment
013f508 scheduler: lock dependencies' versions
6965a4a Merge pull request #142 from weaveworks/fix-build
23298c6 Fix golint expects import golang.org/x/lint/golint
482f4cd Context is now part of the Go standard library
2bbc9a0 Merge pull request #140 from weaveworks/sched-http-retry
c3726de Add retries to sched util http calls
2cc7b5a Merge pull request #139 from meghalidhoble/master
fd9b0a7 Change : Modified the lint tools to skip the shfmt check if not installed. Why the change : For ppc64le the specific version of shfmt is not available, hence skipped completely the installation of shfmt tool. Thus this change made.
bc645c7 Merge pull request #138 from dholbach/add-license-file
a642e02 license: add Apache 2.0 license text
9bf5956 Merge pull request #109 from hallum/master
d971d82 Merge pull request #134 from weaveworks/2018-07-03-gcloud-regepx
32e7aa2 Merge pull request #137 from weaveworks/gcp-fw-allow-kube-apiserver
bbb6735 Allow CI to access k8s API server on GCP instances
764d46c Merge pull request #135 from weaveworks/2018-07-04-docker-ansible-playbook
ecc2a4e Merge pull request #136 from weaveworks/2018-07-05-gcp-private-ips
209b7fb tools: Add private_ips to the terraform output
369a655 tools: Add an ansible playbook that just installs docker
a643e27 tools: Use --filter instead of --regexp with gcloud
b8eca88 Merge pull request #128 from weaveworks/actually-say-whats-wrong
379ce2b Merge pull request #133 from weaveworks/fix-decrypt
3b906b5 Fix incompatibility with recent versions of OpenSSL
f091ab4 Merge pull request #132 from weaveworks/add-opencontainers-labels-to-dockerfiles
248def1 Inject git revision in Dockerfiles
64f2c28 Add org.opencontainers.image.* labels to Dockerfiles
ea96d8e add information about how to get help (#129)
f066ccd Make yapf diff failure look like an error
34d81d7 Merge pull request #127 from weaveworks/golang-1.10.0-stretch
89a0b4f Use golang:1.10.0-stretch image.
ca69607 Merge pull request #126 from weaveworks/disable-apt-daily-test
f5dc5d5 Create "setup-apt" role
7fab441 Rename bazel to bazel-rules (#125)
ccc8316 Revert "Gocyclo should return error code if issues detected" (#124)
1fe184f Bazel rules for building gogo protobufs (#123)
b917bb8 Merge pull request #122 from weaveworks/fix-scope-gc
c029ce0 Add regex to match scope VMs
0d4824b Merge pull request #121 from weaveworks/provisioning-readme-terraform
5a82d64 Move terraform instructions to tf section
d285d78 Merge pull request #120 from weaveworks/gocyclo-return-value
76b94a4 Do not spawn subshell when reading cyclo output
93b3c0d Use golang:1.9.2-stretch image
d40728f Gocyclo should return error code if issues detected
c4ac1c3 Merge pull request #114 from weaveworks/tune-spell-check
8980656 Only check files
12ebc73 Don't spell-check pki files
578904a Special-case spell-check the same way we do code checks
e772ed5 Special-case on mime type and extension using just patterns
ae82b50 Merge pull request #117 from weaveworks/test-verbose
8943473 Propagate verbose flag to 'go test'.
7c79b43 Merge pull request #113 from weaveworks/update-shfmt-instructions
258ef01 Merge pull request #115 from weaveworks/extra-linting
e690202 Use tools in built image to lint itself
126eb56 Add shellcheck to bring linting in line with scope
63ad68f Don't run lint on files under .git
51d908a Update shfmt instructions
e91cb0d Merge pull request #112 from weaveworks/add-python-lint-tools
0c87554 Add yapf and flake8 to golang build image
35679ee Merge pull request #110 from weaveworks/parallel-push-errors
3ae41b6 Remove unneeded if block
51ff31a Exit on first error
0faad9f Check for errors when pushing images in parallel
d87cd02 Add arg flag override for destination socks host:port in pacfile.
74dc626 Merge pull request #108 from weaveworks/disable-apt-daily
b4f1d91 Merge pull request #107 from weaveworks/docker-17-update
7436aa1 Override apt daily job to not run immediately on boot
7980f15 Merge pull request #106 from weaveworks/document-docker-install-role
f741e53 Bump to Docker 17.06 from CE repo
61796a1 Update Docker CE Debian repo details
0d86f5e Allow for Docker package to be named docker-ce
065c68d Document selection of Docker installation role.
3809053 Just --porcelain; it defaults to v1
11400ea Merge pull request #105 from weaveworks/remove-weaveplugin-remnants
b8b4d64 remove weaveplugin remnants
35099c9 Merge pull request #104 from weaveworks/pull-docker-py
cdd48fc Pull docker-py to speed tests/builds up.
e1c6c24 Merge pull request #103 from weaveworks/test-build-tags
d5d71e0 Add -tags option so callers can pass in build tags
8949b2b Merge pull request #98 from weaveworks/git-status-tag
ac30687 Merge pull request #100 from weaveworks/python_linting
4b125b5 Pin yapf & flake8 versions
7efb485 Lint python linting function
444755b Swap diff direction to reflect changes required
c5b2434 Install flake8 & yapf
5600eac Lint python in build-tools repo
0b02ca9 Add python linting
c011c0d Merge pull request #79 from kinvolk/schu/python-shebang
6577d07 Merge pull request #99 from weaveworks/shfmt-version
00ce0dc Use git status instead of diff to add 'WIP' tag
411fd13 Use shfmt v1.3.0 instead of latest from master.
0d6d4da Run shfmt 1.3 on the code.
5cdba32 Add sudo
c322ca8 circle.yml: Install shfmt binary.
e59c225 Install shfmt 1.3 binary.
30706e6 Install pyhcl in the build container.
960d222 Merge pull request #97 from kinvolk/alban/update-shfmt-3
1d535c7 shellcheck: fix escaping issue
5542498 Merge pull request #96 from kinvolk/alban/update-shfmt-2
32f7cc5 shfmt: fix coding style
09f72af lint: print the diff in case of error
571c7d7 Merge pull request #95 from kinvolk/alban/update-shfmt
bead6ed Update for latest shfmt
b08dc4d Update for latest shfmt (#94)
2ed8aaa Add no-race argument to test script (#92)
80dd78e Merge pull request #91 from weaveworks/upgrade-go-1.8.1
08dcd0d Please ./lint as shfmt changed its rules between 1.0.0 and 1.3.0.
a8bc9ab Upgrade default Go version to 1.8.1.
31d069d Change Python shebang to `#!/usr/bin/env python`

git-subtree-dir: tools
git-subtree-split: d6cc704a2892e8d85aa8fa4d201c1a404f02dfa4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants