Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API server panics when writing response #29001

Closed
gmarek opened this issue Jul 15, 2016 · 50 comments
Closed

API server panics when writing response #29001

gmarek opened this issue Jul 15, 2016 · 50 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
Milestone

Comments

@gmarek
Copy link
Contributor

gmarek commented Jul 15, 2016

When I was running large scale kubemark test (2000 Nodes with 95 pods each) my API server crashed with the error:

I0715 12:14:23.387809    3511 trace.go:61] Trace "List /api/v1/namespaces/e2e-tests-density-idftq/pods" (started 2016-07-15 12:13:01.712298253 +0000 UTC):
[102.412µs] [102.412µs] About to List from storage
[2.53469971s] [2.534597298s] Listing from storage done
[2.954350982s] [419.651272ms] Self-linking done
[1m21.675466193s] [1m18.721115211s] Writing http response done (130003 items)
[1m21.675467856s] [1.663µs] END
E0715 12:14:23.393199    3511 apiserver.go:210] recover from panic situation: - Header called after Handler finished^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:206^M
    /usr/local/go/src/runtime/asm_amd64.s:472^M
    /usr/local/go/src/runtime/panic.go:443^M
    /usr/local/go/src/net/http/h2_bundle.go:4452^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/httplog/log.go:170^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:238^M
    <autogenerated>:8^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/metrics/metrics.go:102^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:272^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:120^M
    /usr/local/go/src/net/http/server.go:1618^M
    /usr/local/go/src/net/http/server.go:1910^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:485^M
    /usr/local/go/src/net/http/server.go:1618^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:420^M
    /usr/local/go/src/net/http/server.go:1618^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/auth/handlers/handlers.go:66^M
    /usr/local/go/src/net/http/server.go:1618^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/requestcontext.go:101^M
    /usr/local/go/src/net/http/server.go:1618^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/requestcontext.go:101^M
    /usr/local/go/src/net/http/server.go:1618^M
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:194^M
    /usr/local/go/src/runtime/asm_amd64.s:1998^M

I0715 12:14:23.387986    3511 handlers.go:164] GET /api/v1/nodes/10.180.12.88: (144.600059ms) 200 [[kubemark/v1.4.0 (linux/amd64) kubernetes/12483d9] 146.148.109.243:33217]
I0715 12:14:23.388150    3511 handlers.go:164] GET /api/v1/nodes/10.180.1.92: (154.464521ms) 200 [[kubemark/v1.4.0 (linux/amd64) kubernetes/12483d9] 104.197.214.41:58039]
panic: Header called after Handler finished [recovered]
        panic: Header called after Handler finished

goroutine 54829830 [running]:
panic(0x20f0220, 0xc821e07010)
        /usr/local/go/src/runtime/panic.go:481 +0x3e6
net/http.(*http2responseWriter).Header(0xc956288948, 0x7fe725a76c88)
        /usr/local/go/src/net/http/h2_bundle.go:4452 +0x78
k8s.io/kubernetes/pkg/httplog.(*respLogger).Header(0xc9702bd260, 0x0)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/httplog/log.go:170 +0x3a
k8s.io/kubernetes/pkg/apiserver.(*baseTimeoutWriter).Header(0xc9c39fb9e0, 0x0)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:238 +0x39
k8s.io/kubernetes/pkg/apiserver.logStackOnRecover(0x7fe725d5b8a8, 0xc82014ad90, 0x20f0220, 0xc8f69cd900, 0x7fe725b36be8, 0xc8fc256620)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/apiserver.go:213 +0x2d9
k8s.io/kubernetes/pkg/apiserver.InstallRecoverHandler.func1(0x20f0220, 0xc8f69cd900, 0x7fe725b36be8, 0xc8fc256620)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/apiserver.go:195 +0x5e
k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).dispatch.func2(0xc82032cea0, 0xc83a235710)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:206 +0x6a
panic(0x20f0220, 0xc8f69cd900)
        /usr/local/go/src/runtime/panic.go:443 +0x4e9
net/http.(*http2responseWriter).Header(0xc956288948, 0x2)
        /usr/local/go/src/net/http/h2_bundle.go:4452 +0x78
k8s.io/kubernetes/pkg/httplog.(*respLogger).Header(0xc9702bd260, 0x2decc58)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/httplog/log.go:170 +0x3a
k8s.io/kubernetes/pkg/apiserver.(*baseTimeoutWriter).Header(0xc9c39fb9e0, 0x0)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:238 +0x39
k8s.io/kubernetes/pkg/apiserver/metrics.(*fancyResponseWriterDelegator).Header(0xc8fc256630, 0xc8bc4a8e10)
        <autogenerated>:8 +0x53
k8s.io/kubernetes/pkg/apiserver/metrics.InstrumentRouteFunc.func1(0xc8bee93a10, 0xc90d296780)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/metrics/metrics.go:102 +0x35c
k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).dispatch(0xc82032cea0, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:272 +0xf30
k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.(*Container).(k8s.io/kubernetes/vendor/github.com/emicklei/go-restful.dispatch)-fm(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/emicklei/go-restful/container.go:120 +0x3e
net/http.HandlerFunc.ServeHTTP(0xc820ceb980, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
net/http.(*ServeMux).ServeHTTP(0xc8202c07b0, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1910 +0x17d
k8s.io/kubernetes/pkg/apiserver.WithAuthorizationCheck.func1(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:485 +0xf4
net/http.HandlerFunc.ServeHTTP(0xc820313ac0, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
k8s.io/kubernetes/pkg/apiserver.WithImpersonation.func1(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:420 +0xfe
net/http.HandlerFunc.ServeHTTP(0xc820313b00, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
k8s.io/kubernetes/pkg/auth/handlers.NewRequestAuthenticator.func1(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/auth/handlers/handlers.go:66 +0x2c9
net/http.HandlerFunc.ServeHTTP(0xc8203f5270, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
k8s.io/kubernetes/pkg/api.NewRequestContextFilter.func1(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/requestcontext.go:101 +0x157
net/http.HandlerFunc.ServeHTTP(0xc820522660, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
k8s.io/kubernetes/pkg/api.NewRequestContextFilter.func1(0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/api/requestcontext.go:101 +0x157
net/http.HandlerFunc.ServeHTTP(0xc820522680, 0x7fe725b36be8, 0xc8fc256620, 0xc83c513260)
        /usr/local/go/src/net/http/server.go:1618 +0x3a
k8s.io/kubernetes/pkg/apiserver.(*timeoutHandler).ServeHTTP.func1(0xc8216ec9c0, 0x7fe725b36ba8, 0xc8fc256620, 0xc83c513260, 0xc90d296720)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:194 +0x91
created by k8s.io/kubernetes/pkg/apiserver.(*timeoutHandler).ServeHTTP
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:196 +0x15b

It happened around 110k-120k Pods in the cluster.

It looks like a bug in http2 handling. cc @timothysc @smarterclayton @wojtek-t @lavalamp @kubernetes/sig-api-machinery

@gmarek gmarek added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jul 15, 2016
@gmarek gmarek added this to the v1.4 milestone Jul 15, 2016
@mwielgus mwielgus added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jul 15, 2016
@lavalamp
Copy link
Member

@roberthbailey @smarterclayton so it begins :)

@timothysc
Copy link
Member

There is a log'd issue upstream, in process on godep update.

@roberthbailey
Copy link
Contributor

@roberthbailey @smarterclayton so it begins :)

You mean finding issues that were previously masked. Early in the 1.4 dev cycle seems like a great time to find those.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jul 16, 2016 via email

k8s-github-robot pushed a commit that referenced this issue Jul 18, 2016
Automatic merge from submit-queue

Update golang.org/x/net godep for http2 issues being seen during scale

This is to address #29001 , 

"Header called after Handler finished".   There are a number of race fixes.
@timothysc
Copy link
Member

@gmarek post #29032 , could you let me know if you are still seeing an issue.

@gmarek
Copy link
Contributor Author

gmarek commented Jul 18, 2016

I didn't have time to run tests today, as I was fixing NodeController. I'm OOO for the rest of the week, but maybe @wojtek-t will run the test?

@wojtek-t
Copy link
Member

I will try to confirm that the fix works later this week. (Though the single run may not be enough, as this seemed to be a race). Anyway - I will keep this updated after running some tests later this week.

@wojtek-t
Copy link
Member

I have run an experiment over night, and this happened again (with the fix).

So this is still a problem.

@timothysc
Copy link
Member

k, given that you're still seeing a server race condition I'm inclined to default off again and perhaps try again in the 1.5 w/ golang 1.7 timeframe.

k8s-github-robot pushed a commit that referenced this issue Jul 20, 2016
Automatic merge from submit-queue

Revert "Follow on for 1.4 to default HTTP2 on by default"

This reverts commit efe2555  
in order to address: #29001 #28537
@timothysc
Copy link
Member

closed via #29283

@lavalamp
Copy link
Member

If you can crash apiserver with an HTTP 2 connection, then we should:
a) file a bug against the appropriate package
b) prevent HTTP 2 connections server-side.

b) is because otherwise it's super easy to dos apiserver.

On Wed, Jul 20, 2016 at 8:32 AM, Timothy St. Clair <notifications@github.com

wrote:

closed via #29283 #29283


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglm7QKzm4L4TJFYYTMcXoocUpiaqNks5qXj96gaJpZM4JNWYK
.

@lavalamp lavalamp reopened this Jul 20, 2016
@wojtek-t
Copy link
Member

@lavalamp - I agree that we should prevent http 2 connections server-side, but can you clarify why it makes easier to dos apiserver? Is it because it will be crashlooping or something else?

@lavalamp
Copy link
Member

Yeah, a malicious client can just keep making http2 requests...

On Wed, Jul 20, 2016 at 11:00 AM, Wojciech Tyczynski <
notifications@github.com> wrote:

@lavalamp https://github.com/lavalamp - I agree that we should prevent
http 2 connections server-side, but can you clarify why it makes easier to
dos apiserver? Is it because it will be crashlooping or something else?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglrjj1881aZirRRfH78_qiJtPrdthks5qXmJFgaJpZM4JNWYK
.

@bradfitz
Copy link

Hi, I own Go's net/http and http2 stuff.

I saw golang/go#16451 but it appears that the bug is actually in Kubernetes.

When you get the error message:

"panic: Header called after Handler finished"

That means the ResponseWriter.Header() method was called after the ServeHTTP completed.

So, I went looking up your panic's call stack looking for suspects.

My first guess was fancyResponseWriterDelegator,

k8s.io/kubernetes/pkg/apiserver/metrics.(*fancyResponseWriterDelegator).Header(0xc8fc256630, 0xc8bc4a8e10)

... because anything named fancy is just asking for trouble. But it seems fine, despite having no docs on why it's fancy.

But working up the stack more, I see things with "timeout" in the name, which makes me think there are goroutines involved running ServeHTTP async. And sure enough:

k8s.io/kubernetes/pkg/apiserver/handlers.go:

func (t *timeoutHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        after, msg := t.timeout(r)
        if after == nil {
                t.handler.ServeHTTP(w, r)
                return
        }

        done := make(chan struct{}, 1)
        tw := newTimeoutWriter(w)
        go func() {
                t.handler.ServeHTTP(tw, r)
                done <- struct{}{}
        }()
        select {
        case <-done:
                return
        case <-after:
                tw.timeout(msg)
        }
}

... good, you're passing a custom ResponseWriter to the new ServeHTTP started in the goroutine.

So I'd expect newTimeoutWriter to handle not calling the underlying Header method once the original ServeHTTP was done.

But:

type timeoutWriter interface {
        http.ResponseWriter
        timeout(string)
}

func newTimeoutWriter(w http.ResponseWriter) timeoutWriter {
        base := &baseTimeoutWriter{w: w}
... 
               return base
}

type baseTimeoutWriter struct {
        w http.ResponseWriter

        mu          sync.Mutex
        timedOut    bool
        wroteHeader bool
        hijacked    bool
}

func (tw *baseTimeoutWriter) Header() http.Header {
        return tw.w.Header()
}

The baseTimeOutWriter's Header method doesn't have any synchronization. It's always accessing tw.w (which is the original ServeHTTP's goroutine's ResponseWriter), even the timeoutHandler.ServeHTTP has returned.

But https://golang.org/pkg/net/http/#ResponseWriter says:

A ResponseWriter may not be used after the Handler.ServeHTTP method has returned.

So, a data race (as is the case in HTTP/1 I imagine) or a panic (with HTTP/2) are both possible.

Arguably the HTTP/2 behavior is better because it told you there's a problem rather than having a silent data race which you might not notice unless your tests run under the race detector were particularly thorough concurrency/timeout-wise.

You'll need to need to make your timeoutHandler and its timeoutWriter a bit more aware of each other and synchronize any method calls on the original ServerHTTP's ResponseWriter with the original ServeHTTP's completion.

@lavalamp
Copy link
Member

Thank you for that write up, I knew the timeout handler was incredibly
problematic (it leaks goroutines for requests that hang!) but it seems it's
worse than I thought.

On Thu, Jul 21, 2016 at 2:36 PM, Brad Fitzpatrick notifications@github.com
wrote:

Hi, I own Go's net/http and http2 stuff.

I saw golang/go#16451 golang/go#16451 but it
appears that the bug is actually in Kubernetes.

When you get the error message:

"panic: Header called after Handler finished"

That means the ResponseWriter.Header() method was called after the
ServeHTTP completed.

So, I went looking up your panic's call stack looking for suspects.

My first guess was fancyResponseWriterDelegator,

k8s.io/kubernetes/pkg/apiserver/metrics.(*fancyResponseWriterDelegator).Header(0xc8fc256630, 0xc8bc4a8e10)

... because anything named fancy is just asking for trouble. But it seems
fine, despite having no docs on why it's fancy.

But working up the stack more, I see things with "timeout" in the name,
which makes me think there are goroutines involved running ServeHTTP async.
And sure enough:

k8s.io/kubernetes/pkg/apiserver/handlers.go:

func (t *timeoutHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
after, msg := t.timeout(r)
if after == nil {
t.handler.ServeHTTP(w, r)
return
}

    done := make(chan struct{}, 1)
    tw := newTimeoutWriter(w)
    go func() {
            t.handler.ServeHTTP(tw, r)
            done <- struct{}{}
    }()
    select {
    case <-done:
            return
    case <-after:
            tw.timeout(msg)
    }

}

... good, you're passing a custom ResponseWriter to the new ServeHTTP
started in the goroutine.

So I'd expect newTimeoutWriter to handle not calling the underlying Header
method once the original ServeHTTP was done.

But:

type timeoutWriter interface {
http.ResponseWriter
timeout(string)
}
func newTimeoutWriter(w http.ResponseWriter) timeoutWriter {
base := &baseTimeoutWriter{w: w}
...
return base
}
type baseTimeoutWriter struct {
w http.ResponseWriter

    mu          sync.Mutex
    timedOut    bool
    wroteHeader bool
    hijacked    bool

}
func (tw *baseTimeoutWriter) Header() http.Header {
return tw.w.Header()
}

The baseTimeOutWriter's Header method doesn't have any synchronization.
It's always accessing tw.w (which is the original ServeHTTP's goroutine's
ResponseWriter), even the timeoutHandler.ServeHTTP has returned.

But https://golang.org/pkg/net/http/#ResponseWriter says:

A ResponseWriter may not be used after the Handler.ServeHTTP method has
returned.

So, a data race (as is the case in HTTP/1 I imagine) or a panic (with
HTTP/2) are both possible.

Arguably the HTTP/2 behavior is better because it told you there's a
problem rather than having a silent data race which you might not notice
unless your tests run under the race detector were particularly thorough
concurrency/timeout-wise.

You'll need to need to make your timeoutHandler and its timeoutWriter a
bit more aware of each other and synchronize any method calls on the
original ServerHTTP's ResponseWriter with the original ServeHTTP's
completion.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglklJHeIudSmo4nMzUnidv35MNYGbks5qX-ZbgaJpZM4JNWYK
.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jul 22, 2016 via email

@wojtek-t
Copy link
Member

Thanks for looking into it @bradfitz

@smarterclayton - can we use context timeouts without http2? If so, that might be the best option.
Also, I didn't look into the related code, but can we do it only this place for now?

@smarterclayton
Copy link
Contributor

We should start plumbing context for 1.4. That's an easy win, and a lot of
mechanics are already in place. Even if we don't get
attach/exec/portforward wired, we will be closer to where we want to be.

On Mon, Jul 25, 2016 at 1:37 PM, Daniel Smith notifications@github.com
wrote:

What we need is to have started out using the context package and made
every handler use the timeout feature sensibly. I don't think there's a
quick fix here, even if something drastic happened like adding context's
timeout functionality to http.Request: it doesn't do any good without
kubernetes code changes, too. Even more drastic things like the ability to
shut down other goroutines from the timeout handler would not fix the
problem, either, in addition to being terrible ideas.

On Mon, Jul 25, 2016 at 7:54 AM, Brad Fitzpatrick <
notifications@github.com>
wrote:

I don't know Kubernetes internals. What are you all trying to accomplish?
Is there something you need from net/http which is too hard or not
possible
right now? Maybe I can help?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<
#29001 (comment)
,
or mute the thread
<
https://github.com/notifications/unsubscribe-auth/AAnglj3SgqXNF2egQt7Ff8IV9ed1w71yks5qZM4YgaJpZM4JNWYK

.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p2u6KZHavlUwYj__QH8x7joL4Ahcks5qZPRIgaJpZM4JNWYK
.

@xiang90
Copy link
Contributor

xiang90 commented Jul 25, 2016

@lavalamp @smarterclayton

Timeout handler and context are complementary for each other. They do not provide exact same functionality. It highly depends on what we are trying to achieve in long term.

Context on request can propagate deadline, cancellation to inner processors. It can help with shutting down go routines and fail fast. A lot of stuff cannot respect context, for example large decoding loop can take seconds.

Timeout handler, on the other hand, is used to write HTTP timeout header as long as the inner handler does not finish in the given timeout. It does not do things like cancellation.

Do we want both functionality or just what context provides?

@lavalamp
Copy link
Member

I guess we probably need both. Timeout handler needs to produce a response
writer that locks, and that somehow handles being hijacked...

On Mon, Jul 25, 2016 at 10:51 AM, Xiang Li notifications@github.com wrote:

@lavalamp https://github.com/lavalamp @smarterclayton
https://github.com/smarterclayton

Timeout handler and context are complementary for each other. They do not
provide exact same functionality. It highly depends on what we are trying
to achieve in long term.

Context on request can propagate deadline, cancellation to inner
processors. It can help with shutting down go routines and fail fast. A lot
of stuff cannot respect context, for example large decoding loop can take
seconds.

Timeout handler, on the other hand, is used to write HTTP timeout header
as long as the inner handler does not finish in the given timeout. It does
not do things like cancellation.

Do we want both functionality or just what context provides?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglqhe_z3pKVtEi3VJXXg1rK0MyvPpks5qZPeYgaJpZM4JNWYK
.

@xiang90
Copy link
Contributor

xiang90 commented Jul 25, 2016

@lavalamp hijacked is easier to "achieve" than "flush". They are all achievable with some limitation. Once we ever write anything to a HTTP response over the wire, it is impossible to timeout it. So there are good reasons for not supporting hijack and flush to not mess up HTTP responses.

There is another discussion happening on the original change here: #10656 (comment). Seems like we do not really rely either hijack and flush for timeout handler. I would suggest to go back to use standard lib. And then write code to respect context properly. And then add flush, hijack functionality if we really need to.

@liggitt
Copy link
Member

liggitt commented Jul 25, 2016

I am concerned about buffering all responses to memory just to support timeout (as the standard handler does)

@xiang90
Copy link
Contributor

xiang90 commented Jul 25, 2016

@liggitt Can you provide a better way to do it? If you ever write anything to the actual writer, there is no chance to get it back in my opinion. Also I do not think it really matters.

@bradfitz I think this (#29001 (comment)) is the main concern. Opinion?

@smarterclayton
Copy link
Contributor

I would maybe argue (as I started to above) that we need to handle all the
nested stuff correctly, but that we do need a hedge if someone screws up.
We do control in the code pretty well who can hijack - it's only a small
number of code paths. So i'm fairly convinced that we can solve the
timeout problem distinct from the hijack problem. If you timeout fetching
a list of all pods, we have a serious problem, and returning partial JSON /
protobuf is the least of our concerns. Doubling the memory requirements
with buffering for very large pod lists seems like the wrong direction.

On Mon, Jul 25, 2016 at 2:03 PM, Daniel Smith notifications@github.com
wrote:

I guess we probably need both. Timeout handler needs to produce a response
writer that locks, and that somehow handles being hijacked...

On Mon, Jul 25, 2016 at 10:51 AM, Xiang Li notifications@github.com
wrote:

@lavalamp https://github.com/lavalamp @smarterclayton
https://github.com/smarterclayton

Timeout handler and context are complementary for each other. They do not
provide exact same functionality. It highly depends on what we are trying
to achieve in long term.

Context on request can propagate deadline, cancellation to inner
processors. It can help with shutting down go routines and fail fast. A
lot
of stuff cannot respect context, for example large decoding loop can take
seconds.

Timeout handler, on the other hand, is used to write HTTP timeout header
as long as the inner handler does not finish in the given timeout. It
does
not do things like cancellation.

Do we want both functionality or just what context provides?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<
#29001 (comment)
,
or mute the thread
<
https://github.com/notifications/unsubscribe-auth/AAnglqhe_z3pKVtEi3VJXXg1rK0MyvPpks5qZPeYgaJpZM4JNWYK

.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_pybL874Tzu-utnsl1hDWVuta0cOYks5qZPpVgaJpZM4JNWYK
.

@xiang90
Copy link
Contributor

xiang90 commented Jul 25, 2016

@smarterclayton That is OK. Once we wrote anything to the connection, we just cannot expect the timeout handler to still work by sending a timeout header. It should just shutdown the connection instead. Agree?

@smarterclayton
Copy link
Contributor

Agree.

On Mon, Jul 25, 2016 at 2:50 PM, Xiang Li notifications@github.com wrote:

@smarterclayton https://github.com/smarterclayton That is OK. Once we
wrote anything to the connection, we just cannot expect the timeout handler
to still work by sending a timeout header. It should just shutdown the
connection instead. Agree?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_pz-jqPAtPMhTxcAXWGWgtg9m8xLjks5qZQWPgaJpZM4JNWYK
.

@lavalamp
Copy link
Member

Agree.

On Mon, Jul 25, 2016 at 12:00 PM, Clayton Coleman notifications@github.com
wrote:

Agree.

On Mon, Jul 25, 2016 at 2:50 PM, Xiang Li notifications@github.com
wrote:

@smarterclayton https://github.com/smarterclayton That is OK. Once we
wrote anything to the connection, we just cannot expect the timeout
handler
to still work by sending a timeout header. It should just shutdown the
connection instead. Agree?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<
#29001 (comment)
,
or mute the thread
<
https://github.com/notifications/unsubscribe-auth/ABG_pz-jqPAtPMhTxcAXWGWgtg9m8xLjks5qZQWPgaJpZM4JNWYK

.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAngliggYouo8rM4srrVsz-RRzQilDuNks5qZQe8gaJpZM4JNWYK
.

@bradfitz
Copy link

Btw, if the ServeHTTP goroutine panics, that will do the best possible thing for both HTTP/1 and HTTP/2. In HTTP/1, assuming you're replying with at least HTTP/1.1 and you've already flushed the headers so it's using HTTP chunking, it'll kill the TCP connection immediately without a proper 0-byte EOF chunk, so the peer will recognize the response as bogus. In HTTP/2 the server will just RST_STREAM the stream, leaving the TCP connection open, but resetting the stream to the peer so it'll have an error, like the HTTP/1 case.

You can use this to have your timeout writer send writes over a channel to the ServeHTTP goroutine in a for { select { ... } } loop, selecting on things like Write and Flush chans, the CloseNotifier chan, and a chan waiting on Request.Context().Done() or your timeout ticker, etc.

What are you wanting to use Hijack for? Unless you're explicitly doing WebSockets over HTTP/1.1, I would ignore Hijack. It basically has no other valid purpose. Hijack doesn't even work for HTTP/2, intentionally, since it doesn't mean anything in that context.

@lavalamp
Copy link
Member

We do websockets over HTTP/1.1, as it happens. We also do SPDY to support
mulitdirectional arbitrary binary data. (attach, exec, portforward to
processes running in the cluster).

On Mon, Jul 25, 2016 at 2:02 PM, Brad Fitzpatrick notifications@github.com
wrote:

Btw, if the ServeHTTP goroutine panics, that will do the best possible
thing for both HTTP/1 and HTTP/2. In HTTP/1, assuming you're replying with
at least HTTP/1.1 and you've already flushed the headers so it's using HTTP
chunking, it'll kill the TCP connection immediately without a proper 0-byte
EOF chunk, so the peer will recognize the response as bogus. In HTTP/2 the
server will just RST_STREAM the stream, leaving the TCP connection open,
but resetting the stream to the peer so it'll have an error, like the
HTTP/1 case.

You can use this to have your timeout writer send writes over a channel to
the ServeHTTP goroutine in a for { select { ... } } loop, selecting on
things like Write and Flush chans, the CloseNotifier chan, and a chan
waiting on Request.Context().Done() or your timeout ticker, etc.

What are you wanting to use Hijack for? Unless you're explicitly doing
WebSockets over HTTP/1.1, I would ignore Hijack. It basically has no other
valid purpose. Hijack doesn't even work for HTTP/2, intentionally, since it
doesn't mean anything in that context.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglrtj1oPqkF_5odNsJlDekayA1Yy_ks5qZSRLgaJpZM4JNWYK
.

@bradfitz
Copy link

bradfitz commented Jul 25, 2016

We do websockets over HTTP/1.1, as it happens.

Okay, so use Hijacker for that only. I would avoid it everywhere else.

We also do SPDY to support mulitdirectional arbitrary binary data. (attach, exec, portforward to processes running in the cluster).

Why SPDY? SPDY's supposed to be dying and dead soon, and Go's HTTP/2 lets
you do the same. Within a Handler, just read the request body and write the
response body at the same time. Likewise on the client side. (except you'll
need to write into a pipe for the request body)

@lavalamp
Copy link
Member

SPDY is there because we wrote that before HTTP/2 was a thing, and we have
a ton of clients that need to continue to work, and even though we'd all
love to update it to use HTTP/2, we have more important priorities... :/

On Mon, Jul 25, 2016 at 2:16 PM, Brad Fitzpatrick notifications@github.com
wrote:

On Mon, Jul 25, 2016 at 2:12 PM, Daniel Smith notifications@github.com
wrote:

We do websockets over HTTP/1.1, as it happens.

Okay, so use Hijacker for that only. I would avoid it everywhere else.

We also do SPDY to support

mulitdirectional arbitrary binary data. (attach, exec, portforward to
processes running in the cluster).

Why SPDY? SPDY's supposed to be dying and dead soon, and Go's HTTP/2 lets
you do the same. Within a Handler, just read the request body and write the
response body at the same time. Likewise on the client side. (except you'll
need to write into a pipe for the request body)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#29001 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAnglnoAbkcXvFXr2fKr0KdeFvDsTfb-ks5qZSe0gaJpZM4JNWYK
.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jul 25, 2016 via email

@bradfitz
Copy link

@smarterclayton, @ncdc, I'm super curious, because the HTTP/2 in Go 1.6+ should be sufficient. Let me know how I can help.

@ncdc
Copy link
Member

ncdc commented Jul 26, 2016

@bradfitz we previously chatted about this in golang/go#13444. We have interfaces in Kubernetes that look like this:

// Connection represents an upgraded HTTP connection.
type Connection interface {
    // CreateStream creates a new Stream with the supplied headers.
    CreateStream(headers http.Header) (Stream, error)
    // Close resets all streams and closes the connection.
    Close() error
    // CloseChan returns a channel that is closed when the underlying connection is closed.
    CloseChan() <-chan bool
    // SetIdleTimeout sets the amount of time the connection may remain idle before
    // it is automatically closed.
    SetIdleTimeout(timeout time.Duration)
}

// Stream represents a bidirectional communications channel that is part of an
// upgraded connection.
type Stream interface {
    io.ReadWriteCloser
    // Reset closes both directions of the stream, indicating that neither client
    // or server can use it any more.
    Reset() error
    // Headers returns the headers used to create the stream.
    Headers() http.Header
    // Identifier returns the stream's ID.
    Identifier() uint32
}

We hijack the HTTP/1.1 connection and then use spdystream to create streams. We then io.Copy between the user's local stdin/stdout/stderr and the streams. In the Kubernetes nodes, we io.Copy between the streams and stdin/stdout/stderr in the container processes.

I assume we can use what's in Go 1.6 for HTTP/2 to use the request and response bodies for this purpose. It would require that we write our own multiplexing/framing code for our streams (in addition to the 3 I mentioned above, we also have an "error" stream and another for sending terminal resizing dimensions). I'm not sure it would be super easy or even possible to shoehorn an HTTP/2 implementation into the above interfaces, though.

@smarterclayton I don't believe there's currently an issue

@ncdc
Copy link
Member

ncdc commented Jul 26, 2016

@bradfitz oh, one other thing: because our data flow is client executable -> kube-apiserver -> kubelet -> container, we end up using a proxy to transmit the data between kube-apiserver and kubelet. This is how we're currently doing it:

// tryUpgrade returns true if the request was handled.
func (h *UpgradeAwareProxyHandler) tryUpgrade(w http.ResponseWriter, req *http.Request) bool {
if !httpstream.IsUpgradeRequest(req) {
return false
}
backendConn, err := proxy.DialURL(h.Location, h.Transport)
if err != nil {
h.Responder.Error(err)
return true
}
defer backendConn.Close()
requestHijackedConn, _, err := w.(http.Hijacker).Hijack()
if err != nil {
h.Responder.Error(err)
return true
}
defer requestHijackedConn.Close()
newReq, err := http.NewRequest(req.Method, h.Location.String(), req.Body)
if err != nil {
h.Responder.Error(err)
return true
}
newReq.Header = req.Header
if err = newReq.Write(backendConn); err != nil {
h.Responder.Error(err)
return true
}
wg := &sync.WaitGroup{}
wg.Add(2)
go func() {
var writer io.WriteCloser
if h.MaxBytesPerSec > 0 {
writer = flowrate.NewWriter(backendConn, h.MaxBytesPerSec)
} else {
writer = backendConn
}
_, err := io.Copy(writer, requestHijackedConn)
if err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
glog.Errorf("Error proxying data from client to backend: %v", err)
}
wg.Done()
}()
go func() {
var reader io.ReadCloser
if h.MaxBytesPerSec > 0 {
reader = flowrate.NewReader(backendConn, h.MaxBytesPerSec)
} else {
reader = backendConn
}
_, err := io.Copy(requestHijackedConn, reader)
if err != nil && !strings.Contains(err.Error(), "use of closed network connection") {
glog.Errorf("Error proxying data from backend to client: %v", err)
}
wg.Done()
}()
wg.Wait()
return true
}
. In short, we:

  1. dial the kubelet (backendConn)
  2. hijack the request/response (requestHijackedConn)
  3. create and send a new http.Request to the kubelet that looks just like the incoming request (newReq)
  4. io.Copy in both directions between backendConn and requestHijackedConn

The last time I looked at this, http.NewRequest + request.Write(conn) sent the request as HTTP/1.1, which broke the flow. I was able to put together this hack (765b9c2#diff-4b9b9659e8f0abce3dab96b924f6b107R136) but it was just part of my POC and I haven't come back to it in months.

@timothysc
Copy link
Member

What is the status on the original issue, which is a race in plumbing problem.

k8s-github-robot pushed a commit that referenced this issue Aug 10, 2016
Automatic merge from submit-queue

apiserver: fix timeout handler

Protect access of the original writer. Panics if anything has wrote
into the original writer or the writer is hijacked when times out.

Fix #29001 

/cc @smarterclayton @lavalamp 

The next step would be respect the request context once 1.7 is out.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.kubernetes.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29594)
<!-- Reviewable:end -->
@timothysc
Copy link
Member

re-opening, this is not closed.

@timothysc timothysc reopened this Aug 12, 2016
@xiang90
Copy link
Contributor

xiang90 commented Aug 12, 2016

@timothysc This should be fixed by #29594. Did you see this happen again?

@timothysc
Copy link
Member

k, we should open new other items discussed in this issue then too.

k8s-github-robot pushed a commit that referenced this issue Sep 26, 2016
Automatic merge from submit-queue

Default HTTP2 on, post fixes from #29001

This reverts commit 8cb799c.

Enables HTTP2 on by default post fixes from #29001 for 1.5 

NOTE:  We are nearing lb connection limits at current thresholds. 

/cc @bradfitz @lavalamp @smarterclayton
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
Projects
None yet
Development

No branches or pull requests