Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http: PROTOCOL_ERROR with HTTP2 #29125

Closed
siscia opened this issue Dec 6, 2018 · 25 comments
Closed

x/net/http: PROTOCOL_ERROR with HTTP2 #29125

siscia opened this issue Dec 6, 2018 · 25 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@siscia
Copy link

siscia commented Dec 6, 2018

What version of Go are you using (go version)?

$ go version
go version go1.11.2 linux/amd64

Does this issue reproduce with the latest release?

Yes, which I believe is the one I am running, hence 1.11.2

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/simo/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/simo/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build210197689=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I am downloading tar files from the docker hub, uncompress them using the standard gzip, and passing the result to another software using STDIN.

If is necessary I will try to reproduce this using something simpler like sha256 sum in linux.

What did you expect to see?

I would not expect a PROTOCOL_ERROR

What did you see instead?

A PROTOCOL_ERROR

Below the relevant log using GODEBUG=http2debug=2


INFO[0177] Deleting temporary directory                  action=ingesting target=/tmp/tempCatalog140887158
INFO[0177] Created subcatalog in directory               directory=.layers/bf
INFO[0177] [cvmfs_server ingest --catalog -t - -b .layers/bf/bfb904e99f247ef4b354eab1285ec78bc5bb929dbf134ff69c156b90711216f6/layerfs portals.cern.ch]  action=executing
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=21 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=23 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=19 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=27 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:53:59 http2: Transport received RST_STREAM stream=13 len=4 ErrCode=PROTOCOL_ERROR
INFO[0180] Process alive
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192

   ... this line repeat for exactly another 450 times, I cut it...

2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=18886
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16393
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=17397
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=12782
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=14630
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8197
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8201
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=16389
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
2018/12/06 18:54:02 http2: Framer 0xc0004ba1c0: wrote WINDOW_UPDATE stream=13 len=4 incr=8192
ERRO[0180] Error in copying the input into STDIN.        error="stream error: stream ID 13; PROTOCOL_ERROR"
INFO[0210] Process alive
2018/12/06 18:54:36 http2: Framer 0xc0004ba1c0: read RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
2018/12/06 18:54:36 http2: Transport received RST_STREAM stream=17 len=4 ErrCode=PROTOCOL_ERROR
INFO[0240] Process alive
INFO[0270] Process alive
INFO[0300] Process alive
2018/12/06 18:56:06 http2: Transport closing idle conn 0xc00047ddc0 (forSingleUse=false, maxStream=27)
2018/12/06 18:56:06 http2: Transport readFrame error on conn 0xc00047ddc0: (*net.OpError) read tcp 128.141.207.159:35866->104.18.123.25:443: use of closed network connection

Maybe is related to #26321 but I don't know how to tell.

@siscia
Copy link
Author

siscia commented Dec 6, 2018

Let me add that the problem arises seldom, so some downloads go just fine, while other present this problem.

The overall workflow is to get the tar.gz from the docker hub, pass the byte stream to "compress/gzip" (so something like gzip.NewReader(resp.Body) ) and finally pass the reader created by gzip to the STDIN of another software (using "os.exec" package).

Moreover, if instead of passing the result of the uncompression to the STDIN of another software I just save it on file I cannot reproduce the problem.

@agnivade agnivade changed the title PROTOCOL_ERROR with HTT2 x/net/http: PROTOCOL_ERROR with HTTP2 Dec 7, 2018
@gopherbot gopherbot added this to the Unreleased milestone Dec 7, 2018
@agnivade
Copy link
Contributor

agnivade commented Dec 7, 2018

Hi @siscia - can you post an independent code sample that we can use to reproduce this issue ?

@siscia
Copy link
Author

siscia commented Dec 7, 2018

Hi,

I was not able to re-produce without using cvmfs_server, the custom software that read from STDIN.

Maybe is a problem from our side but I don't know what it could be...

Should I make a small code sample that use cvmfs_server and reproduce the problem or I should just close the issue?

@agnivade
Copy link
Contributor

agnivade commented Dec 7, 2018

/cc @bradfitz

@bradfitz
Copy link
Contributor

@fraenkel, so you see anything wrong here?

@siscia, which HTTP/2 server is cvmfs_server using?

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 13, 2018
@fraenkel
Copy link
Contributor

So I am trying to find all the places we just report a stream error without any cause. There aren't many.
I did notice when we receive a WindowUpdateFrame and the increment is 0.
The other is parseHeadersFrame where we go negative.

Adding a cause to all places that return an error would at least identify where its coming from.

@bradfitz
Copy link
Contributor

@fraenkel I thought this was some mystery server sending us a PROTOCOL_ERROR because it didn't like our Transport's HTTP/2 behavior.

@siscia
Copy link
Author

siscia commented Dec 13, 2018

@bradfitz I see them against the docker registries, in particular I can see them against the standard registries hosted by docker hub itself.

I believe they are hosted on AWS.

Just to clarify, the Docker Registries servers are sending data, a go program is piping those data into the STDIN of another program, the cvmfs_server.

@fraenkel
Copy link
Contributor

@bradfitz Sorry I missed the RST_STREAM at the top. I find it odd that all we see are just WINDOW_UPDATE. The fact that "all" or many streams hit this at the same time could mean that we exceeded the connection window? I am guessing only because we are missing information.
What might be useful is to compare a successful run vs an unsuccessful run and see where the messages differ. It might just be in the exchange of WINDOW_UPDATEs but its just a guess.

@siscia
Copy link
Author

siscia commented Dec 22, 2018

Do you guys need any more info from my side?

@harshavardhana
Copy link
Contributor

It may be same as #31534 ?

@siscia
Copy link
Author

siscia commented May 7, 2019

It may it may not, I worked around the issue and I can't recal the problem!

Feel free to close though!

@harshavardhana
Copy link
Contributor

How did your workaround, if I may ask @siscia ?

@siscia
Copy link
Author

siscia commented May 7, 2019

I don't remember honestly, two options:

  1. Just download the file/layers using directly docker instead of an HTTP calls
  2. Swallow the error and just re-trying

I don't remember which option, but nothing technically interesting I am afraid :(

mvo5 added a commit to mvo5/snappy that referenced this issue Aug 12, 2019
We see some test failures (and probably real-world failures as
well) in the downloads with something like:

```
+ snap install test-snapd-content-circular1
error: cannot perform the following tasks:
- Download snap "test-snapd-content-circular2" (2) from channel "stable" (stream error: stream ID 1; PROTOCOL_ERROR)
```

Looking at the logs it seems that we get the protocol error from
fastly and that it is sent to us from the remote side:
```
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Transport failed to get client conn for fastly.cdn.snapcraft.io:443: http2: no cached connection was available
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Transport creating client conn 0xc4200baa80 to 151.101.250.217:443
...
c4204fbce0: wrote SETTINGS len=18, settings: ENABLE_PUSH=0, INITIAL_WINDOW_SIZE=4194304, MAX_HEADER_LIST_SIZE=10485760
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote WINDOW_UPDATE len=4 (conn) incr=1073741824
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote HEADERS flags=END_STREAM|END_HEADERS stream=1 len=869
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read SETTINGS len=6, settings: MAX_CONCURRENT_STREAMS=100
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote SETTINGS flags=ACK len=0
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read WINDOW_UPDATE len=4 (conn) incr=16711681
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read SETTINGS flags=ACK len=0
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read HEADERS flags=END_HEADERS stream=1 len=378
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: read RST_STREAM stream=1 len=4 ErrCode=PROTOCOL_ERROR
Aug 09 20:03:16 aug091949-909109 snapd[7255]: 2019/08/09 20:03:16 http2: Framer 0xc4204fbce0: wrote RST_STREAM stream=1 len=4 ErrCode=CANCEL
...
```
There is a bugreport upstream that looks a bit like our issue:
golang/go#29125

and there retrying seemed to have solved the issue. We should
dig a bit deeper and see if we can build a small reproducer
using fastly so that we can figure out if go2 is doing something
that fastly is not expecting or if fastly is the issue. But in
the meantime this hopefully unblocks us.
@stan-sack
Copy link

I am seeing something similar to this. On the client side I see stream error: stream ID XXXXXX; PROTOCOL_ERROR and on the server I see stream error: stream ID XXXXXX; NO_ERROR when I try to read the request body with bodyBytes, err := ioutil.ReadAll(r.Body).

I've been able to get around this by using httpClient.Transport.RoundTrip(req) rather than httpClient.Do(req). This seems to reduce occurrences of the issue to basically zero but it also breaks all the timeouts I have set on the client and transport.

FYI I'm using a non tls http2 client:

	httpClient := &http.Client{
		Transport: &http2.Transport{
			AllowHTTP: true,
			DialTLS: func(netw, addr string, cfg *tls.Config) (net.Conn, error) {
				return dialer.Dial(netw, addr)
			},
		},
		Timeout: time.Duration(time.Second * 5),
	}

I'm unable to provide http2 logs because I only see the issue at scale.

@philippgille
Copy link

Just in case anyone else finds this GitHub issue after making the same mistake as me:

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

stream error: stream ID 1; PROTOCOL_ERROR

I would love to have amore descriptive error message. Shouldn't it be possible to check against a list of valid (according to specification) HTTP verbs and then say the passed one is not one of them?

Anyway, I hope this helps someone.

@fraenkel
Copy link
Contributor

Using http.NewRequest("get", url, nil) instead of http.NewRequest("GET", url, nil) leads to the exact same error message:

There is http.MethodGet so you don't have to think about it.

@batara666

This comment was marked as resolved.

@davecheney

This comment was marked as resolved.

@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 26, 2022
@gopherbot
Copy link
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@RPGillespie6
Copy link

RPGillespie6 commented Sep 6, 2023

Currently just a "me too", but I'm running into a problem very similar to @stan-sack where sometimes file uploads are terminated when trying to read request body with stream error: stream ID XXXXXX; NO_ERROR. This happens pretty reliably (though still intermittently) with the following setup:

.NET Framework HttpClient (HTTP/1.1) performing a multipart/form-data upload to a GCP load balancer over HTTPS with a H2C Cloud Run backend. This is with go 1.20.
image

I captured some additional debug details with GODEBUG=http2debug=2 as well:

DEFAULT 2023-09-06T17:24:06.628757Z echo: http2: server read frame RST_STREAM stream=167 len=4 ErrCode=NO_ERROR
ERROR 2023-09-06T17:24:06.628731486Z stream error: stream ID 167; NO_ERROR
DEFAULT 2023-09-06T17:24:06.628730Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000139a40: read RST_STREAM stream=167 len=4 ErrCode=NO_ERROR
DEFAULT 2023-09-06T17:24:06.614934Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.614844Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.614826Z echo: http2: server read frame DATA stream=97 len=4096 data="\x89?\x8a^\xdc\xdb\xf8\xfbrݨ\xffr\xb2\xedN\xc1\xe3Q\xc4\xcaԼ\xc7\xfdE\x95\xc2\u009a\xce^+\xc0\x1ac\x94p\xa4Ʉ\xed\xddW\"\xday&<\x13\xc9XjP\xeeӡ\xa7\xbf\rr\xd5\x10\xec\xfd\xd3\xc8\xe5o\x15\xfbU|\x1ev\xa59\xe0\xa3\x169#R\xcdE\xb9\x01\xdc\x13\xa6=\xb3\x14\xcc[\x00^\xcc\xe7\xb7\x12\xc0m\x9c\xbb\x188\x97'\xcb\x05\x1f\x04\x97\xdc-\xb7\x0ezt{x\x99\xfbg\xe4\x00\x7f\x85\xc0k;$\xeb2`\xb7\x13\xd0y<p\xfb\x16\x18\xff\x18\x98v/\xa3܉\xc4\xfa\x18#HCF1\x82\xda\xf5V {\x11\xbf\xb5\x91iA3\xde\xc7\x18\x18=rF\xdc\xfd\x85\xef%\x1cYQ`-w\x99\x03\xb4\x04J\xb4\x94\xe3\v\x82\xdcWeus\x13.d\x96F'Yg\xe8N\xa8J\x98m\xf7\x18\xd1\xc4;\x18u\r\x16\x00\xf1?\x03L\x1b\xb4\xd3\x12`d\xd0WP\x9b1<ײ\x00+\x8c\"0\xd2\xd8" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.614805Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x89?\x8a^\xdc\xdb\xf8\xfbrݨ\xffr\xb2\xedN\xc1\xe3Q\xc4\xcaԼ\xc7\xfdE\x95\xc2\u009a\xce^+\xc0\x1ac\x94p\xa4Ʉ\xed\xddW\"\xday&<\x13\xc9XjP\xeeӡ\xa7\xbf\rr\xd5\x10\xec\xfd\xd3\xc8\xe5o\x15\xfbU|\x1ev\xa59\xe0\xa3\x169#R\xcdE\xb9\x01\xdc\x13\xa6=\xb3\x14\xcc[\x00^\xcc\xe7\xb7\x12\xc0m\x9c\xbb\x188\x97'\xcb\x05\x1f\x04\x97\xdc-\xb7\x0ezt{x\x99\xfbg\xe4\x00\x7f\x85\xc0k;$\xeb2`\xb7\x13\xd0y<p\xfb\x16\x18\xff\x18\x98v/\xa3܉\xc4\xfa\x18#HCF1\x82\xda\xf5V {\x11\xbf\xb5\x91iA3\xde\xc7\x18\x18=rF\xdc\xfd\x85\xef%\x1cYQ`-w\x99\x03\xb4\x04J\xb4\x94\xe3\v\x82\xdcWeus\x13.d\x96F'Yg\xe8N\xa8J\x98m\xf7\x18\xd1\xc4;\x18u\r\x16\x00\xf1?\x03L\x1b\xb4\xd3\x12`d\xd0WP\x9b1<ײ\x00+\x8c\"0\xd2\xd8" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.599709Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.599669Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.599662Z echo: http2: server read frame DATA stream=97 len=4096 data="\x06\xb0\xab\xef\xebf0\xf1\r\xfcq-\xb0\xe9\xbf\x00\x1e\x8d\xa7\xbdֶ\xa9`\x91\xda\xc2Ō4\x91q\x1c\x7f\n\x1e\xb7\x03\xa8\xfd\xf5\t\x14#\x8e\xc6p?\xf83\x8e\x9e&\xc2q\xe7\x80c\xef\x90\xcdw7\xfdm\x03\xf7\xb7AW\x0eD\xe7\xd7\xee\xdf\t\xec\xcf!\xfa\xa7I\xcas`\x9d:\xf8HZ\xef\x80G^\xcfH\xab\xc0{\xf9\xf7B$\x91\xc3\xc8\a\xf8\x1a\x8b\xfe:\xdam\xa6\xff\xd9\xf4\x7f\x15\U000fd279\\\xc7\xcaL\xe0{\nw\xadF\xbb\xc8\fj\a>\xc4L\xc71\xfahV@\xc9}\x0e\xf4A,\x9f\xcey)\xdfx\xda>\xce\xfa\x9aU<\v\xfa2\x91g_\t\x95Q6\xb8\x96\xa3M<\xe7\x14\xeeU\xb7_\xc5U\xb5&\xb5\x01c\x01Ղ\xacD\xa65\xe5r\xfa3\xfeI\xeb\x80~E5\x80\xab\x19Qj\xab=p\x1c\xf7Nf4\xed\x13\xff@\u07b8\x9c>'3\xb69M\xff\xa8\xbe!\xfe\x93\xd8\x1dhx\x97a\xdduW%" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.599646Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x06\xb0\xab\xef\xebf0\xf1\r\xfcq-\xb0\xe9\xbf\x00\x1e\x8d\xa7\xbdֶ\xa9`\x91\xda\xc2Ō4\x91q\x1c\x7f\n\x1e\xb7\x03\xa8\xfd\xf5\t\x14#\x8e\xc6p?\xf83\x8e\x9e&\xc2q\xe7\x80c\xef\x90\xcdw7\xfdm\x03\xf7\xb7AW\x0eD\xe7\xd7\xee\xdf\t\xec\xcf!\xfa\xa7I\xcas`\x9d:\xf8HZ\xef\x80G^\xcfH\xab\xc0{\xf9\xf7B$\x91\xc3\xc8\a\xf8\x1a\x8b\xfe:\xdam\xa6\xff\xd9\xf4\x7f\x15\U000fd279\\\xc7\xcaL\xe0{\nw\xadF\xbb\xc8\fj\a>\xc4L\xc71\xfahV@\xc9}\x0e\xf4A,\x9f\xcey)\xdfx\xda>\xce\xfa\x9aU<\v\xfa2\x91g_\t\x95Q6\xb8\x96\xa3M<\xe7\x14\xeeU\xb7_\xc5U\xb5&\xb5\x01c\x01Ղ\xacD\xa65\xe5r\xfa3\xfeI\xeb\x80~E5\x80\xab\x19Qj\xab=p\x1c\xf7Nf4\xed\x13\xff@\u07b8\x9c>'3\xb69M\xff\xa8\xbe!\xfe\x93\xd8\x1dhx\x97a\xdduW%" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.593946Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.593942Z echo: http2: server read frame DATA stream=97 len=4096 data="\xc2Q{J\x13<Or\xef)\xb7]\xa8\xd6r՞\xb9\xb6\x04}t\x0e\xec\xdc\x01\xe8\xfe\xa9\xf8\xe5Z\xa0 \xf5$\xcf\xfc\xdd\x1d\x8e\xe8<\xfb\xc1\xb5>~[\x8d<vO\x9f\xfd\xe0<6\f\xb1\x17\x18\x8e\x9c\xb1~\xb65\x1e\xae\xb4p߅\x14\x18\x0e\xbe\xcbɝ\xab\x95\xe8\x8dg\u0093\xfc\xfc\xe0\xf70\x03}\x0e\xe6\xfe\xe7\xa1w\xf8]\xf6\r\xa0\xfe\x17\xa1\xa7\xab\x87\xef\x1d>|\xee3s\x1b0\xa7\xfd\x18V\xcb߇\xd1\xeb=\x90\xd4k!ݗC\x7f.\xc5\xc8\xf7\xea\xd1x\xfa\x1ax\xbd\x17c\x94{\t4\xe3BH\xee\x8d\xd0\xce\x0fC\xa2\x1f\xdf\\7\xf7\x19h\xcd+1\xea\x9f\x0f\x9f\xf6\xe1\xd8\xf3\xbex\xf8dx\xf67\xa1\xf4\xa5Ѝ\xcfa\x1c|=t\x90\xb3\xf0\x9b0\xd6_\x0f<\xaf\xc1\xf3V\xe8\xc2_`\xb4\xbe\x19\xda\xc0\x93z\x9f\x03\xbe\xdf\xc1\xbe\xfd=pr\xff\xf3\x98\x03\xbc\x1d\xda\xf2E\x8c\xe9\xdb0\xfa\xf3]_\x9e\x06\xe4" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.593941Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.593919Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\xc2Q{J\x13<Or\xef)\xb7]\xa8\xd6r՞\xb9\xb6\x04}t\x0e\xec\xdc\x01\xe8\xfe\xa9\xf8\xe5Z\xa0 \xf5$\xcf\xfc\xdd\x1d\x8e\xe8<\xfb\xc1\xb5>~[\x8d<vO\x9f\xfd\xe0<6\f\xb1\x17\x18\x8e\x9c\xb1~\xb65\x1e\xae\xb4p߅\x14\x18\x0e\xbe\xcbɝ\xab\x95\xe8\x8dg\u0093\xfc\xfc\xe0\xf70\x03}\x0e\xe6\xfe\xe7\xa1w\xf8]\xf6\r\xa0\xfe\x17\xa1\xa7\xab\x87\xef\x1d>|\xee3s\x1b0\xa7\xfd\x18V\xcb߇\xd1\xeb=\x90\xd4k!ݗC\x7f.\xc5\xc8\xf7\xea\xd1x\xfa\x1ax\xbd\x17c\x94{\t4\xe3BH\xee\x8d\xd0\xce\x0fC\xa2\x1f\xdf\\7\xf7\x19h\xcd+1\xea\x9f\x0f\x9f\xf6\xe1\xd8\xf3\xbex\xf8dx\xf67\xa1\xf4\xa5Ѝ\xcfa\x1c|=t\x90\xb3\xf0\x9b0\xd6_\x0f<\xaf\xc1\xf3V\xe8\xc2_`\xb4\xbe\x19\xda\xc0\x93z\x9f\x03\xbe\xdf\xc1\xbe\xfd=pr\xff\xf3\x98\x03\xbc\x1d\xda\xf2E\x8c\xe9\xdb0\xfa\xf3]_\x9e\x06\xe4" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.592901Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.592895Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.592888Z echo: http2: server read frame DATA stream=97 len=4096 data="\x0f\xfemp\xe7\xb9\a7\x7f\xd9|\xb6y'<\xff\xdfl~\xbdy7F\xe0w@\v\xff\xb2\xe1\xda\xff\xb7\x9a߆\xae~\bZ\xfaq\xccӛf\x06\xa3\xf8\xfd0ҟ\x84?\xee\x00l\x82\x8f@_\xffl\xd8\x03\xbe\xfd\xbf\x0f\x9a\xbf\x17~\xff\xfe\x91_\xb0\x15\xf3\x82U\xd0\xff]X\xf1\xfb\x1f\xcd\a\x9bOb.\xf1\x0f\xd8U\xfc.\xe0\xfdl\xf3\x16\x8c\xfa\xaf\xc3\xef\x1b\x9a\x977\xafh.m.h^<\x9a\x11\xbc\x1ew\b}\n\xeb\x82?=\xb7j\xf8t\xf8\xf5\xfbFc?\xe7\xdb\xfcv\x95\xf1\x95\xb7\x06p\xe5\xebA\xa3\xd9\xc1)\xb0\x93\xe7\xa3E?\x04\x8d\x7f%\xf67\xdf\f\xbe}\x14\xf3\xa4O\xe3,\xe1[\x11\xfa\x02\xe6\n\xb7`\xb6\xf4xX\x82\x1f\x04\x1c\xbf\x11\xf1P\xf0\x83g\xbf\x88\xcf4h+8f^3o\x8cW\xff\xf1\xbd\x1e\xf3\x12\xa4g&\x01\xb6\xde\xcf\xf2ˀ\xcfW\xff\xc9\x7f\xf6\xab\xa4\xc2\xf0\xd0:\xab\xbc\xd7(\xad\xd7" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.592868Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="\x0f\xfemp\xe7\xb9\a7\x7f\xd9|\xb6y'<\xff\xdfl~\xbdy7F\xe0w@\v\xff\xb2\xe1\xda\xff\xb7\x9a߆\xae~\bZ\xfaq\xccӛf\x06\xa3\xf8\xfd0ҟ\x84?\xee\x00l\x82\x8f@_\xffl\xd8\x03\xbe\xfd\xbf\x0f\x9a\xbf\x17~\xff\xfe\x91_\xb0\x15\xf3\x82U\xd0\xff]X\xf1\xfb\x1f\xcd\a\x9bOb.\xf1\x0f\xd8U\xfc.\xe0\xfdl\xf3\x16\x8c\xfa\xaf\xc3\xef\x1b\x9a\x977\xafh.m.h^<\x9a\x11\xbc\x1ew\b}\n\xeb\x82?=\xb7j\xf8t\xf8\xf5\xfbFc?\xe7\xdb\xfcv\x95\xf1\x95\xb7\x06p\xe5\xebA\xa3\xd9\xc1)\xb0\x93\xe7\xa3E?\x04\x8d\x7f%\xf67\xdf\f\xbe}\x14\xf3\xa4O\xe3,\xe1[\x11\xfa\x02\xe6\n\xb7`\xb6\xf4xX\x82\x1f\x04\x1c\xbf\x11\xf1P\xf0\x83g\xbf\x88\xcf4h+8f^3o\x8cW\xff\xf1\xbd\x1e\xf3\x12\xa4g&\x01\xb6\xde\xcf\xf2ˀ\xcfW\xff\xc9\x7f\xf6\xab\xa4\xc2\xf0\xd0:\xab\xbc\xd7(\xad\xd7" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.591132Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE stream=97 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.591091Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000337b20: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.591072Z echo: http2: server read frame DATA stream=97 len=4096 data="Ӧ\xd8:\xb3\x82\xe2.ybV\x800\xd6[\xe2\x12\x9fҰm#\x8b\xe8\xe70\"\x8c\xee>a\x9aNGʷ\xe6.Z\x84\xed\xfe\x9a\x96\x9f\xd3\x17<\xcfc\x96=\xe3\x14\tʞ\xb4}\x16\xcak\xb5Z\x1d\xa6\x1e\xcb\xfeo\x98\x9f\x01d\f\x8cm\x86\xe6\x9f<\xea\xf9\r\xd0\x05\x9e_]\x8b\x99X\xd7w\xd0\xda号\x94\xbd\xf0+\x0f\xc0\a؍\xb6RGi\xf38֚T\xe4\xd6p\x8d@gߔ\xc3u\x12\x9d\xf2e\x9av\x1d]\xe7$y\x92#B\xb9\x85\xcf}\x9e\xbd\xa7\xe8]\xa8\xbel[\x98\xcaﶨ6ڌ\xbdH\xe3\x9f\xedD\xd2V\xb0\xd7M\xf39RZ.ϖ\xc83\xe0;ϖ*Y\xb3\xb5/\xce98{ \xf65\xe0\xd0\x1e\xac\xef\xbf\x0f\xef\xec>\r3\xf0\xc7\xc0C\xfc\x01h%W\x8f\xb6b\x8e\xbe\x1d\xfb\x7f\xff>\xf8\x87\xe17\xe7>\x83\x95\xfd\x953\xf7\x98\xb97\xc6\xf5\xbbC\xb77@\xbb\xf7\xe1\xf7\xfe" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.591031Z 2023/09/06 17:24:06 frame.go:520: http2: Framer 0xc000337b20: read DATA stream=97 len=4096 data="Ӧ\xd8:\xb3\x82\xe2.ybV\x800\xd6[\xe2\x12\x9fҰm#\x8b\xe8\xe70\"\x8c\xee>a\x9aNGʷ\xe6.Z\x84\xed\xfe\x9a\x96\x9f\xd3\x17<\xcfc\x96=\xe3\x14\tʞ\xb4}\x16\xcak\xb5Z\x1d\xa6\x1e\xcb\xfeo\x98\x9f\x01d\f\x8cm\x86\xe6\x9f<\xea\xf9\r\xd0\x05\x9e_]\x8b\x99X\xd7w\xd0\xda号\x94\xbd\xf0+\x0f\xc0\a؍\xb6RGi\xf38֚T\xe4\xd6p\x8d@gߔ\xc3u\x12\x9d\xf2e\x9av\x1d]\xe7$y\x92#B\xb9\x85\xcf}\x9e\xbd\xa7\xe8]\xa8\xbel[\x98\xcaﶨ6ڌ\xbdH\xe3\x9f\xedD\xd2V\xb0\xd7M\xf39RZ.ϖ\xc83\xe0;ϖ*Y\xb3\xb5/\xce98{ \xf65\xe0\xd0\x1e\xac\xef\xbf\x0f\xef\xec>\r3\xf0\xc7\xc0C\xfc\x01h%W\x8f\xb6b\x8e\xbe\x1d\xfb\x7f\xff>\xf8\x87\xe17\xe7>\x83\x95\xfd\x953\xf7\x98\xb97\xc6\xf5\xbbC\xb77@\xbb\xf7\xe1\xf7\xfe" (3840 bytes omitted)
DEFAULT 2023-09-06T17:24:06.583750Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE stream=213 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.583744Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE stream=213 len=4 incr=4096
DEFAULT 2023-09-06T17:24:06.583737Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE len=4 (conn) incr=4096
DEFAULT 2023-09-06T17:24:06.583693Z 2023/09/06 17:24:06 frame.go:393: http2: Framer 0xc000336540: wrote WINDOW_UPDATE len=4 (conn) incr=4096

This is the error I see on the client side when it happens:

2023-09-06T17:55:44.892Z ERROR   upstream max stream duration reached
2023-09-06T17:55:44.892Z ERROR   Result Fetcher: RunModelJobs exception:Error parsing undefined value. Path '', line 1, position 1.
2023-09-06T17:55:44.923Z ERROR      at Newtonsoft.Json.JsonTextReader.ParseUndefined()
   at Newtonsoft.Json.JsonTextReader.ParseValue()
   at Newtonsoft.Json.JsonReader.ReadForType(JsonContract contract, Boolean hasConverter)
   at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.Deserialize(JsonReader reader, Type objectType, Boolean checkAdditionalContent)
   at Newtonsoft.Json.JsonSerializer.DeserializeInternal(JsonReader reader, Type objectType)
   at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
   at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
   at Product.Client.ProductClientUtils.ThrowIfNotSuccessful(HttpResponseMessage response, ILogger logger) in C:\dev\cc\Product.Client\ProductClientUtils.cs:line 19
   at Product.Client.ProductClientV1.ProductClientV1.<UploadImageChunked>d__43.MoveNext() in C:\dev\cc\Product.Client\ProductClientV1\ProductClientV1.cs:line 310
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<UploadImageData>d__12.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 118
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<UploadImageDataWrapper>d__14.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 149
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<SubmitJobsWrapper>d__16.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 208
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Product.Services.ProductResultFetcher2.<RunModelJobs>d__29.MoveNext() in C:\dev\cc\Product.Common\Services\ProductResultFetcher2.cs:line 554
2023-09-06T17:59:59.566Z INFO    Uploading Imageset (chunked)...
2023-09-06T18:00:09.030Z INFO    Uploading Imageset (chunked)...
2023-09-06T18:00:46.095Z ERROR   upstream max stream duration reached
2023-09-06T18:00:46.095Z ERROR   Result Fetcher: RunModelJobs exception:Error parsing undefined value. Path '', line 1, position 1.
2023-09-06T18:00:46.096Z ERROR      at Newtonsoft.Json.JsonTextReader.ParseUndefined()
   at Newtonsoft.Json.JsonTextReader.ParseValue()

@RPGillespie6
Copy link

RPGillespie6 commented Sep 6, 2023

upstream max stream duration reached leads me to believe some GCP component is resetting the stream? Maybe Envoy, somehow? This may not be go related, but rather GCP related.

@stan-sack, was your issue happening in GCP and/or Envoy proxy?

@stan-sack
Copy link

I was calling non-TLS HTTP/2 (h2c) between two internal services via envoy client load balancing. It was in AWS and there was no actual load balancer.

@RPGillespie6
Copy link

Seems suspicious though. In my case, envoy (inside GCP load balancer) is also calling non-TLS HTTP/2 (h2c)

@stan-sack
Copy link

This was a long time ago. I dont work at the company where I had this problem anymore.

If I was still trying to solve I'd start playing with TCP keepalive and connection pooling in both go and envoy. I didnt have very much experience with envoy at that time but in hindsight it seems like it could be the source of the problem.

@golang golang locked and limited conversation to collaborators Sep 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests