
[Heartbeat] Fix excessive memory usage when parsing bodies #15639

Merged
merged 11 commits into elastic:master from andrewvc:fix-excessive-memory-usage
Jan 22, 2020

Conversation

andrewvc
Contributor

@andrewvc andrewvc commented Jan 17, 2020

In 7.5 we introduced a regression where we would allocate a 100MiB buffer per HTTP request. This change uses a growable byte buffer instead.

I'm actually a bit surprised this caused a noticeable increase in heap usage, since I'd have assumed that the GC would recover the 100MiB buffer almost instantaneously, but that's clearly incorrect.

In my testing of various configs I saw a dramatic drop, with memory not exceeding 50MiB where before it had been consistently above 500MiB.

@andrewvc andrewvc added bug Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team v7.6.0 labels Jan 17, 2020
@andrewvc andrewvc requested a review from a team as a code owner January 17, 2020 15:17
@andrewvc andrewvc self-assigned this Jan 17, 2020
@elasticmachine
Collaborator

Pinging @elastic/uptime (:uptime)

Comment on lines 128 to 131
```go
prefixBytes := prefixBuf.Bytes()
if utf8.Valid(prefixBytes) {
	prefix = string(prefixBytes)
}
```
Contributor

@shahzad31 shahzad31 Jan 17, 2020


Wouldn't it make more sense to validate UTF-8 in the loop, chunk by chunk? Even if one chunk is invalid, break the loop and discard it. That way it won't have to read the remaining body.

Contributor Author

I'm pretty sure that would break on multi-byte char boundaries. Thinking about it, even the current code is probably broken for multi-byte Unicode in bodies > 100MiB, where the 100MiB limit can fall in the middle of a char.

Looking into it, our unicode detection does nothing here, I've removed it in this PR to keep things simple. I've opened #15647 to follow-up on this.

```diff
-	prefixBuf := make([]byte, maxPrefixSize)
-	prefixRemainingBytes := maxPrefixSize
-	prefixWriteOffset := 0
+	var prefixBuf bytes.Buffer
```

Tip: consider strings.Builder. strings.Builder also implements io.Writer, so it works as an io.Copy destination. But when calling (bytes.Buffer).String() you create a copy, while (strings.Builder).String() returns the buffer as-is (no extra copy).

Contributor Author

Thanks! Switched to a strings.Builder

```diff
@@ -128,9 +123,5 @@ func readPrefixAndHash(body io.ReadCloser, maxPrefixSize int) (respSize int, pre
 	}
 }
```
@urso urso Jan 17, 2020

I wonder if the whole loop can be replaced using the io package only. e.g.

```go
hash := sha256.New()
var prefixBuf strings.Builder
n, err := io.Copy(&prefixBuf, io.TeeReader(io.LimitReader(body, maxPrefixSize), hash))
```

This reads up to maxPrefixSize bytes from the body and writes every single byte to hash and prefixBuf.

Contributor Author

Ooooooh, that's nice, will give it a shot.

Contributor Author

I'm actually wondering now, the only reason we do the prefix reading in the first place is to give something for the response checkers to read. The problem before was that if you used a JSON checker and a grep checker, they'd share the same IO, so one would use up the stream before the other could get to work.

Maybe we should just use the tee reader in the first place to split the stream between all the users, then we can always read the full body and not worry about reading some fixed quantity.

Contributor Author

So, I think that's a bit out of scope here. I'd like to keep the code as-is.

I tried to get things cleaner with the TeeReader, but that really requires us to have more of a limitWriter than a limitReader. The code sample you posted unfortunately only hashes the prefix, whereas the current code hashes the full body.


For the hash it should be enough to add another io.Copy(hash, body) if the first io.Copy did not reach EOF. It's still the same reader, and readers are stateful. The second copy should continue appending to the hash from the last known position. Like:

```go
hash := sha256.New()
var prefixBuf strings.Builder
n, err := io.Copy(&prefixBuf, io.TeeReader(io.LimitReader(body, maxPrefixSize), hash))
if err == nil {
	// finish streaming into hash if the body has not been fully consumed yet
	var m int64
	m, err = io.Copy(hash, body)
	n += m
}
```

Alternatively we could introduce a LimitedWriter and a TeeWriter (as you've already noticed). They don't exist in our code base, so we would need to implement these ourselves :(. Then the operation would become:

```go
io.Copy(TeeWriter(LimitedWriter(prefixBuf, ...), hash), body)
```

The idea is similar to my first sample code. Let's try to reduce code handling buffers, indices, offsets and size limits in loops by replacing them using more declarative (reusable?) reader/writer constructors.

Unfortunately parsers like JSON or regexp support are pull based. Meaning they require an io.Reader in order to "pull" new bytes. Constructing pipelines using a mix of pull and push (reader and writer) can still be somewhat tricky.

At least for JSON I have an alternative. In Beats we use go-structform for spooling to disk and in our outputs. Interestingly the JSON parser has a push based interface: https://github.com/elastic/go-structform/blob/master/json/parse.go#L154

Contributor Author

I've fixed the code to use io.Copy, much cleaner, thanks!

Thinking about it, I think I'm fine with the buffer approach for now. Chances are that users dealing with large files don't want to use either JSON or regexp matching. I think I'm fine waiting for that feature request to come in.

@andrewvc
Contributor Author

@urso if you're able to leave a final review giving this an LGTM (or not!) I think it's ready.

```go
// We discard the body if it is not valid UTF-8
if utf8.Valid(prefixBuf[:prefixWriteOffset]) {
	prefix = string(prefixBuf[:prefixWriteOffset])
if err != nil {
```

Change in semantics. The original code used to break if err == io.EOF.

This should fix it:

```go
if err != nil && err != io.EOF {
```

Contributor Author

Good catch, fixed and pushed. I think there are no other outstanding issues.

@andrewvc andrewvc merged commit 080dedb into elastic:master Jan 22, 2020
andrewvc added a commit to andrewvc/beats that referenced this pull request Jan 22, 2020
…5639)

* [Heartbeat] Fix excessive memory usage when parsing bodies

In 7.5 we introduced a regression where we would allocate a 100MiB
buffer per HTTP request. This change fixes that to stream data instead.

(cherry picked from commit 080dedb)
andrewvc added a commit to andrewvc/beats that referenced this pull request Jan 22, 2020
…5639)

andrewvc added a commit that referenced this pull request Jan 23, 2020
…15763)

andrewvc added a commit that referenced this pull request Jan 23, 2020
…15762)

@andrewvc andrewvc deleted the fix-excessive-memory-usage branch January 23, 2020 13:34