store-gateway: read series from bucket with io.Reader #4926
Conversation
Marco suggested using a buffered reader in place of the regular reader. I tried buffer sizes of 32Ki, 64Ki and 128Ki, and 32Ki showed the best performance in the benchmarks we have, so I'm using a buffer of that size. I also pooled the buffered reader, which showed better results in some smaller benchmarks (see the sketch after this list):
base vs 32Ki
32Ki vs 64Ki
32Ki vs 128Ki
32Ki unpooled vs pooled
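For reference, here is a minimal sketch of the pooling idea, assuming a 32KiB bufio.Reader reused via sync.Pool; the names and exact wiring are illustrative, not the PR's actual code:

```go
package sketch

import (
	"bufio"
	"io"
	"sync"
)

// 32KiB benchmarked best among the 32Ki/64Ki/128Ki sizes tried above.
const bufferedReaderSize = 32 * 1024

// Pooling the buffered reader avoids re-allocating its 32KiB buffer on
// every series fetch.
var bufferedReaderPool = sync.Pool{
	New: func() any { return bufio.NewReaderSize(nil, bufferedReaderSize) },
}

func getBufferedReader(r io.Reader) *bufio.Reader {
	br := bufferedReaderPool.Get().(*bufio.Reader)
	br.Reset(r)
	return br
}

func putBufferedReader(br *bufio.Reader) {
	br.Reset(nil) // don't retain the underlying reader
	bufferedReaderPool.Put(br)
}
```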
I had to remove series pooling because of a race condition with the index cache (which is also not detected by the race detector in tests 😬). With that, the final benchmarks are below. Allocated bytes are down, but allocated objects are up.
base vs 32Ki pooled buffer
One idea to reduce allocated objects is to still use a slab pool but never release it, letting the GC collect it after the cache is done with it (a sketch of that idea follows).
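A rough sketch of that idea (hypothetical names, not the PR's code): carve small slices out of larger slabs and never release the slabs, so slices handed to the index cache stay valid and the GC reclaims the slabs once nothing references them anymore.

```go
package sketch

// unreleasedSlabPool hands out small byte slices carved from larger slabs.
// There is no Release: once the request finishes, the slabs are simply left
// for the GC to collect after the index cache stops referencing them.
type unreleasedSlabPool struct {
	slabSize int
	current  []byte // remaining space in the newest slab
}

func newUnreleasedSlabPool(slabSize int) *unreleasedSlabPool {
	return &unreleasedSlabPool{slabSize: slabSize}
}

// Get returns a slice of length n carved from the current slab, allocating
// a fresh slab when the current one can't fit the request.
func (p *unreleasedSlabPool) Get(n int) []byte {
	if n > p.slabSize {
		return make([]byte, n) // oversized requests get a dedicated allocation
	}
	if len(p.current) < n {
		p.current = make([]byte, p.slabSize)
	}
	s := p.current[:n:n]
	p.current = p.current[n:]
	return s
}
```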
I did that and we're also back to the status quo in terms of allocated objects.
base vs with unreleased pool
I suspect the race isn't detected because the cache implementation used in tests is not async.
Good job! The logic LGTM. I left a few last comments, but none of them is a blocker.
offsetReader.Reset(start, reader)
defer offsetReader.Reset(0, nil) // don't retain the underlying reader

bytesPool := pool.NewSlabPool[byte](seriesBytesSlicePool, seriesBytesSlabSize)
I'm not sure we should back it with a sync.Pool. Why not just pass an implementation of pool.Interface which returns nil from Get() and does nothing in Put()? It's very similar to pool.NoopBytes. We could just call it pool.NilBytes.
I called it NoopPool, since it doesn't have to be only bytes; I experimented with doing this for labels too.
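For illustration, a minimal sketch of that shape, assuming a generic Get/Put pool interface roughly like the one below (Mimir's actual pool.Interface may differ):

```go
package sketch

// Interface is assumed here to be a simple slice-pool contract.
type Interface[T any] interface {
	Get(size int) []T
	Put(s []T)
}

// NoopPool never reuses memory: Get returns nil so callers allocate
// normally, and Put discards the slice, leaving cleanup to the GC. This is
// what lets the slab pool be backed by nothing instead of a sync.Pool.
type NoopPool[T any] struct{}

func (NoopPool[T]) Get(int) []T { return nil }
func (NoopPool[T]) Put([]T)     {}

// Compile-time check that NoopPool satisfies the assumed interface.
var _ Interface[byte] = NoopPool[byte]{}
```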
// We don't track refetches in a lot of detail because they are rare
return
I'm not sure about this. Why not track the other stats for refetches too?
I think recording time spent fetching is incorrect if we record it twice, but the rest make sense (fetched series & fetched series size)
LGTM, thanks!
Context (partitioners)
When planning requests to the object storage, the store-gateway uses a partitioner. The partitioner can extend the fetched ranges.
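For context, a hedged sketch of what a gap-based partitioner does (illustrative only; the store-gateway's actual partitioner has more logic than this): nearby byte ranges are merged into a single larger fetch, so the fetched range can cover bytes that no requested series actually needs.

```go
package main

import "fmt"

type byteRange struct{ start, end int64 }

// partition merges sorted ranges whose gap is at most maxGap, extending the
// fetched range to also cover the bytes in between.
func partition(ranges []byteRange, maxGap int64) []byteRange {
	var out []byteRange
	for _, r := range ranges {
		if n := len(out); n > 0 && r.start-out[n-1].end <= maxGap {
			out[n-1].end = r.end // extend the previous fetch
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	// Two 64K series reads far apart get merged into one bigger fetch.
	fmt.Println(partition([]byteRange{{0, 65536}, {524288, 589824}}, 512*1024))
	// Output: [{0 589824}]
}
```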
Problem
When we read a byte range for series, we store the whole range in a buffer. This may result in increased allocations, because the whole range can be up to 8x the size of the actual series (our series size estimation at present is 64K, and the partitioner can extend ranges by up to 512K; 512/64 = 8). On top of that, the 64K estimation is a very pessimistic one in order to prevent refetches.
This means that in the average case of 512B per series we may fetch 640 times more data than we need for small requests ((2*64K + 512K)/(2*512) = 640). We keep all of this in buffers, which increases GC pressure and CPU usage.
Also see #4593 (comment)
What this PR does
Uses an io.Reader to read series data from the bucket, so we never hold everything in a buffer. It also introduces pooling for these bytes.
This PR introduces a reader uvarintSequenceReader which can keep track of its offset and can read a uvarint from an io.Reader (a sketch is included after the checklist below). I plan to introduce this reader in loadChunks as well, to simplify the code there a bit.
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
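For illustration, here is a hedged sketch of such a reader (field and method names are assumptions and may differ from the PR's exact code): it implements io.ByteReader so encoding/binary can decode uvarints straight from the stream, while tracking how many bytes have been consumed.

```go
package sketch

import (
	"encoding/binary"
	"io"
)

// uvarintSequenceReader reads uvarints from an io.Reader without buffering
// the whole byte range, tracking how many bytes have been consumed so far.
type uvarintSequenceReader struct {
	r      io.Reader
	offset int64   // bytes consumed from r so far
	one    [1]byte // scratch space for single-byte reads
}

// ReadByte implements io.ByteReader so binary.ReadUvarint can be used.
func (u *uvarintSequenceReader) ReadByte() (byte, error) {
	if _, err := io.ReadFull(u.r, u.one[:]); err != nil {
		return 0, err
	}
	u.offset++
	return u.one[0], nil
}

// ReadUvarint reads a single uvarint and advances the tracked offset.
func (u *uvarintSequenceReader) ReadUvarint() (uint64, error) {
	return binary.ReadUvarint(u)
}
```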