Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit the maximum size of block uploads #4680

Merged
merged 8 commits into from
Apr 28, 2023

Conversation

andyasp
Copy link
Contributor

@andyasp andyasp commented Apr 6, 2023

What this PR does

Adds a limit for the maximum byte size for a block upload based off the block metadata. The sizes in the metadata are ensured to match the actual file sizes elsewhere already (during each file upload and during validation).

Which issue(s) this PR fixes or relates to

N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

WDYT about making this option advanced? I imagine users will hit the limit, google the error, find the runbook, and up the limit. Making it an advanced option IMO takes away some of the cognitive burden when reading the list of config options.


if blockSizeBytes > maxBlockSizeBytes || blockSizeBytes < 0 {
err := fmt.Errorf(maxBlockUploadSizeBytesFormat, maxBlockSizeBytes)
level.Error(logger).Log("err", err, "size", blockSizeBytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this error be also logged here?

level.Error(logger).Log("msg", "error while validating block", "err", err)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about making this option advanced?

All for it, thanks I hadn't thought about the category when originally writing this.

wouldn't this error be also logged here?

I'll look into this. The only reason I added this log line was to give visibility into the observed block size, but I didn't want to communicate that back to the user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this. The only reason I added this log line was to give visibility into the observed block size, but I didn't want to communicate that back to the user.

i wouldn't mind having a more verbose error for the user. I was more concerned about noisy logging. But not feeling strongly about this, so however you decide

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it seems unnecessary to hide computed block size from user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally defer to providing less information in errors to reduce adding surface area to an API. I thought logging on the server side was more useful because the rejections could be analyzed all together, but the information provided to the user is less relevant and possibly even confusing (the metadata size isn't counted, an overflow would return a negative number and may not be representative of the entire sum).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally defer to providing less information in errors to reduce adding surface area to an API. I thought logging on the server side was more useful because the rejections could be analyzed all together, but the information provided to the user is less relevant and possibly even confusing (the metadata size isn't counted, an overflow would return a negative number and may not be representative of the entire sum).

I don't mind if we report overflown value to the user (realistically that will never happen for "normal" blocks), and we can avoid confusion by better wording saying that it doesn't include block metadata.

The point is that we already log errors returned to user, so why bother logging it once more with extra details, if this detail is something user can figure out themselves and is not sensitive in any way.

(But this is a nit, we can keep it as is)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this some more. Rather than returning the size of the rejection maybe a histogram of block sizes proposed in start upload would be generally more useful. If it included rejected sizes as well it could provide insight on what the limit could be moved to without the burden of interpreting log lines/errors. Overflow might have to be excluded to avoid negatives though, but that seems fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than returning the size of the rejection maybe a histogram of block sizes proposed in start upload would be generally more useful.

Histogram will only give us aggregated values, which is fine, but we can also log all block sizes to have specific values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can move it back to how it originally was then without logging the err.

@andyasp andyasp marked this pull request as draft April 11, 2023 19:59
pkg/compactor/block_upload.go Show resolved Hide resolved
pkg/compactor/block_upload.go Outdated Show resolved Hide resolved

if blockSizeBytes > maxBlockSizeBytes || blockSizeBytes < 0 {
err := fmt.Errorf(maxBlockUploadSizeBytesFormat, maxBlockSizeBytes)
level.Error(logger).Log("err", err, "size", blockSizeBytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it seems unnecessary to hide computed block size from user.

@@ -40,6 +42,10 @@ const (
validationHeartbeatTimeout = 5 * time.Minute // Maximum duration of time to wait until a validation is able to be restarted
)

var maxBlockUploadSizeBytesFormat = globalerror.CompactorBlockUploadMaxBlockSizeBytes.MessageWithPerTenantLimitConfig(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No other error reported during block upload is currently using this way of reporting errors to user. I would suggest to tackle converting all block upload errors to the new system in a separate PR, and use "simple" way of error reporting in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to the simple way.

pkg/util/globalerror/errors.go Outdated Show resolved Hide resolved
@lamida
Copy link
Contributor

lamida commented Apr 17, 2023

The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase main and eventually move the CHANGELOG entry added / updated in this PR to the top of the CHANGELOG document. Thanks!

@andyasp andyasp force-pushed the aasp/block-upload-max-block-size branch from 42816b5 to 45f5d58 Compare April 20, 2023 20:33
@andyasp andyasp marked this pull request as ready for review April 20, 2023 20:34
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@pstibrany pstibrany enabled auto-merge (squash) April 28, 2023 14:28
@pstibrany pstibrany merged commit 84fa681 into main Apr 28, 2023
@pstibrany pstibrany deleted the aasp/block-upload-max-block-size branch April 28, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants