Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculate synthetic size worker #3123

Merged
merged 6 commits into from
Jan 13, 2023
Merged

Conversation

lubennikovaav
Copy link
Contributor

@lubennikovaav lubennikovaav commented Dec 15, 2022

New version of the PR contains following changes:

TODO:

  • remove test test_metric_collection_multiple_branches . It is there only to illustrate the bug.
  • clean the code (remove debug info! messages)

Copy link
Member

@koivunej koivunej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, had missed the github notification. I think this could just as well live outside the tenant, in metric collection.

pageserver/src/tenant.rs Outdated Show resolved Hide resolved
pageserver/src/tenant.rs Show resolved Hide resolved
@lubennikovaav lubennikovaav force-pushed the collect_billing_metrics branch 2 times, most recently from 3e0e4a3 to dbb6eb4 Compare December 20, 2022 19:12
Base automatically changed from collect_billing_metrics to main December 20, 2022 20:59
@lubennikovaav lubennikovaav marked this pull request as ready for review January 11, 2023 19:57
@lubennikovaav lubennikovaav requested review from a team as code owners January 11, 2023 19:57
@lubennikovaav lubennikovaav requested review from knizhnik and removed request for a team January 11, 2023 19:57
Copy link
Contributor

@knizhnik knizhnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not explained why it was decided to calculate logical size in background.
Before it is calculated incrementally.
May be I missed something but logical=synthetic size is one we are limiting for free tier and what customers are paying for. If so, is it acceptable to calculate it once per 10 minutes?
So I do some massive insert, what to check used size but see... old value.
Once again sorry if it is explained somewhere and I just missed it.

pageserver/src/consumption_metrics.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/size.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/size.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/size.rs Show resolved Hide resolved
pageserver/src/tenant.rs Outdated Show resolved Hide resolved
pageserver/src/tenant.rs Outdated Show resolved Hide resolved
synthetic size calculation.
Add new pageserver config param calculate_synthetic_size_interval
    sort updates in topological order so that the parent timeline always preceeds its children.
    fixes #3179
- handle errors in calculate_synthetic_size_worker. Don't exit the bgworker if one tenant failed.

- add cached_synthetic_tenant_size to cache values calculated by the bgworker

- code cleanup: remove unneeded info! messages, clean comments

- handle collect_metrics_task() error. Don't exit collect_metrics worker if one task failed.

 - add unit test to cover case when we have multiple branches at the same lsn
@lubennikovaav lubennikovaav merged commit c6d383e into main Jan 13, 2023
@lubennikovaav lubennikovaav deleted the calculate_synthetic_size_worker branch January 13, 2023 09:51
@hlinnaka
Copy link
Contributor

It is not explained why it was decided to calculate logical size in background.
Before it is calculated incrementally.

We still track logical size at the end of the branch incrementally. However, the synthetic size calculation needs the logical size at an older point in time.

May be I missed something but logical=synthetic size is one we are limiting for free tier and what customers are paying for. If so, is it acceptable to calculate it once per 10 minutes?

Logical size is different from the synthetic size.

So I do some massive insert, what to check used size but see... old value.
Once again sorry if it is explained somewhere and I just missed it.

We don't have a good explanation of synthetic size anywhere, so the confusion is understandable.

I wrote a small doc on what the synthetic size is: #3328. It needs more work, but I hope it helps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants