Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Adaptive Sampling Aggregator & Strategy Store #5441

Merged
merged 19 commits into from
May 21, 2024

Conversation

Pushkarm029
Copy link
Member

Which problem is this PR solving?

Description of the changes

  • processor is co-located in strategy_store and aggregator.
    • In aggregator to run generateStrategyResponses, runCalculationLoop.
    • In strategy_store to run loadProbabilities, runUpdateProbabilitiesLoop

How was this change tested?

  • make test

Checklist

@Pushkarm029 Pushkarm029 requested a review from a team as a code owner May 11, 2024 20:32
@Pushkarm029 Pushkarm029 requested a review from albertteoh May 11, 2024 20:32
Copy link

codecov bot commented May 11, 2024

Codecov Report

Attention: Patch coverage is 96.93878% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 95.53%. Comparing base (486f74f) to head (9c412a5).

Files Patch % Lines
...ugin/sampling/strategystore/adaptive/aggregator.go 90.00% 1 Missing and 1 partial ⚠️
...lugin/sampling/strategystore/adaptive/processor.go 96.96% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5441      +/-   ##
==========================================
+ Coverage   95.50%   95.53%   +0.02%     
==========================================
  Files         331      331              
  Lines       16139    16155      +16     
==========================================
+ Hits        15414    15434      +20     
+ Misses        552      547       -5     
- Partials      173      174       +1     
Flag Coverage Δ
badger_v1 8.05% <ø> (ø)
badger_v2 1.93% <ø> (ø)
cassandra-3.x-v1 16.43% <ø> (ø)
cassandra-3.x-v2 1.85% <ø> (ø)
cassandra-4.x-v1 16.43% <ø> (ø)
cassandra-4.x-v2 1.85% <ø> (ø)
elasticsearch-7.x 1.78% <ø> (ø)
elasticsearch-8.x 1.76% <ø> (-0.02%) ⬇️
grpc_v1 9.53% <ø> (ø)
grpc_v2 7.58% <ø> (+0.01%) ⬆️
kafka 9.78% <ø> (ø)
opensearch-1.x 1.78% <ø> (ø)
opensearch-2.x 1.78% <ø> (+0.01%) ⬆️
unittests 93.93% <96.93%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it difficult to reason about this change. I am on board with not moving large portions of the code across files, but I would prefer to split the functionality of the Processor more cleanly into two separate structs. We can just change the receivers in the methods to different structs to avoid large code changes:

  • the parts that will be run in OTEL Processor (along with aggregator) can be isolated to a type called PostAggregator (for now).
  • the parts that will be run in the OTEL Extension can be in a struct called StrategyStore

@Pushkarm029 Pushkarm029 force-pushed the sampling_refactoring branch from 4a4039d to 5ddcaa9 Compare May 12, 2024 08:43
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
@Pushkarm029 Pushkarm029 reopened this May 12, 2024
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
@Pushkarm029
Copy link
Member Author

Not able to fix these goroutine leaks: close of closed channel

Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
@Pushkarm029 Pushkarm029 marked this pull request as draft May 15, 2024 18:19
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
@Pushkarm029 Pushkarm029 marked this pull request as ready for review May 15, 2024 21:33
plugin/sampling/strategystore/adaptive/strategy_store.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/processor.go Outdated Show resolved Hide resolved
// Start initializes and starts the sampling processor which regularly calculates sampling probabilities.
func (p *Processor) Start() error {
// Start initializes and starts the PostAggregator which regularly calculates sampling probabilities.
func (p *PostAggregator) Start() error {
p.logger.Info("starting adaptive sampling processor")
if err := p.electionParticipant.Start(); err != nil {
return err
}
p.shutdown = make(chan struct{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there's nothing to start here, which makes sense since we just want the aggregator to call post-aggregator on every tick

return nil
}

func (p *Processor) runBackground(f func()) {
func (p *PostAggregator) runBackground(f func()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it used at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but only in one function: runCalculation. I removed it and added it as inline.

plugin/sampling/strategystore/adaptive/processor.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/aggregator.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/strategy_store.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/strategy_store.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/processor.go Outdated Show resolved Hide resolved
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't had a chance to do a full review of this version

plugin/sampling/strategystore/adaptive/aggregator.go Outdated Show resolved Hide resolved
plugin/sampling/strategystore/adaptive/aggregator.go Outdated Show resolved Hide resolved
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
@Pushkarm029
Copy link
Member Author

Pushkarm029 commented May 19, 2024

🎉Exams Over, let's finish this.

Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to regenerate the mocks since you added new Close method to the interface

otherwise looks good to be overall, let's make the build greed.

plugin/sampling/strategystore/adaptive/factory.go Outdated Show resolved Hide resolved
go func() {
p.saveProbabilitiesAndQPS()
p.bgFinished.Done()
}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to do this in the background when the parent function is already invoked in a loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running in the background allows the loop in runAggregationLoop to continue iterating without being blocked by the saving operation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it is one logical operation, if you break them apart you introduce race conditions

Copy link
Member Author

@Pushkarm029 Pushkarm029 May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, removed it now.

Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
@yurishkuro yurishkuro merged commit c9fe891 into jaegertracing:main May 21, 2024
38 checks passed
@Pushkarm029 Pushkarm029 deleted the sampling_refactoring branch May 22, 2024 05:16
yurishkuro pushed a commit that referenced this pull request May 24, 2024
## Which problem is this PR solving?
-
#5441 (comment)

## Description of the changes
- Rename `Processor` to `PostAggregator`

## How was this change tested?
- `make test`

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [ ] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
varshith257 pushed a commit to varshith257/jaeger that referenced this pull request Jun 2, 2024
## Which problem is this PR solving?
-
jaegertracing#5441 (comment)

## Description of the changes
- Rename `Processor` to `PostAggregator`

## How was this change tested?
- `make test`

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [ ] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `yarn lint` and `yarn test`

Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: Vamshi Maskuri <gwcchintu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants