Empty queued spans when ForceFlush called #2335

Merged · 9 commits · Nov 5, 2021
Conversation

pkwarren (Contributor):

Update the implementation of ForceFlush() to first ensure that all spans
which are queued are added to the batch before calling export spans.
Create a small ReadOnlySpan implementation which can be used as a marker
that ForceFlush has been invoked and used to notify when all spans are
ready to be exported.

Fixes #2080.

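For reference, the marker-span idea can be sketched as follows. This is a minimal sketch based on the names that appear in this PR (forceFlushSpan, flushed), not necessarily the exact merged code; the SpanContext override is quoted verbatim further down in this thread.

```go
// forceFlushSpan is a marker ReadOnlySpan enqueued by ForceFlush. By the time
// the batching goroutine dequeues it, every span enqueued before it has been
// moved into the current batch, so dequeuing the marker signals that the
// previously queued spans are ready to be exported.
type forceFlushSpan struct {
	ReadOnlySpan // embedded so the marker satisfies the interface
	flushed      chan struct{} // closed once the marker is dequeued
}

func (f forceFlushSpan) SpanContext() trace.SpanContext {
	// Report the marker as sampled so the enqueue path does not drop it.
	return trace.NewSpanContext(trace.SpanContextConfig{TraceFlags: trace.FlagsSampled})
}

// In the batching goroutine (sketch): when the marker is dequeued, signal the
// waiting ForceFlush call instead of adding the marker to the batch.
//
//	if ffs, ok := sd.(forceFlushSpan); ok {
//		close(ffs.flushed)
//		continue
//	}
```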
linux-foundation-easycla bot commented Oct 28, 2021

CLA Signed

The committers are authorized under a signed CLA.

seanschade (Contributor) left a comment:

I like the simplicity of this solution.

Aneurysm9 added the blocked:CLA label (Waiting on CLA to be signed before progress can be made) Oct 29, 2021
Aneurysm9 (Member) left a comment:

Please add a CHANGELOG entry describing this change and sign the CLA.

MrAlias (Contributor) left a comment:

The ForceFlush method should not shut down the processor.

CHANGELOG.md (review thread outdated, resolved)
sdk/trace/batch_span_processor.go (review thread outdated, resolved)
pkwarren and others added 3 commits October 29, 2021 11:59
Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
MrAlias previously approved these changes Oct 29, 2021
MrAlias (Contributor) left a comment:

Please sign the CLA so we can accept these changes.

```go
// ForceFlush exports all ended spans that have not yet been exported.
func (bsp *batchSpanProcessor) ForceFlush(ctx context.Context) error {
	var err error
	if bsp.e != nil {
		flushCh := make(chan struct{})
		select {
		case bsp.queue <- forceFlushSpan{flushed: flushCh}:
```
Contributor:
Maybe use the enqueue method here instead to handle the shutdown case.

Contributor:
Though that blocks on a full queue ... It might make sense to copy some of the logic from that method here to handle the case where shutdown has already occurred or is happening concurrently.
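For illustration, shutdown-aware enqueue logic along those lines might look like the sketch below; the stopCh and dropped fields (and their types) are assumptions about the processor's internals, not necessarily the merged code.

```go
// Sketch (assumed fields): refuse new spans once shutdown has begun, and
// count drops when the queue is full instead of blocking.
select {
case <-bsp.stopCh:
	// Shutdown already started; do not race a send against a closing queue.
	return false
default:
}

select {
case bsp.queue <- sd:
	return true
default:
	// Queue full in the non-blocking case: record the dropped span.
	atomic.AddUint32(&bsp.dropped, 1) // import "sync/atomic"
	return false
}
```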

MrAlias dismissed their stale review October 29, 2021 17:22:

There is still more that needs to be changed here.

pkwarren (Contributor, Author):

> Please sign the CLA so we can accept these changes.

Will do this ASAP - waiting on legal sign-off.

```go
}

func (f forceFlushSpan) SpanContext() trace.SpanContext {
	return trace.NewSpanContext(trace.SpanContextConfig{TraceFlags: trace.FlagsSampled})
```
pkwarren (Contributor, Author):
Using enqueue requires that the span being queued is marked as sampled.
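For context, the guard being referred to is roughly the following (a sketch of the top of the enqueue path, not the exact merged code):

```go
// Unsampled spans are discarded before they ever reach the queue, which is
// why the marker span must report trace.FlagsSampled in SpanContext above.
if !sd.SpanContext().IsSampled() {
	return false
}
```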

```go
// ForceFlush exports all ended spans that have not yet been exported.
func (bsp *batchSpanProcessor) ForceFlush(ctx context.Context) error {
	var err error
	if bsp.e != nil {
		flushCh := make(chan struct{})
		bsp.enqueueBlockOnQueueFull(ctx, forceFlushSpan{flushed: flushCh}, true)
```
pkwarren (Contributor, Author):
Pass true as the last argument here since we want to block until this span is queued.
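The remainder of the method then waits for the marker to be processed before exporting. A rough sketch of that shape (illustrative; a later commit in this PR refines it to only wait on the channel when the enqueue actually succeeded):

```go
		if bsp.enqueueBlockOnQueueFull(ctx, forceFlushSpan{flushed: flushCh}, true) {
			select {
			case <-flushCh:
				// Every span queued before the marker is now in the batch.
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		// The queue has been drained up to the marker; export the batch.
		err = bsp.exportSpans(ctx)
	}
	return err
}
```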

```diff
@@ -296,6 +320,10 @@ func (bsp *batchSpanProcessor) drainQueue() {
}

func (bsp *batchSpanProcessor) enqueue(sd ReadOnlySpan) {
	bsp.enqueueBlockOnQueueFull(context.TODO(), sd, bsp.o.BlockOnQueueFull)
```
pkwarren (Contributor, Author):
To make this work from the ForceFlush method, enqueue needed to pass a context (to allow cancellation).
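A sketch of the cancellable blocking send this enables (illustrative; assumes the block parameter visible in the snippet above):

```go
// With a context available, a blocking enqueue can be abandoned when the
// caller's context is cancelled, instead of blocking forever on a full queue.
if block {
	select {
	case bsp.queue <- sd:
		return true
	case <-ctx.Done():
		return false
	}
}
```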

Update the handling of the force flush span so we only wait on the
channel if we were able to enqueue the span to the queue.
pkwarren (Contributor, Author) commented Nov 3, 2021:

Sorry for the delay on the CLA - that is now approved. Does this need the CI build to be approved now?

codecov bot commented Nov 3, 2021:

Codecov Report

Merging #2335 (486120a) into main (7ce58f3) will increase coverage by 0.1%.
The diff coverage is 82.6%.


```
@@           Coverage Diff           @@
##            main   #2335     +/-   ##
=======================================
+ Coverage   73.6%   73.8%   +0.1%
=======================================
  Files        175     175
  Lines      12419   12436     +17
=======================================
+ Hits        9144    9179     +35
+ Misses      3041    3020     -21
- Partials     234     237      +3
```

| Impacted Files | Coverage Δ |
| --- | --- |
| sdk/trace/batch_span_processor.go | 83.7% <82.6%> (-0.7%) ⬇️ |
| sdk/trace/provider.go | 88.4% <0.0%> (+4.8%) ⬆️ |
| sdk/trace/snapshot.go | 96.9% <0.0%> (+42.4%) ⬆️ |

pkwarren (Contributor, Author) commented Nov 3, 2021:

There is a test, TestNew_collectorConnectionDiesThenReconnects, failing with this PR when running the tests with the race detector in exporters/otlp/otlptrace/otlptracegrpc. I'm able to reproduce the same failure locally on the main branch with this command:

```sh
$ cd exporters/otlp/otlptrace/otlptracegrpc
$ for i in `seq 1 100`; do go test ./... -run '^\QTestNew_collectorConnectionDiesThenReconnects\E$' -count 1 -race -v; done
```

Is this a known flaky test? I also saw it referenced in #2285.

Aneurysm9 removed the blocked:CLA (Waiting on CLA to be signed before progress can be made) label Nov 4, 2021

Aneurysm9 (Member):
I think there have been a couple of attempts to address that flaky test, but it has thus far eluded correction.

Successfully merging this pull request may close these issues.

Guarantee ForceFlush of BatchSpanProcessor will export all ended spans