Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CustomProperty improvements #1261

Merged
merged 10 commits into from
Oct 6, 2020
Merged

CustomProperty improvements #1261

merged 10 commits into from
Oct 6, 2020

Conversation

eddynaka
Copy link
Contributor

@eddynaka eddynaka commented Sep 11, 2020

Refs #1206

Changes

  • Instrumntation no longer store raw objects like HttpRequest in
    Activity.CustomProperty. To enrich activity, use the Enrich action on the
    instrumentation.

Please provide a brief description of the changes here.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Design discussion issue #
  • Changes in public API reviewed

@eddynaka eddynaka requested a review from a team September 11, 2020 22:04
@codecov
Copy link

codecov bot commented Sep 11, 2020

Codecov Report

Merging #1261 into master will decrease coverage by 0.39%.
The diff coverage is 44.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1261      +/-   ##
==========================================
- Coverage   77.44%   77.05%   -0.40%     
==========================================
  Files         223      223              
  Lines        6362     6428      +66     
==========================================
+ Hits         4927     4953      +26     
- Misses       1435     1475      +40     
Impacted Files Coverage Δ
...s/StackExchangeRedisCallsInstrumentationOptions.cs 100.00% <ø> (ø)
...Implementation/AspNetInstrumentationEventSource.cs 80.00% <40.00%> (-20.00%) ⬇️
...umentation.AspNet/Implementation/HttpInListener.cs 86.31% <40.00%> (-7.94%) ⬇️
...ementation/AspNetCoreInstrumentationEventSource.cs 80.00% <40.00%> (-20.00%) ⬇️
...tation.AspNetCore/Implementation/HttpInListener.cs 87.61% <40.00%> (-4.78%) ⬇️
...ent/Implementation/GrpcClientDiagnosticListener.cs 83.78% <40.00%> (-7.13%) ⬇️
...t/Implementation/GrpcInstrumentationEventSource.cs 62.50% <40.00%> (-37.50%) ⬇️
...tp/Implementation/HttpHandlerDiagnosticListener.cs 78.82% <40.00%> (-8.85%) ⬇️
...p/Implementation/HttpInstrumentationEventSource.cs 60.86% <40.00%> (-5.80%) ⬇️
...ient/Implementation/SqlClientDiagnosticListener.cs 75.36% <40.00%> (-3.10%) ⬇️
... and 9 more

@eddynaka eddynaka changed the title CustomProperty improvements [WIP] CustomProperty improvements Sep 11, 2020
@eddynaka eddynaka marked this pull request as draft September 11, 2020 23:32
@CodeBlanch
Copy link
Member

Just thinking out loud here. How would this work if you were shipping something that wanted to access the raw objects downstream? For example, let's say we are building a custom ActivityProcessor that adds extra data for http spans. The hosting application owns the options, so our custom ActivityProcessor would have to ask the user through documentation to turn these on in the host before it would work? Ideally we would have a way for it to request that extra data automatically. Could we use the runtime context somehow?

@reyang
Copy link
Member

reyang commented Sep 12, 2020

Just thinking out loud here. How would this work if you were shipping something that wanted to access the raw objects downstream? For example, let's say we are building a custom ActivityProcessor that adds extra data for http spans. The hosting application owns the options, so our custom ActivityProcessor would have to ask the user through documentation to turn these on in the host before it would work? Ideally we would have a way for it to request that extra data automatically. Could we use the runtime context somehow?

I guess the goal is to reduce the expensive dictionary operation AND the potential pressure from GC?
Consider the following approach:

  1. Expose these instrumentation raw object via runtime context, so processors (the code running in the context) could access these raw objects and cherry-pick stuff (e.g. take something from the raw object and call SetTag).
  2. Once we reached the end of the processor chain, remove all the raw objects so we don't stress GC too much.

@eddynaka
Copy link
Contributor Author

Just thinking out loud here. How would this work if you were shipping something that wanted to access the raw objects downstream? For example, let's say we are building a custom ActivityProcessor that adds extra data for http spans. The hosting application owns the options, so our custom ActivityProcessor would have to ask the user through documentation to turn these on in the host before it would work? Ideally we would have a way for it to request that extra data automatically. Could we use the runtime context somehow?

I guess the goal is to reduce the expensive dictionary operation AND the potential pressure from GC?
Consider the following approach:

  1. Expose these instrumentation raw object via runtime context, so processors (the code running in the context) could access these raw objects and cherry-pick stuff (e.g. take something from the raw object and call SetTag).
  2. Once we reached the end of the processor chain, remove all the raw objects so we don't stress GC too much.
  1. got it, will change to runtimecontext and see how that works. What would work best for the key? type+traceid? Something like that? What do you think?
  2. when you say end of the processor chain, would we do that on flush?

@reyang
Copy link
Member

reyang commented Sep 12, 2020

  1. got it, will change to runtimecontext and see how that works. What would work best for the key? type+traceid? Something like that? What do you think?

Haven't spent much time thinking about this, I guess we probably would end up with 1 key, which contains a linked list?

  1. when you say end of the processor chain, would we do that on flush?

It is when the last processor.OnEnd returned.

@eddynaka
Copy link
Contributor Author

Just thinking out loud here. How would this work if you were shipping something that wanted to access the raw objects downstream? For example, let's say we are building a custom ActivityProcessor that adds extra data for http spans. The hosting application owns the options, so our custom ActivityProcessor would have to ask the user through documentation to turn these on in the host before it would work? Ideally we would have a way for it to request that extra data automatically. Could we use the runtime context somehow?

If we use runtimecontext, so we wouldn't need the flag at all, right?

@CodeBlanch
Copy link
Member

@eddynaka I was thinking we would store the flag on runtime context to tell the instrumentation to use the custom property to store them, but open to @reyang's version where we store the actual objects on the runtime context and don't need a flag at all.

@eddynaka
Copy link
Contributor Author

  1. got it, will change to runtimecontext and see how that works. What would work best for the key? type+traceid? Something like that? What do you think?

Haven't spent much time thinking about this, I guess we probably would end up with 1 key, which contains a linked list?

  1. when you say end of the processor chain, would we do that on flush?

It is when the last processor.OnEnd returned.

  1. What if we change from LinkedList to Dictionary? At least this way we would have O(1) based on some key to remove from the RuntimeContext.

@eddynaka
Copy link
Contributor Author

@reyang @CodeBlanch , pushed one change moving to RuntimeContextSlot. Still a working in progress. But, the general idea:

  • so, i would save as object, so the activityprocessor wouldn't need to import system.web for example.
  • added as a list, so for each trace we could have multiple raw objects
  • added as a dictionary, so we would have O(1) to fetch and discard in onend

My thoughts:

  • thinking about concurrency. I will probably change to ConcurrentDictionary
  • thinking about ordering. If ActivyProcessor occurs before the instrumentation, then we would have an exception.

@CodeBlanch
Copy link
Member

@eddynaka RuntimeContext is essentially bound to a single trace. For example, check out how Baggage is implemented. It doesn't worry about other traces or clearing at the end of the trace. The not clearing at the end is trippy. I wrote a bunch of tests when I was doing Baggage to see if it ever leaked to other threads, couldn't ever get it to leak.

Esstentially what I'm saying is you can remove the cleanup part and switch to:

private static readonly RuntimeContextSlot<List<object>> RuntimeContextSlot = RuntimeContext.RegisterSlot<List<object>>("OTel.Runtime.Context");

(@reyang Can you confirm the cleanup part?)

BUT, I don't think we have really accomplished anything. We just switched allocation from Activity to RuntimeContext. What we really need is a way for some library component (ActivityProcessor, Exporter, Instrumentation, etc) or user to opt into the behavior globally and then the instrumentation adds it based on the behavior being triggered?

@CodeBlanch
Copy link
Member

One more thing I just realized...

activity.SetResource(this.resource);

activity.SetResource(this.resource);

We set the Resource on every sampled Activity we see, which is actually calling SetCustomProperty. So if the goal of this work is to avoid the automatic allocation of the internal CustomProperty Dictionary on every Activity, we also need to solve that 😄

@eddynaka
Copy link
Contributor Author

Applying the change to RuntimeContext and doing a benchmark, I could see that the Allocated memory increased:

Before (master branch):

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
InstrumentedHttpClient 186.1 us 3.72 us 5.79 us 2.1973 - - 4.68 KB
InstrumentedHttpClientWithParentActivity 179.6 us 3.00 us 2.34 us 2.4414 - - 5.99 KB

Actual (Runtime Context):

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
InstrumentedHttpClient 179.0 us 3.51 us 2.93 us 2.1973 - - 4.81 KB
InstrumentedHttpClientWithParentActivity 177.5 us 2.70 us 2.52 us 2.4414 - - 6.13 KB

@eddynaka
Copy link
Contributor Author

I think, in general, the idea of the flag was to prevent to write somewhere (activity, rutimecontext, etc). With that, if we had that flag as false, we would have less allocations (theoretically).

If condition:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
InstrumentedHttpClient 178.6 us 2.17 us 1.93 us 2.1973 - - 4.63 KB
InstrumentedHttpClientWithParentActivity 180.8 us 3.57 us 6.80 us 2.4414 - - 5.95 KB

so, we would have a little less allocation comparing to master branch.

@cijothomas
Copy link
Member

cijothomas commented Sep 14, 2020

Two primary goals:

  1. Check if storing the raw object (in customprop or context or anywhere else) - is an expensive thing.
    If yes, then make it opt-in. (this also means to benefit from this feature, its not enough to write a processor, but also change the instrumentation options.)
  2. Cleanup part - we certainly don't want to store HttpWebRequest inside activity forever. If exporters are slow, then this HttpWebRequest won't be claimed by GC for a potentially long time. If runtimecontext is naturally solving this problem, without any explicit cleanup, then lets go for it.

I also think of another alternative:
Similar to Instrumentation's Filter API, we could introduce a Enrich api on instrumentation, which is a callback which gets passed the activity and the corresponding raw object. Users can register Enrich callback, if they want to enrich activity with anything from the raw objects. This also avoids storing any raw object anywhere, so no question of cleaning up part.

@eddynaka
Copy link
Contributor Author

eddynaka commented Sep 14, 2020

Two primary goals:

  1. Check if storing the raw object (in customprop or context or anywhere else) - is an expensive thing.
    If yes, then make it opt-in. (this also means to benefit from this feature, its not enough to write a processor, but also change the instrumentation options.)
  2. Cleanup part - we certainly don't want to store HttpWebRequest inside activity forever. If exporters are slow, then this HttpWebRequest won't be claimed by GC for a potentially long time. If runtimecontext is naturally solving this problem, without any explicit cleanup, then lets go for it.

I also think of another alternative:
Similar to Instrumentation's Filter API, we could introduce a Enrich api on instrumentation, which is a callback which gets passed the activity and the corresponding raw object. Users can register Enrich callback, if they want to enrich activity with anything from the raw objects. This also avoids storing any raw object anywhere, so no question of cleaning up part.

If we go to this direction, we can do in Options file:

/// <summary>
/// Gets or sets a Subscribe function.
/// </summary>
public Action<object> Subscribe { get; set; }

With that, we can simply subscribe for events like this:

using (Sdk.CreateTracerProviderBuilder()
    .AddHttpClientInstrumentation(options =>
    {
        options.Subscribe = (req) => { Console.WriteLine(req); };
    })
    .AddProcessor(activityProcessor.Object)
    .Build())
{
    using var c = new HttpClient();
    using var r = await c.GetAsync("https://opentelemetry.io/").ConfigureAwait(false);
}

For the instrumentation itself:

if (activity.IsAllDataRequested)
{
    this.options?.Subscribe(request);

    activity.SetTag(SemanticConventions.AttributeHttpMethod, HttpTagHelper.GetNameForHttpMethod(request.Method));
    activity.SetTag(SemanticConventions.AttributeHttpHost, HttpTagHelper.GetHostTagValueFromRequestUri(request.RequestUri));
    activity.SetTag(SemanticConventions.AttributeHttpUrl, request.RequestUri.OriginalString);

    if (this.options.SetHttpFlavor)
    {
        activity.SetTag(SemanticConventions.AttributeHttpFlavor, HttpTagHelper.GetFlavorTagValueFromProtocolVersion(request.Version));
    }
}

With that, the customer would have access to those properties, and we wouldn't have any issue with performance / allocation.

Below, the benchmark:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
InstrumentedHttpClient 230.4 us 4.47 us 6.55 us 2.1973 - - 4.48 KB
InstrumentedHttpClientWithParentActivity 230.8 us 4.55 us 5.42 us 2.4414 - - 5.79 KB

I can see that Mean/Error/StdDev increased. But, allocated decreased.

What do you think? @cijothomas @reyang @CodeBlanch

@CodeBlanch
Copy link
Member

@eddynaka Regarding this API:

public Action<object> Subscribe { get; set; }

I've started to build this API before and ended up backing off because there's a surprising amount of things to deal with. You have 3 things you might want to pull data from: request, response, & (potentially) an exception. Should we have an enrichment function for all three things? That seems like a lot. But if you combine it all, it starts to get a little messy:

public Action<Activity, HttpContext, Exception> ActivityEnrichmentAction { get; set; }

(That is passing HttpContext to combine at least request & response.)

Just some things to consider 😄

@eddynaka
Copy link
Contributor Author

@eddynaka Regarding this API:

public Action<object> Subscribe { get; set; }

I've started to build this API before and ended up backing off because there's a surprising amount of things to deal with. You have 3 things you might want to pull data from: request, response, & (potentially) an exception. Should we have an enrichment function for all three things? That seems like a lot. But if you combine it all, it starts to get a little messy:

public Action<Activity, HttpContext, Exception> ActivityEnrichmentAction { get; set; }

(That is passing HttpContext to combine at least request & response.)

Just some things to consider 😄

Hi @CodeBlanch , that's why I added as object. And I would pass the request/response/exception/other types if needed.

And the user would have to typeof/gettype to see what is it to do something.

What is a surprise for me is the increase of time.

@CodeBlanch
Copy link
Member

I'll look at it a bit more closely later to try and understand the performance. Giving user an object works, it just isn't that nice of an API to code against. Forces them to do casting, type checking, maybe even reflection. Things that cost performance. Preferable to have something that exposes a strongly-typed API.

@eddynaka
Copy link
Contributor Author

I'll look at it a bit more closely later to try and understand the performance. Giving user an object works, it just isn't that nice of an API to code against. Forces them to do casting, type checking, maybe even reflection. Things that cost performance. Preferable to have something that exposes a strongly-typed API.

Hi @CodeBlanch , did you have time to take a look? Should I go and maintain the if / Action / other? Let me know! :)

@CodeBlanch
Copy link
Member

@eddynaka Hey I'm kind of confused where are on this. I don't see the Subscribe callback on the branch so I can't perf test it. Can you throw that up somewhere so I can take it for a spin?

Right now I see:

  • Options flag StoreRawObjectInActivity.
  • We're storing on RuntimeContext.

That doesn't make a whole lot of sense to me. I think if we have the flag, we should store using activity.SetCustomProperty. My idea to use RuntimeContext was basically a way to get around needing the flag. It was originally to move the flag to RuntimeContext and use the activity for the storage.

Maybe we should do a call on this to get everyone on the same page with the design?

@eddynaka
Copy link
Contributor Author

@eddynaka Hey I'm kind of confused where are on this. I don't see the Subscribe callback on the branch so I can't perf test it. Can you throw that up somewhere so I can take it for a spin?

Right now I see:

  • Options flag StoreRawObjectInActivity.
  • We're storing on RuntimeContext.

That doesn't make a whole lot of sense to me. I think if we have the flag, we should store using activity.SetCustomProperty. My idea to use RuntimeContext was basically a way to get around needing the flag. It was originally to move the flag to RuntimeContext and use the activity for the storage.

Maybe we should do a call on this to get everyone on the same page with the design?

Hum.. i think I didn't push it.

What I will do in this branch:

  1. nplicate httpclient instrumentation
  2. add to benchmark and we can choose

@eddynaka
Copy link
Contributor Author

Below a simple benchmark:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
SimpleHttpClient 169.5 us 3.31 us 5.88 us 1.4648 - - 1.84 KB
InstrumentedHttpClientFlag 193.3 us 3.74 us 4.86 us 1.9531 - - 4.13 KB
InstrumentedHttpClientWithParentActivityFlag 192.9 us 3.76 us 4.76 us 2.1973 - - 4.93 KB
InstrumentedHttpClientRuntime 189.4 us 3.78 us 4.36 us 2.1973 - - 4.3 KB
InstrumentedHttpClientWithParentActivityRuntime 194.9 us 3.82 us 5.22 us 2.1973 - - 5.11 KB
InstrumentedHttpClientSubscribe 189.1 us 3.67 us 3.92 us 1.9531 - - 4.13 KB
InstrumentedHttpClientWithParentActivitySubscribe 191.8 us 3.48 us 3.87 us 2.1973 - - 4.93 KB

If we go with flag: the activity gets big
If we go with subscribe: this can get slow, since it depends in what logic you add in the Action
If we go with Runtime: it's restricted to OTel, right?

@CodeBlanch
Copy link
Member

@eddynaka Apologies for not looking at this sooner!

@cijothomas To answer the fundamental question, how expensive is SetCustomProperty? I made a few quick low-level benchmarks:

        [Benchmark]
        public void BasicActivity()
        {
            using Activity activity = new Activity("Activity");
        }

        [Benchmark]
        public void ActivityWithSingleCustomProperty()
        {
            using Activity activity = new Activity("Activity");
            activity.SetCustomProperty("activity", activity);
        }

        [Benchmark]
        public void ActivityWithManyCustomProperties()
        {
            using Activity activity = new Activity("Activity");
            activity.SetCustomProperty("activity1", activity);
            activity.SetCustomProperty("activity2", activity);
            activity.SetCustomProperty("activity3", activity);
        }

Results:

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
BasicActivity 5.156 us 0.0966 us 0.0904 us 0.0381 - - 376 B
ActivityWithSingleCustomProperty 5.246 us 0.0530 us 0.0495 us 0.0687 - - 584 B
ActivityWithManyCustomProperties 5.461 us 0.0544 us 0.0482 us 0.0687 - - 584 B

The results make sense to me. The first one you add, big hit. It has to create the storage. Anything added after that, negligible.

Right now we add the Resource on all sampled non-suppressed Activity objects:
https://github.com/open-telemetry/opentelemetry-dotnet/blob/master/src/OpenTelemetry/Trace/TracerProviderSdk.cs#L78

That means we are already paying the big price, regardless of what the instrumentation does. Unless we change that, all of this work is moot. But I was chatting with @eddynaka on Gitter earlier and his goal is to fix that 😄

Right now we have 3 options on this PR:

  1. Add a flag to control setting the raw objects as custom properties.
  2. Add custom properties to RuntimeContext.
  3. Add a callback user can use via options to enrich Activity using the raw objects.

None of them are absolutely perfect. 1 & 2 allow you to plug in an ActivityProcessor to enrich based on the raw sources. Given the spec says ActivityProcessor is our extension point for enrichment, it feels to me like we should support that. But Option 3 is probably more useful for most users to quickly enrich without having to build an ActivityProcessor.

I'm thinking we scope this PR to Option 3 and then maybe we can do something to enable the other case?

That being said, I really don't like the API...

Action<object> Enrich { get; set; }

User will have to do expensive checks to figure out if object is request, response, or exception. They also need to access Activity.Current.

I think at the minimum it should be...

Action<Activity, string, object> Enrich { get; set; }

Where the string is the event name. That would help with knowing what object is going to be.

I would prefer something strongly typed but that also comes with some challenges.

updating tests

updating to RuntimeContextSlot

updating to runtimeContext

adding runtime/subscibe/flag benchmark

updating aspnet and http

updating ramaining instrumentation
@eddynaka eddynaka self-assigned this Oct 4, 2020
@eddynaka eddynaka marked this pull request as ready for review October 4, 2020 11:18
@eddynaka eddynaka changed the title [WIP] CustomProperty improvements CustomProperty improvements Oct 4, 2020
@cijothomas
Copy link
Member

This PR looks good to me, assuming its implement option 3 @CodeBlanch mentioned.
This avoids touching Activity.CustomProperty and should save perf (of course assuming we get rid of every other usage of customproperty as well)
No need of worrying about cleaning up the strong objects.

We can discuss alternate namings, comments, example usage in a subsequent PR.

@@ -112,7 +112,7 @@ public override void OnStartActivity(Activity activity, object payload)

if (activity.IsAllDataRequested)
{
activity.SetCustomProperty(RequestCustomPropertyName, request);
this.options.Enrich?.Invoke(activity, "OnStartActivity", request);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also - can you ensure that unit test validate that Enrich is called only if Activity is sampled-in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added a test. Let me know what do you think before I update everything

Copy link
Member

@CodeBlanch CodeBlanch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cijothomas cijothomas merged commit 874f9d7 into open-telemetry:master Oct 6, 2020
@eddynaka eddynaka deleted the feature/custom-property-improvements branch October 17, 2020 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants