ETW exporter how-to - usage instructions document #628

maxgolov · 2021-03-25T01:15:16Z

Related to #326

Documentation

This is a documentation that explains how to use OpenTelemetry C++ SDK with ETW exporter on Windows. The main idea is that application is instrumented using OpenTelemetry in a platform-agnostic way . And may subsequently utilize ETW exporter, if that application is running on Windows OS. I included a few examples that allow to utilize vendor-agnostic flow that is not tied to any Microsoft offering or service. This could be achieved in many various ways, but I am referring to Microsoft.Azure.EventFow as an example of a side-car that may listen to events emitted over ETW channel. The other alternate project that may be utilized as a side-car for this purpose is SilkETW.

There are two operational modes in ETW exporter:

ETW / TraceLoggingDynamic mode - so called Dynamic Manifest, already supported by any ETW Listener, best supported in C# / .NET. But it requires a header that we are working on releasing as opensource later this year. The closest analog to what it is doing is available as GoLang implementation here.
ETW / MsgPack mode - this one is entirely opensource at this point. And the usage example that I added in the document shows how it can be used today, with an out-of-proc listener that is capable of receiving ETW events, decoding their MsgPack-encoded payload to structured objects, then subsequently forwarding those objects to any destination of their choice.

I also added a mini TODO list in the Addendum section, to elaborate on further evolution plans, specifically on how OpenTelemetry C++ API calls in native code can be expressed as ETW events, that may subsequently be processed out-of-proc and forwarded to OpenTelemetry.NET, for example. That way preserving the uninterrupted flow of OpenTelemetry events further to OpenTelemetry-compatible destination, merely using ETW as a IPC channel for data passing on Windows. Because it is robust, efficient and already optimized for that purpose in the most optimal way imaginable.

There are no code changes, just the document. All code changes go in a separate PR here #519 (there are still a few moments that I'd like to finish in code before re-releasing the code for review).

Corresponding E2E example, with sample code, and ETW listener using EventFlow will be added separately to C++ contrib repo. We can discuss what prominent apps and services running on Windows (Docker, Apache, IIS) may be instrumented using that flow. Ideally I would envision the Docker replacing their ETW Logging Driver with OpenTelemetry. We can also discuss how similar approach, concepts, and structure may be applied in the implementation of a similar exporter in GoLang for Windows using OpenTelemetry Go SDK and go-winio package (in a separate PR, and in a separate repo). But many concepts shown here in C++ may be universally applicable to other languages in general.

codecov · 2021-03-25T01:27:51Z

Codecov Report

Merging #628 (16e94a1) into main (62c6b13) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #628   +/-   ##
=======================================
  Coverage   94.49%   94.49%           
=======================================
  Files         197      197           
  Lines        9130     9130           
=======================================
  Hits         8627     8627           
  Misses        503      503

pyohannes · 2021-03-25T03:13:19Z

exporters/etw/README.md

+  // Properties is a helper class in ETW namespace that is otherwise compatible
+  // with Key-Value Iterable accepted by OpenTelemetry API. Using Properties
+  // should enable more efficient data transfer without unnecessary memcpy.


That's a bit misleading in the context of this example. As far as I can see, none of the usages of Properties here is more efficient than what the API provides out of the box. There's no memcpy involved in this case:

span->AddEvent(eventName, { {"uint32Key", (uint32_t)1234}, {"uint64Key", (uint64_t)1234567890}, {"strKey", "someValue"} });

The answer - really depends. When we need to decorate the properties with additional attributes, such as - passing non-const properties, append some common properties... Maybe a set of common properties unique to custom TracerProvider . For example - some sort of custom correlation technique that needs to be appended on every record. Let's say, some custom field called cV and stamped on every Properties object passed in; OR appending a group of other device.* or app.* fields on top.. Then we could have reused the original container that was created in the customer code once, then passing it around as non-const by reference on that same thread. Thus, not requiring a transform to Recordable , and not needing the key-value iteration on it. And not needing to construct it in the SDK from initializer list as an argument...

I think I should have moved this comment from Attributes down to event Properties below, to make a stronger case 😄 Because with Attributes, it appears that what you are suggesting is indeed looking nicer. Maybe I should capture both. I can remove the comment if you don't like it, but I see scenarios where it would be more efficient.

I'll leave this code reference as an example where accepting the collection, then passing this original collection around by reference, decorating it with additional contents, stripping unneded fields, etc. would be more efficient than trying to construct a new object (or a copy of an object) from initializer list. I think I understand your concern that I should've elaborated on exactly what scenarios would (in future) may benefit from passing in the original collection. This should work great for a header-only library, as we later on may completely bypass the copy/iteration/creation of a new object from initializer list, operating on original customer-code created object.

Structure-wise, both initializations look similar: initializer list as an argument to template, or a reference to container that's been already populated in customer code.

We mostly follow this pattern in this "other" library elsewhere:
https://github.com/microsoft/cpp_client_telemetry/blob/master/lib/api/Logger.cpp
And the practice of passing around the container prepared in customer code, e.g. literally passing around a modifyable map of variants, - has been working well. Especially when the original collection passed to SDK needs additional "decoration".

Here is another scenario: let's say your collection does not change and contains something like appName=x,status=Failed,scenario=FileOpen. You'd prepare it once in a static initializer. Then you pass it as a reference to Properties to AddEvent, never destroying it, thus - never needing to construct/reconstruct it. Appears like in the other case (if you used the initializer list instead of passing collection by reference), you'd end up constructing it from Key-Value Iterable every time? It appears like construct-once-then-reuse will be more optimal then, at least for a statically linked code where there is no passing across ABI boundary needed, no key-value iterable transform. I can add some benchmarks to measure those scenarios in my other (code) PR.

Can we add the specific use-case/scenario as an example in this doc or in unit-test ( and refer it here ) to justify the performance statement. As the current api/sdk implementation, the only place where memcpy MAY happen is during Recordable::AddEvent() execution, but that would depend how exporter implements this logic.

the only place where memcpy MAY happen is during Recordable::AddEvent() execution

Well, that is the most commonly invoked function, actually. In ETW use case - it is entirely header-only, and it entirely avoids transform to Recordable upfront. The idea is that Properties may not even need to be transformed to Recordable. Because it stays valid as a container - for the duration of invocation of synchronous AddEvent call, that immediately performs serialization of event from original container + decoration into either ETW/TraceLogging (binary packed in standard TraceLogging format) or ETW/MessagePack. That (transform + IPC) is done right away on user calling thread with no background batching.

In case of other Key-Value iterable container, e.g. in case of a map -- there is a need to iterate for each key-value (as you mentioned, in case of batching exporter and `Recordable). I'll illustrate the difference between when the memcpy transform is unavoidable, and where the original container may be reused.

I'm pushing an update to my code PR to explain this further ( +1 benchmark, yay! ), but that won't change the statement made here.

lalitb · 2021-03-25T17:24:18Z

exporters/etw/README.md

+| HAVE_ABSEIL_VARIANT | Use `absl::variant` instead of `std::variant`                                      |
+| HAVE_TLD      | Use ETW/TraceLogging Dynamic protocol. This is the default implementation.                       |
+| HAVE_CSTRING_TYPE   | Allow passing C-string type `const char *` to API calls.                                 |
+


Nicely summarized the build options. Few of the options ( HAVE_ABSEIL_VARIANT, HAVE_CPP_STDLIB, HAVE_ABSEIL_VARIANT ) are applicable as build instructions for opentelemetry-cpp, do you think we can move them there ( https://github.com/open-telemetry/opentelemetry-cpp/blob/main/INSTALL.md#building-as-standalone-cmake-project ), give it's reference and keep rest of them here?

Yes, but I want to have a separate document created that covers ALL options. Maybe a separate section in INSTALL.md as you are suggesting. For this one - I tried to keep it scoped to what is relevant in this concrete usage scenario.

Besides the description for the options, could we add an extra column to the the pros and even cons of using each option to help users to make decision?

I like the idea of elaborating on what this option is doing. But maybe in that same description column.

I added a few more sentences. HAVE_CPP_STDLIB has its own separate detailed design doc here:
https://github.com/open-telemetry/opentelemetry-cpp/blob/main/docs/building-with-stdlib.md

lalitb · 2021-03-25T17:40:34Z

exporters/etw/README.md

+  // Each Tracer is associated with unique TraceId.
+  auto tracer = tp.GetTracer(providerName, "TLD");
+
+  // Properties is a helper class in ETW namespace that is otherwise compatible


Sorry if I am missing something, but where is this helper class Properties defined in ETW namespace ? Had a quick look to the code, but didn't find it there ?

Sorry, I might have missed the using aliases. I'll refresh it. It's presently defined here, in a PR which contains the code. I wanted to scope this PR to documentation only. This is the code:

opentelemetry-cpp/exporters/etw/test/etw_tracer_test.cc

Line 26 in 48991c0

using Properties = opentelemetry::exporter::ETW::Properties;

using Properties = opentelemetry::exporter::ETW::Properties; using PropertyValue = opentelemetry::exporter::ETW::PropertyValue; using PropertyValueMap = opentelemetry::exporter::ETW::PropertyValueMap;

Going forward I want to benchmark the pros and cons of passing a container vs. initializer list, the topic we discussed with Johannes above. In general, the concept is not necessarily unique to ETW. It could be applied to other exporters. Maybe I'd propose a refactor of pulling it in a separate (optional) helper class on API surface. Users may still use the existing initializer lists as well, as this does not break the "API". There is a moment where we have to be careful about "ABI". But since the implementation is header-only and linked straight into the app, with the matching runtime, the "ABI" compatibility issue is not applicable in this scenario. I can elaborate on this offline.

I also want to make sure that even if we pass in a reference to map, if there is a need to involve a transform to API call across the ABI boundary, then the container itself is KeyValueIterable-interface compatible. Means the SDK code may iterate over that one in ABI safe manner, and perform a copy of all key-value pairs to the actual container beyond the ABI boundary. This will only be needed for other (non-ETW) exporters, and ETW exporter is lucky in this regard. As it's currently done fully as header-only, slim, very few hops to the actual "Event Write" routine, with no shared library. Maybe I can contribute an example that can fork the incoming event properties container (Event) to both - ETW and some other destination, such as Standard Output exporter for illustrative purposes (in contrib repo). Basically, an example that shows how to forward that container, dual-home to two TraceProvider. An example of a MetaTraceProvider, that aggregates within itself two different OpenTelemetry provider implementations.

lalitb

LGTM - perhaps we can add scenario for no-memcpy as that would be helpful.

maxgolov · 2021-03-29T19:30:38Z

@pyohannes - I will add a separate benchmark to measure, and try to supplement the document (in a separate PR) with concrete examples where initializer list-supplied data is better, and where prepared container (map) is better.

exporters/etw/README.md

…-cpp into maxgolov/etw_howto

maxgolov added 2 commits March 24, 2021 14:06

Turn off MDL length limit

bd28bb0

Add ETW exporter instrumentation guidance

fbd8142

maxgolov requested a review from a team March 25, 2021 01:15

Fix formatting nit

c7111e0

maxgolov changed the title ~~Maxgolov/etw howto~~ ETW exporter how-to - usage instructions document Mar 25, 2021

pyohannes reviewed Mar 25, 2021

View reviewed changes

lalitb reviewed Mar 25, 2021

View reviewed changes

lalitb approved these changes Mar 29, 2021

View reviewed changes

ThomsonTan reviewed Mar 29, 2021

View reviewed changes

exporters/etw/README.md Show resolved Hide resolved

ThomsonTan reviewed Mar 29, 2021

View reviewed changes

exporters/etw/README.md Outdated Show resolved Hide resolved

ThomsonTan reviewed Mar 29, 2021

View reviewed changes

exporters/etw/README.md Outdated Show resolved Hide resolved

exporters/etw/README.md Show resolved Hide resolved

maxgolov added 2 commits March 29, 2021 19:21

Merge branch 'main' of http://github.com/open-telemetry/opentelemetry…

3616cfc

…-cpp into maxgolov/etw_howto

Addressing code review comments

fb4f5d3

ThomsonTan approved these changes Mar 30, 2021

View reviewed changes

Merge branch 'main' into maxgolov/etw_howto

16e94a1

lalitb merged commit 6337225 into main Mar 30, 2021

lalitb deleted the maxgolov/etw_howto branch March 30, 2021 10:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ETW exporter how-to - usage instructions document #628

ETW exporter how-to - usage instructions document #628

maxgolov commented Mar 25, 2021 •

edited

Loading

codecov bot commented Mar 25, 2021 •

edited

Loading

pyohannes Mar 25, 2021

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

lalitb Mar 26, 2021 •

edited

Loading

maxgolov Mar 27, 2021 •

edited

Loading

maxgolov Mar 27, 2021 •

edited

Loading

lalitb Mar 25, 2021

maxgolov Mar 25, 2021

ThomsonTan Mar 29, 2021

maxgolov Mar 30, 2021

maxgolov Mar 30, 2021

lalitb Mar 25, 2021

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

lalitb left a comment

maxgolov commented Mar 29, 2021

ETW exporter how-to - usage instructions document #628

ETW exporter how-to - usage instructions document #628

Conversation

maxgolov commented Mar 25, 2021 • edited Loading

Documentation

codecov bot commented Mar 25, 2021 • edited Loading

Codecov Report

pyohannes Mar 25, 2021

Choose a reason for hiding this comment

maxgolov Mar 25, 2021 • edited Loading

Choose a reason for hiding this comment

maxgolov Mar 25, 2021 • edited Loading

Choose a reason for hiding this comment

maxgolov Mar 25, 2021 • edited Loading

Choose a reason for hiding this comment

lalitb Mar 26, 2021 • edited Loading

Choose a reason for hiding this comment

maxgolov Mar 27, 2021 • edited Loading

Choose a reason for hiding this comment

maxgolov Mar 27, 2021 • edited Loading

Choose a reason for hiding this comment

lalitb Mar 25, 2021

Choose a reason for hiding this comment

maxgolov Mar 25, 2021

Choose a reason for hiding this comment

ThomsonTan Mar 29, 2021

Choose a reason for hiding this comment

maxgolov Mar 30, 2021

Choose a reason for hiding this comment

maxgolov Mar 30, 2021

Choose a reason for hiding this comment

lalitb Mar 25, 2021

Choose a reason for hiding this comment

maxgolov Mar 25, 2021 • edited Loading

Choose a reason for hiding this comment

maxgolov Mar 25, 2021 • edited Loading

Choose a reason for hiding this comment

lalitb left a comment

Choose a reason for hiding this comment

maxgolov commented Mar 29, 2021

maxgolov commented Mar 25, 2021 •

edited

Loading

codecov bot commented Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

lalitb Mar 26, 2021 •

edited

Loading

maxgolov Mar 27, 2021 •

edited

Loading

maxgolov Mar 27, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading

maxgolov Mar 25, 2021 •

edited

Loading