Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ETW exporter how-to - usage instructions document #628

Merged
merged 6 commits into from
Mar 30, 2021
Merged

Conversation

maxgolov
Copy link
Contributor

@maxgolov maxgolov commented Mar 25, 2021

Related to #326

Documentation

This is a documentation that explains how to use OpenTelemetry C++ SDK with ETW exporter on Windows. The main idea is that application is instrumented using OpenTelemetry in a platform-agnostic way . And may subsequently utilize ETW exporter, if that application is running on Windows OS. I included a few examples that allow to utilize vendor-agnostic flow that is not tied to any Microsoft offering or service. This could be achieved in many various ways, but I am referring to Microsoft.Azure.EventFow as an example of a side-car that may listen to events emitted over ETW channel. The other alternate project that may be utilized as a side-car for this purpose is SilkETW.

There are two operational modes in ETW exporter:

  • ETW / TraceLoggingDynamic mode - so called Dynamic Manifest, already supported by any ETW Listener, best supported in C# / .NET. But it requires a header that we are working on releasing as opensource later this year. The closest analog to what it is doing is available as GoLang implementation here.

  • ETW / MsgPack mode - this one is entirely opensource at this point. And the usage example that I added in the document shows how it can be used today, with an out-of-proc listener that is capable of receiving ETW events, decoding their MsgPack-encoded payload to structured objects, then subsequently forwarding those objects to any destination of their choice.

I also added a mini TODO list in the Addendum section, to elaborate on further evolution plans, specifically on how OpenTelemetry C++ API calls in native code can be expressed as ETW events, that may subsequently be processed out-of-proc and forwarded to OpenTelemetry.NET, for example. That way preserving the uninterrupted flow of OpenTelemetry events further to OpenTelemetry-compatible destination, merely using ETW as a IPC channel for data passing on Windows. Because it is robust, efficient and already optimized for that purpose in the most optimal way imaginable.

There are no code changes, just the document. All code changes go in a separate PR here #519 (there are still a few moments that I'd like to finish in code before re-releasing the code for review).

Corresponding E2E example, with sample code, and ETW listener using EventFlow will be added separately to C++ contrib repo. We can discuss what prominent apps and services running on Windows (Docker, Apache, IIS) may be instrumented using that flow. Ideally I would envision the Docker replacing their ETW Logging Driver with OpenTelemetry. We can also discuss how similar approach, concepts, and structure may be applied in the implementation of a similar exporter in GoLang for Windows using OpenTelemetry Go SDK and go-winio package (in a separate PR, and in a separate repo). But many concepts shown here in C++ may be universally applicable to other languages in general.

@maxgolov maxgolov requested a review from a team March 25, 2021 01:15
@maxgolov maxgolov changed the title Maxgolov/etw howto ETW exporter how-to - usage instructions document Mar 25, 2021
@codecov
Copy link

codecov bot commented Mar 25, 2021

Codecov Report

Merging #628 (16e94a1) into main (62c6b13) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #628   +/-   ##
=======================================
  Coverage   94.49%   94.49%           
=======================================
  Files         197      197           
  Lines        9130     9130           
=======================================
  Hits         8627     8627           
  Misses        503      503           

Comment on lines +65 to +67
// Properties is a helper class in ETW namespace that is otherwise compatible
// with Key-Value Iterable accepted by OpenTelemetry API. Using Properties
// should enable more efficient data transfer without unnecessary memcpy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit misleading in the context of this example. As far as I can see, none of the usages of Properties here is more efficient than what the API provides out of the box. There's no memcpy involved in this case:

span->AddEvent(eventName, {
    {"uint32Key", (uint32_t)1234},
    {"uint64Key", (uint64_t)1234567890},
    {"strKey", "someValue"}
});

Copy link
Contributor Author

@maxgolov maxgolov Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer - really depends. When we need to decorate the properties with additional attributes, such as - passing non-const properties, append some common properties... Maybe a set of common properties unique to custom TracerProvider . For example - some sort of custom correlation technique that needs to be appended on every record. Let's say, some custom field called cV and stamped on every Properties object passed in; OR appending a group of other device.* or app.* fields on top.. Then we could have reused the original container that was created in the customer code once, then passing it around as non-const by reference on that same thread. Thus, not requiring a transform to Recordable , and not needing the key-value iteration on it. And not needing to construct it in the SDK from initializer list as an argument...

I think I should have moved this comment from Attributes down to event Properties below, to make a stronger case 😄 Because with Attributes, it appears that what you are suggesting is indeed looking nicer. Maybe I should capture both. I can remove the comment if you don't like it, but I see scenarios where it would be more efficient.

Copy link
Contributor Author

@maxgolov maxgolov Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave this code reference as an example where accepting the collection, then passing this original collection around by reference, decorating it with additional contents, stripping unneded fields, etc. would be more efficient than trying to construct a new object (or a copy of an object) from initializer list. I think I understand your concern that I should've elaborated on exactly what scenarios would (in future) may benefit from passing in the original collection. This should work great for a header-only library, as we later on may completely bypass the copy/iteration/creation of a new object from initializer list, operating on original customer-code created object.

Structure-wise, both initializations look similar: initializer list as an argument to template, or a reference to container that's been already populated in customer code.

We mostly follow this pattern in this "other" library elsewhere:
https://github.com/microsoft/cpp_client_telemetry/blob/master/lib/api/Logger.cpp
And the practice of passing around the container prepared in customer code, e.g. literally passing around a modifyable map of variants, - has been working well. Especially when the original collection passed to SDK needs additional "decoration".

Copy link
Contributor Author

@maxgolov maxgolov Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is another scenario: let's say your collection does not change and contains something like appName=x,status=Failed,scenario=FileOpen. You'd prepare it once in a static initializer. Then you pass it as a reference to Properties to AddEvent, never destroying it, thus - never needing to construct/reconstruct it. Appears like in the other case (if you used the initializer list instead of passing collection by reference), you'd end up constructing it from Key-Value Iterable every time? It appears like construct-once-then-reuse will be more optimal then, at least for a statically linked code where there is no passing across ABI boundary needed, no key-value iterable transform. I can add some benchmarks to measure those scenarios in my other (code) PR.

Copy link
Member

@lalitb lalitb Mar 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the specific use-case/scenario as an example in this doc or in unit-test ( and refer it here ) to justify the performance statement. As the current api/sdk implementation, the only place where memcpy MAY happen is during Recordable::AddEvent() execution, but that would depend how exporter implements this logic.

Copy link
Contributor Author

@maxgolov maxgolov Mar 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only place where memcpy MAY happen is during Recordable::AddEvent() execution

Well, that is the most commonly invoked function, actually. In ETW use case - it is entirely header-only, and it entirely avoids transform to Recordable upfront. The idea is that Properties may not even need to be transformed to Recordable. Because it stays valid as a container - for the duration of invocation of synchronous AddEvent call, that immediately performs serialization of event from original container + decoration into either ETW/TraceLogging (binary packed in standard TraceLogging format) or ETW/MessagePack. That (transform + IPC) is done right away on user calling thread with no background batching.

In case of other Key-Value iterable container, e.g. in case of a map -- there is a need to iterate for each key-value (as you mentioned, in case of batching exporter and `Recordable). I'll illustrate the difference between when the memcpy transform is unavoidable, and where the original container may be reused.

Copy link
Contributor Author

@maxgolov maxgolov Mar 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pushing an update to my code PR to explain this further ( +1 benchmark, yay! ), but that won't change the statement made here.

| HAVE_ABSEIL_VARIANT | Use `absl::variant` instead of `std::variant` |
| HAVE_TLD | Use ETW/TraceLogging Dynamic protocol. This is the default implementation. |
| HAVE_CSTRING_TYPE | Allow passing C-string type `const char *` to API calls. |

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely summarized the build options. Few of the options ( HAVE_ABSEIL_VARIANT, HAVE_CPP_STDLIB, HAVE_ABSEIL_VARIANT ) are applicable as build instructions for opentelemetry-cpp, do you think we can move them there ( https://github.com/open-telemetry/opentelemetry-cpp/blob/main/INSTALL.md#building-as-standalone-cmake-project ), give it's reference and keep rest of them here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I want to have a separate document created that covers ALL options. Maybe a separate section in INSTALL.md as you are suggesting. For this one - I tried to keep it scoped to what is relevant in this concrete usage scenario.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the description for the options, could we add an extra column to the the pros and even cons of using each option to help users to make decision?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of elaborating on what this option is doing. But maybe in that same description column.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few more sentences. HAVE_CPP_STDLIB has its own separate detailed design doc here:
https://github.com/open-telemetry/opentelemetry-cpp/blob/main/docs/building-with-stdlib.md

// Each Tracer is associated with unique TraceId.
auto tracer = tp.GetTracer(providerName, "TLD");

// Properties is a helper class in ETW namespace that is otherwise compatible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I am missing something, but where is this helper class Properties defined in ETW namespace ? Had a quick look to the code, but didn't find it there ?

Copy link
Contributor Author

@maxgolov maxgolov Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I might have missed the using aliases. I'll refresh it. It's presently defined here, in a PR which contains the code. I wanted to scope this PR to documentation only. This is the code:

using Properties = opentelemetry::exporter::ETW::Properties;

using Properties       = opentelemetry::exporter::ETW::Properties;
using PropertyValue    = opentelemetry::exporter::ETW::PropertyValue;
using PropertyValueMap = opentelemetry::exporter::ETW::PropertyValueMap;

Going forward I want to benchmark the pros and cons of passing a container vs. initializer list, the topic we discussed with Johannes above. In general, the concept is not necessarily unique to ETW. It could be applied to other exporters. Maybe I'd propose a refactor of pulling it in a separate (optional) helper class on API surface. Users may still use the existing initializer lists as well, as this does not break the "API". There is a moment where we have to be careful about "ABI". But since the implementation is header-only and linked straight into the app, with the matching runtime, the "ABI" compatibility issue is not applicable in this scenario. I can elaborate on this offline.

Copy link
Contributor Author

@maxgolov maxgolov Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also want to make sure that even if we pass in a reference to map, if there is a need to involve a transform to API call across the ABI boundary, then the container itself is KeyValueIterable-interface compatible. Means the SDK code may iterate over that one in ABI safe manner, and perform a copy of all key-value pairs to the actual container beyond the ABI boundary. This will only be needed for other (non-ETW) exporters, and ETW exporter is lucky in this regard. As it's currently done fully as header-only, slim, very few hops to the actual "Event Write" routine, with no shared library. Maybe I can contribute an example that can fork the incoming event properties container (Event) to both - ETW and some other destination, such as Standard Output exporter for illustrative purposes (in contrib repo). Basically, an example that shows how to forward that container, dual-home to two TraceProvider. An example of a MetaTraceProvider, that aggregates within itself two different OpenTelemetry provider implementations.

Copy link
Member

@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - perhaps we can add scenario for no-memcpy as that would be helpful.

@maxgolov
Copy link
Contributor Author

@pyohannes - I will add a separate benchmark to measure, and try to supplement the document (in a separate PR) with concrete examples where initializer list-supplied data is better, and where prepared container (map) is better.

exporters/etw/README.md Outdated Show resolved Hide resolved
exporters/etw/README.md Outdated Show resolved Hide resolved
exporters/etw/README.md Show resolved Hide resolved
@lalitb lalitb merged commit 6337225 into main Mar 30, 2021
@lalitb lalitb deleted the maxgolov/etw_howto branch March 30, 2021 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants