Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shared use of node-fetch by app and resource-detector-gcp can result in https module not getting instrumented #2440

Open
sjparkinson opened this issue Sep 20, 2024 · 4 comments
Labels
bug Something isn't working priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect

Comments

@sjparkinson
Copy link

What version of OpenTelemetry are you using?

{
    "@opentelemetry/api": "^1.9.0",
    "@opentelemetry/auto-instrumentations-node": "^0.50.0",
    "dotenv": "^16.4.5",
    "express": "^4.21.0",
    "node-fetch": "^2.7.0"
}

What version of Node are you using?

v20.16.0

What did you do?

Tried to get auto-instrumented metrics out of uses of node-fetch@v2.

Basic express app setup that doesn't work:

https://gist.github.com/sjparkinson/3cd2061c1c8c7257787f15a66b5e574c/12075e305e88b53ae1556a5ea5b4cdcb96810b04

What did you expect to see?

Traces and metrics from instrumentation-http for uses of node-fetch.

What did you see instead?

Only traces from instrumentation-net for uses of node-fetch.

Additional context

As reported by Trent in the CNCF Slack, it looks to be an issue with resource-detector-gcp requiring gaxios which requires the two Node.js http/https modules before they are instrumented.

  1. --require @opentelemetry/auto-instrumentations-node/register loads auto-instrumentations-node, which indirectly loads gaxios, which has require('node-fetch'). node-fetch requires http and https.
  2. Then later, the OTel HTTP instrumentation is initialized: adding the hooks so that when http and https are next required, they will be instrumented.
  3. Then your app.js starts. When it runs require('node-fetch') it gets the cached module from require.cache, so there is no require('https') that results from that call.
  4. Notably express does not require('https'). It does require('http').

So, require('https') does not happen after the OTel SDK is started and hence it is not
instrumented.

@sjparkinson sjparkinson added the bug Something isn't working label Sep 20, 2024
@trentm trentm changed the title Node.js https module is not auto-instrumented shared use of node-fetch by app and resource-detector-gcp can result in https module not getting instrumented Sep 20, 2024
@trentm
Copy link
Contributor

trentm commented Sep 20, 2024

Re-stating the issue. The application use OTel via:

node --require @opentelemetry/auto-instrumentations-node/register app.js

The application is using node-fetch to make an https call:

const fetch = require('node-fetch');
// ...
    fetch('https://httpbin.org/post', ...)

Problem: there is no HTTP client span for the outgoing https://httpbin.org/post request.

the issue

The require-in-the-middle hook to instrument https does not get called, so the use of https.request(...) in the node-fetch module isn't instrumented.

Both OTel and the app are using the same node_modules/node-fetch installation:

% npm ls node-fetch
test-otel@1.0.0 /Users/trentm/tmp/oteljs-node-fetch-q
├─┬ @opentelemetry/auto-instrumentations-node@0.50.0
│ └─┬ @opentelemetry/resource-detector-gcp@0.29.11
│   └─┬ gcp-metadata@6.1.0
│     └─┬ gaxios@6.7.1
│       └── node-fetch@2.7.0 deduped
└── node-fetch@2.7.0

The node-fetch module gets required (and cached) before the HttpInstrumentation is initialized, because it is loaded by @opentelemetry/auto-instrumentations-node. If nothing else in the application code does require('https'), then the require-in-the-middle hook doesn't get called.

A workaround is for the application code manually require('https') to trigger the require-in-the-middle hook, but that is too obtuse.

The general issue. Assuming OTel bootstrapping (e.g. via auto-instrumentations-node/register or otherwise) is done first (via --require or --import), then the general issue is: If any of the OTel bootstrap code uses a module (like node-fetch) that (a) indirectly uses a module (like https in this case) that will be instrumented and (b) is also used by the application, then there is a potential for that instrumention hook to not happen.

Practically speaking, I don't expect there to be any or many other occurrences of this.

fix options

I'm discussing options for fixing the specific issue with node-fetch, not the "general issue".

  1. Explicitly have instrumentation-http trigger the hook for http and https modules when it is initialized.
  2. Re-write resource-detector-gcp to not use gcp-metadata -> gaxios -> node-fetch. Perhaps feels a little extreme, but FWIW, I vaguely recall that the detector isn't that involved.
  3. Somehow ensure resource-detector-gcp uses a private install of node-fetch, so there isn't a shared usage with application code. One way to do this would be to bundle gcp-metadata, e.g. with esbuild, and install that.

... or other options I haven't thought of.

@trentm
Copy link
Contributor

trentm commented Sep 25, 2024

FYI: discussed this a bit on the OTel JS SIG today: https://docs.google.com/document/d/1tCyoQK49WVcE-x8oryZOTTToFm7sIeUhxFPm9g-qL1k/edit#heading=h.rg479jqq3jo5
No obvious option came out. It is a bit of a hard problem.

@JamieDanielson
Copy link
Member

  • Explicitly have instrumentation-http trigger the hook for http and https modules when it is initialized.
  • Re-write resource-detector-gcp to not use gcp-metadata -> gaxios -> node-fetch. Perhaps feels a little extreme, but FWIW, I vaguely recall that the detector isn't that involved.
  • Somehow ensure resource-detector-gcp uses a private install of node-fetch, so there isn't a shared usage with application code. One way to do this would be to bundle gcp-metadata, e.g. with esbuild, and install that.

@trentm beat me to it with the comment but clicking comment anyway for posterity:
Changing gcp detector is possible, but the same problem can come up in other detectors including custom detectors.

@dyladan dyladan added the priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect label Sep 25, 2024
@david-luna
Copy link
Contributor

... or other options I haven't thought of.

If I'm not wrong we had a similar situation with grpc and solved it by doing lazy load of the modules.
ref: open-telemetry/opentelemetry-js#4432

node-fetch is only needed when detect process starts so we could require the module within the function. At that moment the instrumentation is already instantiated and the hook will work as expected. Maybe this (lazy load of modules) could be a rule of thumb for detectors and similar components.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect
Projects
None yet
Development

No branches or pull requests

5 participants