Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ASGIGetter keys to fetch from actual carrier headers #1435

Merged
merged 14 commits into from
Jan 12, 2023
Merged

Fix ASGIGetter keys to fetch from actual carrier headers #1435

merged 14 commits into from
Jan 12, 2023

Conversation

hasier
Copy link
Contributor

@hasier hasier commented Nov 10, 2022

Description

Working in ASGI with any propagator using getter.keys(), e.g. the GCP propagator when no header is present in a request, it breaks with the below message:

Traceback (most recent call last):
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 404, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 531, in __call__
    span, token = _start_internal_or_server_span(
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/instrumentation/utils.py", line 111, in _start_internal_or_server_span
    ctx = extract(context_carrier, getter=context_getter)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/propagate/__init__.py", line 102, in extract
    return get_global_textmap().extract(carrier, context, getter=getter)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/propagators/composite.py", line 52, in extract
    context = propagator.extract(carrier, context, getter=getter)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/propagators/cloud_trace_propagator/__init__.py", line 94, in extract
    header = self._get_header_value(getter, carrier)
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/propagators/cloud_trace_propagator/__init__.py", line 78, in _get_header_value
    for key in getter.keys(carrier):
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 263, in keys
    return [_key.decode("utf8") for (_key, _value) in carrier]
  File "/root/.cache/pypoetry/virtualenvs/api-9TtSrW0h-py3.10/lib/python3.10/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 263, in <listcomp>
    return [_key.decode("utf8") for (_key, _value) in carrier]
ValueError: too many values to unpack (expected 2)

In my case, this seems to be triggered by the fact that the GCP propagator uses getter.keys() as a fallback. This keys() function in the ASGIGetter seems to be taking the header values from an incorrect place, and therefore provoking the crash. I found it with the GCP example, but it's probably affecting others, as the issue lays with the definition of ASGIGetter.keys() here.

Fixes #1487

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Using any propagator that makes use of getter.keys() on an ASGI context makes the request fail. It's easy to reproduce with the GCP CloudTraceFormatPropagator when the request does not have a x-cloud-trace-context header.
  • The added test changes make tests fail if the fix is not included.

NOTE: I've re-created opentelemetry.test.mock_textmap.MockTextMapPropagator with a tiny modification to use keys() from the getter. I understand it might be interesting to add these changes to the original testing repo as well, as it ensures all keys that .get() returns are in .keys() as well, but thought it'd be interesting to have this fix out soon and then potentially port it there if it seemed useful/appropriate to y'all.

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@hasier hasier requested a review from a team November 10, 2022 13:10
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 10, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@bluesliverx
Copy link

bluesliverx commented Nov 15, 2022

We just hit this same problem, not in GCP, and it's forcing us to downgrade to 0.34b0. We applied this patch manually to 0.35b0 and it also worked for us.

@ocelotl
Copy link
Contributor

ocelotl commented Nov 21, 2022

gcp/src/opentelemetry/propagators/cloud_trace_propagator/init.py#L78). This all() call in the ASGIGetter seems to be taking the header values from an incorrect place, and therefore provoking the crash.

Sorry, which all() call?

@hasier
Copy link
Contributor Author

hasier commented Nov 21, 2022

gcp/src/opentelemetry/propagators/cloud_trace_propagator/init.py#L78). This all() call in the ASGIGetter seems to be taking the header values from an incorrect place, and therefore provoking the crash.

Sorry, which all() call?

Ah sorry @ocelotl, I meant the keys() function, just updated the description 🙏

# Conflicts:
#	CHANGELOG.md
# Conflicts:
#	CHANGELOG.md
@hasier
Copy link
Contributor Author

hasier commented Nov 28, 2022

@ocelotl is this PR something you can help with reviewing? Or maybe @rogersd @srikanthccv, as I've seen you working in the ASGI area recently? Sorry for the wide ping, it just feels to me like an easy fix for a bug that can become quite common and important.

@@ -260,7 +260,8 @@ def get(
return decoded

def keys(self, carrier: dict) -> typing.List[str]:
return [_key.decode("utf8") for (_key, _value) in carrier]
headers = carrier.get("headers") or []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, if "headers" is not in carrier.keys, shouldn't this method return [_key.decode("utf8") for (_key, _value) in carrier]. The way this is being modified makes this method return an empty list if "headers" is not in carrier.keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the .get() method above returns header-specific fields and its description mentions "header name in scope", I assumed here we'd only want to return headers rather than the whole carrier, which in this case is the ASGI scope. I'm not too sure fetching all values from the scope is correct though, there's a bunch of ASGI-specific info there and I wouldn't expect to see (nor inject) any extra info from Opentelemetry.

Now that you mention it though, I guess some people may be relying on this behaviour now, so maybe we should join both lists? Or would we treat this as a breaking change instead?

Copy link

@torqataNate torqataNate Dec 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that this should be keyed to headers, but we do need to be conscious of other formats. For example as we look through most proprogators, the common use is for the carrier to only contain header keys. The most relatable part of the spec docs are here.
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md#:~:text=In%20order%20to%20increase%20compatibility%2C
The carrier of propagated data on both the client (injector) and server (extractor) side is usually an HTTP request. In order to increase compatibility, the key/value pairs MUST only consist of US-ASCII characters that make up valid HTTP header fields as per [RFC 7230](https://tools.ietf.org/html/rfc7230#section-3.2).

The http request will have other keys with it such as client, method, query params etc which we dont care about, we only need the headers. It is only the TextMapPropagator section that states the carrier is usually a http request though even though that does seem to be the standard to supply the http payload as a carrier. As we look at the spec on proprogators, It actually doesn't specify whats in a carrier. Which is here https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/api-propagators.md#carrier

In the branch i was going to make a pull request against i searched for the headers key first and iderated over that only, and if that did not exist, I chose the carrier payload as it should then be a header.

The argument could be made that we need to look through the entire carrier payload for headers, as the format of carrier has not been specified. But it feels burdensome for this method to have to deal with that.

Lastly
The above pull request would also fail on line 264 when a header isn't a byte type so it is still broken. I had a pull request i was going to submit but im not a contributor. how would you like me to submit the proposed change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as this processor is concerned, it has been working until this change. The previous behaviour was not correct either, but at least it would not crash. The changes in this PR directly address the main issue, which is the crash due to iterating the wrong value, and fetching the correct value the same way we do in .get(). I would argue that any more changes in the behaviour than that are out of scope and probably require a separate discussion and investigation. In any case, something for the maintainers to decide CC @ocelotl @srikanthccv

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a correct fix both for the crash and behaviour because the web framework instrumentation may receive carriers of different types, but their goal is to inject/extract the trace context information into the HTTP header, which is used for propagation. And since the ASGI specification says the headers is Iterable[[byte string, byte string]] I think it's safe to assume keys will be a byte string so it's not broken.

# Conflicts:
#	CHANGELOG.md
@hasier
Copy link
Contributor Author

hasier commented Jan 10, 2023

@ocelotl @rogersd @srikanthccv @shalevr sorry for the wide ping again, just wanted to give another nudge as it's been a few weeks since I first opened the PR. A review with improvements would be appreciated in order to get this closer to approval. Thanks.

@srikanthccv
Copy link
Member

I will take a look at it.

@srikanthccv srikanthccv self-assigned this Jan 10, 2023
set_global_textmap(MockTextMapPropagator())

def tearDown(self):
set_global_textmap(self.previous_propagator)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't follow what the tests were targeting. I would test that ASGIGetter instance doesn't throw an exception for scope dict, which is currently failing, and additionally assert that keys return the Headers keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to ensure that the values returned by get() and keys() were the same and they were consistent across different tests, but it was a bit of a hack indeed with that custom propagator. I added simpler tests in test_getter.py instead that indeed fail before the fix and pass now, hope they make sense c219bb0

# Conflicts:
#	CHANGELOG.md
@cipherself
Copy link

Assuming the current github workflow checks pass and the PR is merged, when do the maintainers expect a release that has this included to be released?

@srikanthccv
Copy link
Member

A periodic monthly release is happening this week (or early next week).

@hasier
Copy link
Contributor Author

hasier commented Jan 12, 2023

@srikanthccv I had to update the branch with changes from main, please re-approve and merge whenever you have a few mins, as I don't have permissions to do so as a first-time contributor 😊🙏

@srikanthccv srikanthccv enabled auto-merge (squash) January 12, 2023 16:34
@srikanthccv srikanthccv merged commit 092d8c8 into open-telemetry:main Jan 12, 2023
@hasier hasier deleted the fix-asgi-keys branch January 13, 2023 09:56
james-emerton pushed a commit to james-emerton/opentelemetry-python-contrib that referenced this pull request Jan 19, 2023
@Lawouach
Copy link

Lawouach commented Feb 9, 2023

Hello @srikanthccv, unless I'm mistaken this has not been released yet right? Any chance this gets bumped as it's a bit of a challenge without this fix out. Thanks :)

@hasier
Copy link
Contributor Author

hasier commented Feb 9, 2023

Hello @srikanthccv, unless I'm mistaken this has not been released yet right? Any chance this gets bumped as it's a bit of a challenge without this fix out. Thanks :)

afaik it's not been released indeed, last published version seems to be v0.36b0, which still contains the incorrect behaviour. In case it helps, while no new version is out, in our repos we have temporarily addressed it by overriding the default asgi getter.

from opentelemetry.instrumentation import asgi

class ASGIGetter(asgi.ASGIGetter):
    """
    Override default getter implementation while `.keys()` fix is not out
    https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1435
    """
    def keys(self, carrier: dict[str, typing.Any]) -> list[str]:
        headers = carrier.get("headers") or []
        return [_key.decode("utf8") for (_key, _value) in headers]

def setup_tracing() -> None:
    # Override default asgi_getter while `.keys()` fix is not out
    # https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1435
    asgi.asgi_getter = ASGIGetter()
    ...

@Lawouach
Copy link

Lawouach commented Feb 9, 2023

Thanks @hasier

A sound proposal that I'll have to review. I'm not using the ASGI extension directly but through FastAPI. Cheers.

@theobouwman
Copy link

Is this fix already merged @hasier ?

@hasier
Copy link
Contributor Author

hasier commented Nov 6, 2023

Is this fix already merged @hasier ?

@theobouwman the fix was merged on January 12 and included as part of Version 1.16.0/0.37b0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ASGI Instrumentation ValueError: too many values to unpack (expected 2) during JaegerPropagator
8 participants