Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes and Improvements to botocore instrumentation #150

Merged

Conversation

NathanielRN
Copy link
Contributor

@NathanielRN NathanielRN commented Nov 5, 2020

Description

This PR aims to update the botocore instrumentation by doing the following:

Fixes:

  • Removed code that modified span.resource since resource should be immutable
  • Change SpanKind from CONSUMER to CLIENT because botocore api calls can be considered remote outgoing and synchronous requests requests as laid out in the SpanKind specifications - Important so that AWS services are treated as separate services in Tracing Backends. (See picture below!)
  • Use instrumentation.utils.unwrap instead of creating the function again

Simplifications:

  • Remove code that searches for nested attributes using deep_getattr and instead pulls values from properties of the botocore.BaseClient class since they should always be there
  • Removed params from attributes. There is a PR to add more specific semantic conventions for which params should be trace in AWS SDKs in the specifications so the plan was to revisit when it gets merged

Improvements:

  • More comprehensive logic to parse out request_id if it can be found in the headers instead of the usual location. Just like .NET botocore instrumentation
  • Add aws.table_name and aws.queue_url for DynamoDB and SQS specifically as traced properties (this will be added in the specifications)
  • Format tests checking of attributes to look nice!

Type of change

Please delete options that are not relevant.

  • Breaking change (fix or feature that would cause existing functionality to not work as expected) - Not all params are recorded now
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • With the change to SpanKind.CLIENT, S3 (and other services) are considered their own services

Before:
image

After:
image

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@NathanielRN NathanielRN requested review from a team, owais and codeboten and removed request for a team November 5, 2020 21:19
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch from e566384 to f098c5c Compare November 5, 2020 21:22
@NathanielRN NathanielRN marked this pull request as draft November 5, 2020 23:42
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch 7 times, most recently from 27fbdaa to 2473eff Compare November 7, 2020 02:34
)
self.assertEqual(spans[1].attributes["params.Key"], str(params["Key"]))
self.assertTrue("params.Body" not in spans[1].attributes.keys())
get_object_attributes = dict(spans[2].attributes)
self.assertEqual(
Copy link
Contributor Author

@NathanielRN NathanielRN Nov 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why RequestId doesn't return here, but I raised an issue to ask if my setup is wrong.

getmoto/moto#3442

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a reply confirming that the RequestId is missing because its simply not added to the S3 "mock" of the service in moto, so this is the best we can do right now :)

@NathanielRN NathanielRN marked this pull request as ready for review November 7, 2020 02:46
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch 3 times, most recently from cb661e3 to b1ac40f Compare November 12, 2020 23:13
@NathanielRN NathanielRN changed the title Simplify attributes for botocore instrumentation Fixes and Improvements to botocore instrumentation Nov 13, 2020
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch from b1ac40f to b423b0c Compare November 13, 2020 18:45
{
"aws.operation": "DescribeInstances",
"aws.region": "us-west-2",
"aws.request_id": "fdcdcab1-ae5c-489e-9c33-4637c5dda355",
Copy link
Contributor Author

@NathanielRN NathanielRN Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the IDs are sometimes consistent and sometimes not with moto.

Sometimes the ID is consistent like this EC2 call:

{
    'Reservations': [],
    'ResponseMetadata': {'RequestId': 'fdcdcab1-ae5c-489e-9c33-4637c5dda355',
    'HTTPStatusCode': 200,
    'HTTPHeaders': {'server': 'amazon.com'},
    'RetryAttempts': 0}
}

and other times it's randomly generated like in this SQS call:

{
    'ResponseMetadata': {
        'RequestId': 'JWH6CYKFXYT2KXQSAYY50T6FK558XMQ7U9SOWLJ5YEQZOKULQ012',
        'HTTPStatusCode': 200,
        'HTTPHeaders': {'server': 'amazon.com',
        'x-amzn-requestid': 'JWH6CYKFXYT2KXQSAYY50T6FK558XMQ7U9SOWLJ5YEQZOKULQ012',
        'x-amz-crc32': '1416772655'},
        'RetryAttempts': 0
    }
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any insights on why RequestId looks different on SQS ? my guess would be they are not generated using same lib or similar standards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I looked into it, and found this code used to generate the request ID in the moto repository:

https://github.com/spulec/moto/blob/9db62d32bf70e18f315305b7915d199e3ba1210a/moto/core/utils.py#L238-L244

It was explained to me that this function is used by some of the resources to attach a request ID.

Notably the amzn_request_id() decorator is used on SQS and DynamoDB and that's why their request IDs look like this 🙂

And some services have hardcoded requests like EC2 DescribeInstances (https://github.com/spulec/moto/blob/50d71eccbee12ce5cf2c1ecf61bfcbd60317976d/moto/ec2/responses/instances.py#L479)

I'm not sure why it doesn't format the request IDs in the expected way (lower case joined with hyphens) because it used to be correct.

In that case, should I remove the regex check and just make it a check for the key? Let me know what you think!

{
"aws.operation": "DescribeInstances",
"aws.region": "us-west-2",
"aws.request_id": "fdcdcab1-ae5c-489e-9c33-4637c5dda355",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any insights on why RequestId looks different on SQS ? my guess would be they are not generated using same lib or similar standards.

("action", "params", "path", "verb"),
{"params", "path", "verb"},
)
error = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare these values at the top ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure can do. I guess I thought it would be better to leave them down here though since they don't need to be in scope at the top?

span.attributes,
{
"aws.operation": "ListObjects",
"aws.region": "us-west-2",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason on why don't we validate status code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is a ParamValidationError the botocore library will never send the HTTP request and so there's no http.status_code to check!

But that's a good point! I can add a check to make sure this span was recorded as an error.

self.assertIs(
    span.status.status_code, trace_api.status.StatusCode.ERROR,
)

@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch from b423b0c to 9b90b01 Compare November 14, 2020 01:22
}
if isinstance(dict_, dict)
else {prefix: dict_}
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks like these functions don't depend on anything from the outer context and take everything they need as arguments so do we really need them to be defined in line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be moved into opentelemetry.instrumentation.utils anyways. I'll keep the Stack Overflow comment and add a comment about it being better suited as a util function for now.

if not span.is_recording():
return

if endpoint_name not in {"kms", "sts"}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment here explaining why this check exists would be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I just copied this code because I didn't want to change the boto functionality while updating botocore. I can only guess that it's because these are the Key Management Service and Secure Token Service services so maybe their API trace calls should not be traced?

I added a comment mentioning that but I can't guess pass that. It doesn't look like I the commit history says who added these in the first place either.

for key, value in {
k: truncate_arg_value(v)
for k, v in tags.items()
if k not in {"s3": ["params.Body"]}.get(endpoint_name, [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit cryptic and intention might not be very clear. Mind breaking it down or simplifying? Looks we really want to check something like if endpoint_name == 's3' and k == 'params.Body'? Personally, I'd write it out as a regular loop clearly self-documenting the intent and separating logic to multiple lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really agree with you, thanks for the tip! I've updated it to look cleaner, please let me know what you think!

@NathanielRN NathanielRN requested a review from owais November 16, 2020 19:23
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch 2 times, most recently from 0144715 to 1acf758 Compare November 17, 2020 20:50
Copy link

@bhautikpip bhautikpip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch 2 times, most recently from c9251c3 to 9fe220e Compare November 17, 2020 22:21
@NathanielRN NathanielRN force-pushed the update-botocore-instrumentation branch from 9fe220e to 467e513 Compare November 18, 2020 17:15
@codeboten codeboten merged commit fd493f4 into open-telemetry:master Nov 18, 2020
@NathanielRN NathanielRN deleted the update-botocore-instrumentation branch November 25, 2020 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants