Add outcome to transactions and spans #299

Merged
merged 9 commits into from
Aug 24, 2020

@felixbarny felixbarny commented Jul 17, 2020

Background
In the APM UI, we want to add a chart that displays the error rate. Currently, we calculate the rate as the ratio of errors to transactions in a given time range. That ratio is not a meaningful error rate, because a single transaction can have multiple errors and errors can also occur outside of any transaction.

Solution
One solution is a flag on the transaction that indicates whether the transaction is erroneous. The error rate for a given time range would then be `total_number_of_transactions_with_errors / total_number_of_transactions`.
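
As a rough illustration of the proposed calculation, here is a minimal sketch. It assumes each transaction in the time range has already been reduced to a boolean "has errors" flag; the function name `error_rate` and the sample data are hypothetical and not part of any agent or the APM UI.

```python
from typing import Iterable, Optional


def error_rate(has_error_flags: Iterable[bool]) -> Optional[float]:
    """Return transactions_with_errors / total_transactions.

    Returns None when there are no transactions, since the ratio is
    undefined in that case.
    """
    total = 0
    with_errors = 0
    for flag in has_error_flags:
        total += 1
        if flag:
            with_errors += 1
    return with_errors / total if total else None


# Hypothetical sample: four transactions in the time range, one erroneous.
sample = [False, True, False, False]
print(error_rate(sample))
```

Because each transaction contributes at most one to the numerator, the result always stays between 0 and 1, regardless of how many errors a single transaction recorded.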

supersedes discussion issue: #281

| Agent | Milestone | Link to agent implementation issue |
|---|---|---|
| .NET | 7.10 | elastic/apm-agent-dotnet#940 |
| Go | 7.10 | elastic/apm-agent-go#799 |
| Java | 7.10 | elastic/apm-agent-java#1354 |
| Node.js | ? | elastic/apm-agent-nodejs#1814 |
| PHP | ? | elastic/apm-agent-php#139 |
| Python | 7.10 | elastic/apm-agent-python#899 (reference impl) |
| Ruby | 7.10 | elastic/apm-agent-ruby#852 |
| RUM | 7.10 | elastic/apm-agent-rum-js#876 |

@axw axw left a comment

Looks good! IMO we should maintain full ECS fidelity, and allow success/failed/unknown/(unspecified).

I think we might want to define outcome for HTTP client spans as well.
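
One way to picture the states mentioned in the comment above is sketched below. The string values follow the allowed values of the ECS `event.outcome` field (`success`, `failure`, `unknown`); modelling "unspecified" as `None` and the names `Outcome` and `serialize_outcome` are assumptions made for illustration, not something the spec prescribes.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    UNKNOWN = "unknown"


def serialize_outcome(outcome: Optional[Outcome]) -> dict:
    """Emit the field only when an outcome was specified at all."""
    if outcome is None:
        return {}  # "unspecified": the field is simply omitted
    return {"outcome": outcome.value}


print(serialize_outcome(Outcome.FAILURE))
print(serialize_outcome(None))
```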

@felixbarny felixbarny added this to the 7.10 milestone Jul 27, 2020
An outcome always makes sense for span or transaction events but sometimes it can't be determined.
Added outcome for http exit spans
Clarify that client errors are still a success for transactions
@felixbarny felixbarny requested review from webmat and axw July 30, 2020 11:38
Co-authored-by: Mathieu Martin <webmat@gmail.com>
@felixbarny felixbarny marked this pull request as ready for review August 11, 2020 07:42
@felixbarny felixbarny requested review from a team as code owners August 11, 2020 07:42
beniwohli added a commit to beniwohli/apm-agent-python that referenced this pull request Aug 11, 2020
This implements elastic/apm#299.

Additionally, the "status_code" attribute has been added to HTTP spans.
@@ -4,8 +4,10 @@ Agents should instrument HTTP request routers/handlers, starting a new transacti

- The transaction `type` should be `request`.
- The transaction `result` should be `HTTP Nxx`, where N is the first digit of the status code (e.g. `HTTP 4xx` for a 404)
- The transaction `outcome` should be `"success"` for HTTP status codes < 500 and `"failure"` for status codes >= 500. \
Status codes in the 4xx range (client errors) are not considered a `failure` as the failure has not been caused by the application itself but by the caller.
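
The two rules in the hunk above for incoming HTTP requests could be sketched as follows. The function names are hypothetical; only the `HTTP Nxx` format and the threshold of 500 come from the spec text.

```python
def transaction_result(status_code: int) -> str:
    """Build the `result` string, e.g. 'HTTP 4xx' for a 404."""
    return f"HTTP {status_code // 100}xx"


def transaction_outcome(status_code: int) -> str:
    """Server errors (>= 500) are failures; client errors (4xx) are not."""
    return "failure" if status_code >= 500 else "success"


for status in (200, 404, 503):
    print(status, transaction_result(status), transaction_outcome(status))
```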

This could be the default behavior, but we should allow users to capture 4xx responses as errors; some users may want to treat 401/403 as errors, for example.

Member Author

Is it enough to just offer the API for now? I'd wait to add another config option until we actually get requests for that.

Contributor

I added APIs for setting the outcome in the Python implementation: elastic/apm-agent-python@ce13f92b63

So a user could do this somewhere in their code if they determine that the transaction should be considered failed:

```python
import elasticapm
elasticapm.set_transaction_failure()
```

- `http.url` (the target URL) \
The captured URL should have the userinfo (username and password), if any, redacted.
- `http.status_code` (the response status code) \
The span's `outcome` should be set to `"success"` if the status code is lower than 400 and to `"failure"` otherwise.
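
For the userinfo redaction mentioned in the `http.url` item above, a minimal sketch using only the Python standard library might look like this. The spec text does not say what the redacted value should look like, so the `REDACTED` mask and the function name `redact_userinfo` are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_userinfo(url: str, mask: str = "REDACTED") -> str:
    """Replace the 'user:password' part of a URL with a fixed mask."""
    parts = urlsplit(url)
    # The userinfo is everything before the last '@' in the network location.
    _userinfo, sep, hostport = parts.netloc.rpartition("@")
    if not sep:
        return url  # no userinfo present, nothing to redact
    return urlunsplit(parts._replace(netloc=f"{mask}@{hostport}"))


print(redact_userinfo("https://user:secret@example.com:8443/path?q=1"))
```

Splitting on the last `@` keeps the host and port exactly as written, including any IPv6 brackets, and only touches the credentials.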

Same comment as above: users should have the flexibility to customize outcomes based on what they consider errors.

Also, it seems like for spans we are considering an erroneous outcome if < 400 and for transactions it is >500.

Member Author

It's >= 400 for spans (includes client errors) and >= 500 for transactions (does not include client errors)
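
The span side of that distinction can be sketched as below; the function name is hypothetical. A 404 seen by an HTTP client span counts as a `failure` for the caller, while the same 404 is still a `success` for the transaction of the service that answered it, because the threshold there is 500.

```python
def span_outcome(status_code: int) -> str:
    """Outcome of an outgoing HTTP call, from the caller's point of view."""
    return "failure" if status_code >= 400 else "success"


for status in (200, 404, 503):
    print(status, span_outcome(status))
```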

@nehaduggal nehaduggal left a comment

++ on the API to set outcomes.

@felixbarny
Member Author

We've got all the required approvals now. This is scheduled to be merged next Monday unless there are objections.

@apmmachine
apmmachine commented Aug 19, 2020

💔 Build Failed

Build stats

  • Build Cause: [Started by timer]

  • Start Time: 2020-08-24T04:40:00.670+0000

  • Duration: 3 min 48 sec

Steps errors

  • Name: Shell Script
    • Description: [2020-08-24T04:42:47.557Z] + git diff --name-only c00acea093440e05e56877684a6ce53370f96054...b0a4652

    • Duration: 0 min 0 sec

    • Start Time: 2020-08-24T04:42:47.266+0000

    • log

Log output

[2020-08-24T04:40:26.155Z] Still waiting to schedule task
[2020-08-24T04:40:26.156Z] ‘apm-ci-immutable-ubuntu-1804-1598243649304450156’ is offline
[2020-08-24T04:42:09.687Z] Running on apm-ci-immutable-ubuntu-1804-1598244022338403038 in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299
[2020-08-24T04:42:09.898Z] [INFO] Override default checkout
[2020-08-24T04:42:09.948Z] Sleeping for 10 sec
[2020-08-24T04:42:22.749Z] using credential f6c7695a-671e-4f4f-a331-acdce44ff9ba
[2020-08-24T04:42:22.769Z] Wiping out workspace first.
[2020-08-24T04:42:22.803Z] Cloning the remote Git repository
[2020-08-24T04:42:22.803Z] Using shallow clone with depth 4
[2020-08-24T04:42:22.803Z] Avoid fetching tags
[2020-08-24T04:42:22.834Z] Cloning repository git@github.com:elastic/apm.git
[2020-08-24T04:42:22.884Z]  > git init /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299 # timeout=10
[2020-08-24T04:42:22.953Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-08-24T04:42:22.953Z]  > git --version # timeout=10
[2020-08-24T04:42:22.958Z]  > git --version # 'git version 2.17.1'
[2020-08-24T04:42:22.959Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-08-24T04:42:22.991Z]  > git fetch --no-tags --progress -- git@github.com:elastic/apm.git +refs/heads/*:refs/remotes/origin/* # timeout=15
[2020-08-24T04:42:23.744Z] Cleaning workspace
[2020-08-24T04:42:23.809Z] Using shallow fetch with depth 4
[2020-08-24T04:42:23.809Z] Pruning obsolete local branches
[2020-08-24T04:42:24.465Z] Merging remotes/origin/master commit 0c78d54ebc748c6593c475fb97f1e2adec0977db into PR head commit 362e2a1825788087d6128a271728d6c986b87662
[2020-08-24T04:42:24.575Z] Merge succeeded, producing b0a4652280ff4ac75e4a8ab8a9a1d0685e378666
[2020-08-24T04:42:24.576Z] Checking out Revision b0a4652280ff4ac75e4a8ab8a9a1d0685e378666 (PR-299)
[2020-08-24T04:42:23.703Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-08-24T04:42:23.713Z]  > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-08-24T04:42:23.730Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-08-24T04:42:23.749Z]  > git rev-parse --verify HEAD # timeout=10
[2020-08-24T04:42:23.774Z] No valid HEAD. Skipping the resetting
[2020-08-24T04:42:23.774Z]  > git clean -fdx # timeout=10
[2020-08-24T04:42:23.816Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-08-24T04:42:23.816Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-08-24T04:42:23.833Z]  > git fetch --no-tags --progress --prune -- git@github.com:elastic/apm.git +refs/pull/299/head:refs/remotes/origin/PR-299 +refs/heads/master:refs/remotes/origin/master # timeout=15
[2020-08-24T04:42:24.474Z]  > git config core.sparsecheckout # timeout=10
[2020-08-24T04:42:24.484Z]  > git checkout -f 362e2a1825788087d6128a271728d6c986b87662 # timeout=15
[2020-08-24T04:42:24.519Z]  > git remote # timeout=10
[2020-08-24T04:42:24.532Z]  > git config --get remote.origin.url # timeout=10
[2020-08-24T04:42:24.539Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-08-24T04:42:24.543Z]  > git merge 0c78d54ebc748c6593c475fb97f1e2adec0977db # timeout=10
[2020-08-24T04:42:24.566Z]  > git rev-parse HEAD^{commit} # timeout=10
[2020-08-24T04:42:24.580Z]  > git config core.sparsecheckout # timeout=10
[2020-08-24T04:42:24.584Z]  > git checkout -f b0a4652280ff4ac75e4a8ab8a9a1d0685e378666 # timeout=15
[2020-08-24T04:42:28.207Z] Commit message: "Merge commit '0c78d54ebc748c6593c475fb97f1e2adec0977db' into HEAD"
[2020-08-24T04:42:28.227Z] First time build. Skipping changelog.
[2020-08-24T04:42:28.227Z] Cleaning workspace
[2020-08-24T04:42:28.212Z]  > git rev-list --no-walk a5a8c4b56e57d6561130452baa592b6f3a6f12e3 # timeout=10
[2020-08-24T04:42:28.231Z]  > git rev-parse --verify HEAD # timeout=10
[2020-08-24T04:42:28.252Z] Resetting working tree
[2020-08-24T04:42:28.252Z]  > git reset --hard # timeout=10
[2020-08-24T04:42:28.265Z]  > git clean -fdx # timeout=10
[2020-08-24T04:42:29.042Z] Masking supported pattern matches of $JOB_GCS_BUCKET or $NOTIFY_TO
[2020-08-24T04:42:29.072Z] Timeout set to expire in 3 hr 0 min
[2020-08-24T04:42:29.080Z] The timestamps step is unnecessary when timestamps are enabled for all Pipeline builds.
[2020-08-24T04:42:29.377Z] [INFO] 'shallow' is forced to be disabled when running on PullRequests
[2020-08-24T04:42:29.387Z] Running in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299/src/github.com/elastic/apm
[2020-08-24T04:42:29.397Z] [INFO] gitCheckout: Checkout master from git@github.com:elastic/apm.git with credentials f6c7695a-671e-4f4f-a331-acdce44ff9ba
[2020-08-24T04:42:29.416Z] [INFO] Override default checkout
[2020-08-24T04:42:29.440Z] Sleeping for 10 sec
[2020-08-24T04:42:39.579Z] using credential f6c7695a-671e-4f4f-a331-acdce44ff9ba
[2020-08-24T04:42:39.604Z] Cloning the remote Git repository
[2020-08-24T04:42:39.621Z] Cloning repository git@github.com:elastic/apm.git
[2020-08-24T04:42:39.648Z]  > git init /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299/src/github.com/elastic/apm # timeout=10
[2020-08-24T04:42:39.654Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-08-24T04:42:39.654Z]  > git --version # timeout=10
[2020-08-24T04:42:39.660Z]  > git --version # 'git version 2.17.1'
[2020-08-24T04:42:39.660Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-08-24T04:42:39.665Z]  > git fetch --tags --progress -- git@github.com:elastic/apm.git +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-08-24T04:42:40.326Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-08-24T04:42:40.330Z]  > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
[2020-08-24T04:42:40.338Z]  > git config remote.origin.url git@github.com:elastic/apm.git # timeout=10
[2020-08-24T04:42:40.347Z] Fetching upstream changes from git@github.com:elastic/apm.git
[2020-08-24T04:42:40.348Z] using GIT_SSH to set credentials GitHub user @elasticmachine SSH key
[2020-08-24T04:42:40.351Z]  > git fetch --tags --progress -- git@github.com:elastic/apm.git +refs/heads/*:refs/remotes/origin/* +refs/pull/*/head:refs/remotes/origin/PR/* # timeout=10
[2020-08-24T04:42:41.231Z] Checking out Revision 0c78d54ebc748c6593c475fb97f1e2adec0977db (origin/master)
[2020-08-24T04:42:41.262Z] Commit message: "Add SQL parsing performance examples (#186)"
[2020-08-24T04:42:41.262Z] First time build. Skipping changelog.
[2020-08-24T04:42:41.228Z]  > git rev-parse origin/master^{commit} # timeout=10
[2020-08-24T04:42:41.237Z]  > git config core.sparsecheckout # timeout=10
[2020-08-24T04:42:41.240Z]  > git checkout -f 0c78d54ebc748c6593c475fb97f1e2adec0977db # timeout=10
[2020-08-24T04:42:42.248Z] Masking supported pattern matches of $GIT_USERNAME or $GIT_PASSWORD
[2020-08-24T04:42:42.903Z] + git fetch https://****:****@github.com/elastic/apm.git +refs/pull/*/head:refs/remotes/origin/pr/*
[2020-08-24T04:42:42.961Z] Archiving artifacts
[2020-08-24T04:42:43.626Z] + git rev-parse HEAD
[2020-08-24T04:42:43.986Z] + git rev-parse HEAD
[2020-08-24T04:42:44.301Z] + git rev-parse origin/pr/299
[2020-08-24T04:42:44.335Z] [INFO] githubEnv: Found Git Build Cause: pr
[2020-08-24T04:42:44.780Z] Masking supported pattern matches of $GITHUB_TOKEN
[2020-08-24T04:42:45.956Z] [INFO] githubPrCheckApproved: Title: Add outcome to transactions and spans - User: felixbarny - Author Association: MEMBER
[2020-08-24T04:42:46.667Z] Stashed 354 file(s)
[2020-08-24T04:42:47.136Z] Running in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299/src/github.com/elastic/apm
[2020-08-24T04:42:47.557Z] + git diff --name-only c00acea093440e05e56877684a6ce53370f96054...b0a4652280ff4ac75e4a8ab8a9a1d0685e378666
[2020-08-24T04:42:47.557Z] fatal: Invalid symmetric difference expression c00acea093440e05e56877684a6ce53370f96054...b0a4652280ff4ac75e4a8ab8a9a1d0685e378666
[2020-08-24T04:42:47.613Z] Stage "Send Pull Request for BDD specs" skipped due to earlier failure(s)
[2020-08-24T04:42:47.634Z] Stage "Send Pull Request for JSON specs" skipped due to earlier failure(s)
[2020-08-24T04:42:47.819Z] Running on Jenkins in /var/lib/jenkins/workspace/ared_apm-update-specs-mbp_PR-299
[2020-08-24T04:42:47.896Z] [INFO] getVaultSecret: Getting secrets
[2020-08-24T04:42:47.960Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-08-24T04:42:48.548Z] + chmod 755 generate-build-data.sh
[2020-08-24T04:42:48.548Z] + ./generate-build-data.sh https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-299/ https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-299/runs/4 FAILURE 167619
[2020-08-24T04:42:49.099Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-299/runs/4/steps/?limit=10000 -o steps-info.json
[2020-08-24T04:42:49.650Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-shared/apm-update-specs-mbp/PR-299/runs/4/tests/?status=FAILED -o tests-errors.json

@felixbarny felixbarny merged commit 9c6dd55 into elastic:master Aug 24, 2020
@felixbarny felixbarny deleted the event.outcome branch August 24, 2020 09:19
beniwohli added a commit to elastic/apm-agent-python that referenced this pull request Nov 17, 2020
* handle transaction_ignore_urls setting [WIP]

* added framework specific tests

(and fixed issues revealed by the tests...)

* implement "outcome" property for transactions and spans

This implements elastic/apm#299.

Additionally, the "status_code" attribute has been added to HTTP spans.

* fix some tests

* change default outcome for spans to "unknown"

* added API functions for setting transaction outcome

* rework outcome API, and make sure it works for unsampled transactions

* fix some tests

* fix a test and add one for testing override behavior

* add an override=False that went forgotten

* expand docs a bit

* implement transaction_ignore_urls [WIP]

* do less work in aiohttp if we're not tracing a transaction

* construct path to json tests in a platform independent way

* fix merge issues

* address review
beniwohli added a commit to beniwohli/apm-agent-python that referenced this pull request Sep 14, 2021