Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transactions started with continue_trace() do not inherit sampling #2990

Closed
jwhitaker-gridcog opened this issue Apr 18, 2024 · 6 comments
Closed
Assignees

Comments

@jwhitaker-gridcog
Copy link

jwhitaker-gridcog commented Apr 18, 2024

How do you use Sentry?

Sentry Saas (sentry.io)

Version

1.45.0

Steps to Reproduce

Run the following python script, setting SENTRY_DSN accordingly.

Alternatively, clone from https://github.com/jwhitaker-gridcog/sentry-bug .

Python script
import subprocess
import sentry_sdk
import multiprocessing
import argparse
import sys
import os

# set SENTRY_DSN in env


# worker: this worker is used in two demos: first by multiprocessing, second by subprocess.
def worker(baggage_and_tp):
    sentry_sdk.init(traces_sample_rate=1.0, environment="jarrad-local", debug=True)
    baggage, traceparent = baggage_and_tp
    try:
        with sentry_sdk.continue_trace(
            {"baggage": baggage, "sentry-trace": traceparent}, op="task", name="child"
        ) as worker_tx:
            # # workaround:
            # if worker_tx.parent_sampled:
            #     worker_tx.sampled = True
            #     worker_tx.init_span_recorder(1000)
            sentry_sdk.capture_message("hi")
    finally:
        sentry_sdk.Hub.current.flush()


# demo 1: worker is started with multiprocessing.Process(). baggage is passed with multiprocessing magic.
def main_multiprocessing():
    sentry_sdk.init(traces_sample_rate=1.0, environment="jarrad-local", debug=True)

    with sentry_sdk.start_transaction(op="task", name="parent") as tx:
        baggage, traceparent = sentry_sdk.get_baggage(), sentry_sdk.get_traceparent()

        proc = multiprocessing.Process(
            target=worker, name="worker", args=[(baggage, traceparent)]
        )
        proc.start()
        proc.join()


# helper for demo 2: pull baggage from env and pass to worker.
def worker_subprocess():
    baggage, traceparent = (
        os.environ["SENTRY_BAGGAGE"],
        os.environ["SENTRY_TRACEPARENT"],
    )
    worker((baggage, traceparent))


# demo 2: worker is started with subprocess. baggage is passed with env vars.
def main_subprocess():
    sentry_sdk.init(traces_sample_rate=1.0, environment="jarrad-local", debug=True)

    with sentry_sdk.start_transaction(op="task", name="parent") as tx:
        baggage, traceparent = sentry_sdk.get_baggage(), sentry_sdk.get_traceparent()

        subprocess.run(
            [sys.executable, sys.argv[0], "worker_subp"],
            check=True,
            env={
                "SENTRY_BAGGAGE": baggage or "",
                "SENTRY_TRACEPARENT": traceparent or "",
            },
        )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "command",
        choices=["main_mp", "main_subp", "worker_subp"],
    )
    args = parser.parse_args()
    match args.command:
        # demo 1
        case "main_mp":
            main_multiprocessing()
        # demo 2
        case "main_subp":
            main_subprocess()
        # helper for demo 2, you shouldn't need to call this yourself
        case "worker_subp":
            worker_subprocess()
        case other:
            raise Exception(f"unreachable: {other}")

Run with demo.py main_mp or demo.py main_subp. Both demonstrate this issue in a different context.

I came across this issue using multiprocessing and suspected some strangeness around sentry_sdk.init() vs fork(), so wanted to include what I'm really doing.

However I was also able to reproduce it without multiprocessing being involved at all, and I didn't want this closed as a dupe of #291 prematurely :)

Expected Result

The worker transaction should be sent to sentry, because its parent is sampled.

According to docs at https://docs.sentry.io/platforms/python/configuration/sampling/#inheritance

Whatever a transaction's sampling decision, that decision will be passed to its child spans and from there to any transactions they subsequently cause in other services.

Actual Result

However, in logs, I can see

 [sentry] DEBUG: Setting SDK name to 'sentry.python'
 [sentry] DEBUG: [Tracing] Starting <task> transaction <parent>
...
 [sentry] DEBUG: Setting SDK name to 'sentry.python'
 [sentry] DEBUG: [Tracing] Extracted propagation context from incoming data:
... { 'dynamic_sampling_context': {
...  'trace_id': '300a4d0c53a944119ef5a2de17655f0d', 'environment': 'jarrad-local', 'release': '56f30f3b5f41cc68cae84e20b0913eb4c91cb6aa',
...  'public_key': '0166f7d1273fb9a6793b04c3b7e5b858', 'transaction': 'parent', 'sample_rate': '1.0', 'sampled': 'true'
... }, 'trace_id': '300a4d0c53a944119ef5a2de17655f0d', 'parent_span_id': 'b9c38c13eed5ca6b', 'parent_sampled': True, 'span_id': 'a3094c45744e14cd'}
 [sentry] DEBUG: Discarding transaction because sampled = False
Full logs:
$> poetry run python demo.py main_mp
 [sentry] DEBUG: Setting up integrations (with default = True)
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.aiohttp.AioHttpIntegration: AIOHTTP not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.boto3.Boto3Integration: botocore is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.bottle.BottleIntegration: Bottle not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.celery.CeleryIntegration: Celery not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.django.DjangoIntegration: Django not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.falcon.FalconIntegration: Falcon not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.fastapi.FastApiIntegration: Starlette is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.flask.FlaskIntegration: Flask is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.httpx.HttpxIntegration: httpx is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.openai.OpenAIIntegration: OpenAI not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.pyramid.PyramidIntegration: Pyramid not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.rq.RqIntegration: RQ not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.sanic.SanicIntegration: Sanic not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.sqlalchemy.SqlalchemyIntegration: SQLAlchemy not installed.
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.starlette.StarletteIntegration: Starlette is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.tornado.TornadoIntegration: Tornado not installed
 [sentry] DEBUG: Setting up previously not enabled integration argv
 [sentry] DEBUG: Setting up previously not enabled integration atexit
 [sentry] DEBUG: Setting up previously not enabled integration dedupe
 [sentry] DEBUG: Setting up previously not enabled integration excepthook
 [sentry] DEBUG: Setting up previously not enabled integration logging
 [sentry] DEBUG: Setting up previously not enabled integration modules
 [sentry] DEBUG: Setting up previously not enabled integration stdlib
 [sentry] DEBUG: Setting up previously not enabled integration threading
 [sentry] DEBUG: Setting up previously not enabled integration redis
 [sentry] DEBUG: Did not enable default integration redis: Redis client not installed
 [sentry] DEBUG: Enabling integration argv
 [sentry] DEBUG: Enabling integration atexit
 [sentry] DEBUG: Enabling integration dedupe
 [sentry] DEBUG: Enabling integration excepthook
 [sentry] DEBUG: Enabling integration logging
 [sentry] DEBUG: Enabling integration modules
 [sentry] DEBUG: Enabling integration stdlib
 [sentry] DEBUG: Enabling integration threading
 [sentry] DEBUG: Setting SDK name to 'sentry.python'
 [sentry] DEBUG: [Tracing] Starting <task> transaction <parent>
 [sentry] DEBUG: [Profiling] Discarding profile because profiler was not started.
 [sentry] DEBUG: Setting up integrations (with default = True)
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.aiohttp.AioHttpIntegration: AIOHTTP not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.boto3.Boto3Integration: botocore is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.bottle.BottleIntegration: Bottle not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.celery.CeleryIntegration: Celery not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.django.DjangoIntegration: Django not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.falcon.FalconIntegration: Falcon not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.fastapi.FastApiIntegration: Starlette is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.flask.FlaskIntegration: Flask is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.httpx.HttpxIntegration: httpx is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.openai.OpenAIIntegration: OpenAI not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.pyramid.PyramidIntegration: Pyramid not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.rq.RqIntegration: RQ not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.sanic.SanicIntegration: Sanic not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.sqlalchemy.SqlalchemyIntegration: SQLAlchemy not installed.
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.starlette.StarletteIntegration: Starlette is not installed
 [sentry] DEBUG: Did not import default integration sentry_sdk.integrations.tornado.TornadoIntegration: Tornado not installed
 [sentry] DEBUG: Enabling integration argv
 [sentry] DEBUG: Enabling integration atexit
 [sentry] DEBUG: Enabling integration dedupe
 [sentry] DEBUG: Enabling integration excepthook
 [sentry] DEBUG: Enabling integration logging
 [sentry] DEBUG: Enabling integration modules
 [sentry] DEBUG: Enabling integration stdlib
 [sentry] DEBUG: Enabling integration threading
 [sentry] DEBUG: Setting SDK name to 'sentry.python'
 [sentry] DEBUG: [Tracing] Extracted propagation context from incoming data: {'dynamic_sampling_context': {'trace_id': '300a4d0c53a944119ef5a2de17655f0d', 'environment': 'jarrad-local', 'release': '56f30f3b5f41cc68cae84e20b0913eb4c91cb6aa', 'public_key': '0166f7d1273fb9a6793b04c3b7e5b858', 'transaction': 'parent', 'sample_rate': '1.0', 'sampled': 'true'}, 'trace_id': '300a4d0c53a944119ef5a2de17655f0d', 'parent_span_id': 'b9c38c13eed5ca6b', 'parent_sampled': True, 'span_id': 'a3094c45744e14cd'}
 [sentry] DEBUG: Discarding transaction because sampled = False
 [sentry] DEBUG: Flushing HTTP transport
 [sentry] DEBUG: background worker got flush request
 [sentry] DEBUG: Sending envelope [envelope with 1 items (error)] project:4506627083468800 host:o430711.ingest.sentry.io
 [sentry] DEBUG: Sending envelope [envelope with 1 items (internal)] project:4506627083468800 host:o430711.ingest.sentry.io
 [sentry] DEBUG: 1 event(s) pending on flush
 [sentry] DEBUG: background worker flushed
 [sentry] DEBUG: atexit: got shutdown signal
 [sentry] DEBUG: atexit: shutting down client
 [sentry] DEBUG: Flushing HTTP transport
 [sentry] DEBUG: background worker got flush request
 [sentry] DEBUG: Sending envelope [envelope with 1 items (transaction)] project:4506627083468800 host:o430711.ingest.sentry.io
 [sentry] DEBUG: background worker flushed
 [sentry] DEBUG: Killing HTTP transport
 [sentry] DEBUG: background worker got kill request

If I work around the issue with

def worker():
    with sentry_sdk.continue_trace(...) as worker_tx:
        if worker_tx.parent_sampled:
            worker_tx.sampled = True
            worker_tx.init_span_recorder(1000)

then everything works.
It looks like continue_trace() should be doing stuff that you are doing in start_transaction(), but instead just calls Transaction() on its own so this got forgotten?

@szokeasaurusrex
Copy link
Member

@jwhitaker-gridcog Thank you for reporting, I am currently investigating this issue

@szokeasaurusrex
Copy link
Member

szokeasaurusrex commented Apr 22, 2024

Hey @jwhitaker-gridcog! After some investigation, it looks like the problem you have run into is different and more general than transactions started with continue_trace not inheriting the sampling decision. If the problem was only that the sampling decision was not inherited, we would still expect the child transaction to be sent because, within the worker, you initialized the Sentry SDK with traces_sample_rate=1.0.

In fact, the root cause of the issue is that transactions must always be started with the start_transaction function in order to be sent to Sentry. Transactions created via continue_trace (or any other method, such as calling the Transaction constructor) can only be sent to sentry if started via start_transaction. This is because start_transaction initializes the transaction's span recorder, which is needed in order to send the transaction to Sentry.

In your case, this means you can work around the issue by modifying your worker function as follows:

def worker(baggage_and_tp):
    sentry_sdk.init(traces_sample_rate=1.0, environment="jarrad-local", debug=True)
    baggage, traceparent = baggage_and_tp
    try:
        transaction = sentry_sdk.continue_trace(
            {"baggage": baggage, "sentry-trace": traceparent}, op="task", name="child"
        )

        # The transaction is created with continue_trace, but we start it with start_transaction
        with sentry_sdk.start_transaction(transaction):
            sentry_sdk.capture_message("hi")
    finally:
        sentry_sdk.Hub.current.flush()

I will need to discuss with the team to determine whether requiring transactions to be started via start_transaction is the desired behavior. If yes, we will need to update the docs to make this clear, and provide a better debug message than that we are discarding the transaction because sampled=False. Otherwise, we should change the behavior so that the transaction created by continue_trace can be started directly, without passing it to start_transaction

Update: I discussed with the team, and it is indeed the expected behavior that transactions only get sent to Sentry when started with start_transaction. So, you will need to use the workaround I suggested above. We will, however, document this behavior better, especially by improving the debug message when we drop a transaction.

@szokeasaurusrex
Copy link
Member

Since we have determined that the behavior described here is as expected, I am closing this issue. I have opened the two issues below to improve the debug message and our docs, respectively:

Hopefully, these changes will help us to avoid confusion in the future!

@szokeasaurusrex szokeasaurusrex closed this as not planned Won't fix, can't repro, duplicate, stale Apr 22, 2024
szokeasaurusrex added a commit that referenced this issue Apr 22, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.
szokeasaurusrex added a commit that referenced this issue Apr 22, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.
@jwhitaker-gridcog
Copy link
Author

Thanks for looking into this! Can I ask what is the use case for calling continue_trace and not wanting the transaction to be sent?

@szokeasaurusrex
Copy link
Member

Thanks for looking into this! Can I ask what is the use case for calling continue_trace and not wanting the transaction to be sent?

@jwhitaker-gridcog There is likely no use case for doing this.

But essentially, as I understand it, the continue_trace function's purpose is not, and never was, to start a transaction. Its purpose is only to create a child transaction object, which inherits certain information (like the sampling decision) from the parent transaction. To then start this transaction, the transaction must be passed to the start_transaction function, which handles the logic relevant to starting a transaction, like making sure the transaction is recorded on the correct hub/scope.

Essentially, the two functions can be summarized as follows:

  • continue_trace creates a transaction, but does not start it
  • start_transaction starts the the transaction that is passed to it
    • When no transaction object is passed to start_transaction, the function automatically creates a new transaction object and starts it. In other words, start_transaction(...) is essentially just syntactic sugar for start_transaction(Transaction(...)). Perhaps, this auto-transaction-generating feature is what is causing the confusion here?

szokeasaurusrex added a commit that referenced this issue Apr 25, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.
szokeasaurusrex added a commit that referenced this issue Apr 25, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.
@jwhitaker-gridcog
Copy link
Author

thanks for following this up and detailed work on explanation and docs!

szokeasaurusrex added a commit that referenced this issue May 28, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.

Also, add tests to ensure the message is logged.
szokeasaurusrex added a commit that referenced this issue Jun 6, 2024
…ransaction`

Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.

Also, add tests to ensure the message is logged.
szokeasaurusrex added a commit that referenced this issue Jun 6, 2024
Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue #2990.

Also, add tests to ensure the message is logged.
arjennienhuis pushed a commit to arjennienhuis/sentry-python that referenced this issue Sep 30, 2024
Users who enter a transaction without calling `start_transaction` likely intended to start the transaction, since without a call to `start_transaction`, their transaction will not get sent to Sentry. This warning message clarifies this behavior, and could help avoid the confusion that led to issue getsentry#2990.

Also, add tests to ensure the message is logged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

2 participants