
Add suppression for input and output data #5

Merged 3 commits from cartermp/suppress into main on Aug 17, 2023
Conversation

@cartermp (Owner) commented Jul 26, 2023

Adds two config parameters:

  • suppress_response_data, which, when set to True, will NOT log response data (like chat responses) to a span
  • suppress_input_content, which, when set to True, will NOT log input data (like a corpus of text passed to a model with a large context window) to a span

This lets people avoid hitting size limits on spans. Imagine an app that passes in a large amount of data per request, or a long conversation in a chatbot where the entire conversation history is passed in as input each time -- this would likely exceed the total size a span can have.
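A rough sketch of how this could be wired up (the class name and call shape here are illustrative; only the two new parameter names are what this PR adds):

```python
# Illustrative sketch -- only the two parameter names come from this PR;
# the class name and how the flags are passed are assumptions.
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

OpenAIInstrumentor().instrument(
    suppress_response_data=True,   # don't record chat responses on spans
    suppress_input_content=True,   # don't record prompt/input text on spans
)
```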

@cartermp requested a review from estib on July 26, 2023 01:37
@estib (Collaborator) left a comment


Looks good -- although I am interested in talking through one question:

Will people be able to predict when content / responses are likely to be too long at the time they are initializing the openai OTEL auto instrumentor? Or will it be something a bit more dynamic, like "sometimes I get a few really long questions / responses, but sometimes not" depending on the users?

What do you think about adding a config that defines dynamic rules for when response_data or input_content should be included? Like maybe we could let users define some kind of length limit for how long a response_data can be before it gets suppressed? This could be something we do in addition to a boolean on/off switch like what this PR adds, but it could also be something we do instead? (we could even allow users to set the length limit to 0 to suppress them entirely)
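Roughly what I have in mind, purely as an illustration (none of these names exist anywhere yet):

```python
# Illustrative sketch only -- nothing here is in this PR.
# Idea: record a value only if it is under a user-defined length limit,
# with 0 meaning "always suppress".
MAX_CONTENT_LENGTH = 4096

def maybe_set_attribute(span, key, value, limit=MAX_CONTENT_LENGTH):
    if limit == 0 or len(value) > limit:
        return  # suppressed: over the limit (or suppression forced via 0)
    span.set_attribute(key, value)
```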

I dunno, what do you think?

Two review comments on src/opentelemetry/instrumentation/openai/__init__.py (outdated, resolved)
@cartermp (Owner, Author) commented Aug 5, 2023

Mmmm, this is a good question:

Will people be able to predict when content / responses are likely to be too long at the time they are initializing the openai OTEL auto instrumentor? Or will it be something a bit more dynamic, like "sometimes I get a few really long questions / responses, but sometimes not" depending on the users?

You can certainly predict it for single requests, or at least a range of I/O sizes. But yeah, for genuine chat apps there's no predicting it, and the same goes for agents where you continually build up larger and larger context.
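To make that concrete, a toy example of the chat case (not code from the instrumentation, just arithmetic):

```python
# Toy illustration: a chat app resends the whole history on every turn,
# so the input payload attached to each span keeps growing.
history = []
for turn, user_msg in enumerate(["hi", "tell me more", "summarize everything so far"]):
    history.append({"role": "user", "content": user_msg})
    prompt_chars = sum(len(m["content"]) for m in history)
    print(f"turn {turn}: {prompt_chars} chars of input would be recorded on the span")
```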

I think a more dynamic setting that limits total inputs and/or outputs by a threshold does make sense.

@cartermp merged commit aa651de into main on Aug 17, 2023
1 check passed
@cartermp deleted the cartermp/suppress branch on August 17, 2023 19:17