
Python: New Feature: token usage in metrics #9909

Closed
Druid-of-Luhn opened this issue Dec 9, 2024 · 4 comments
Labels: python (Pull requests for the Python Semantic Kernel)

@Druid-of-Luhn



Currently, the only metrics produced by Semantic Kernel (at least in Python) are function call durations.

The observability documentation does indeed mention this:

Metric: Semantic Kernel captures the following metrics from kernel functions:
  • semantic_kernel.function.invocation.duration (Histogram) - function execution time (in seconds)
  • semantic_kernel.function.streaming.duration (Histogram) - function streaming execution time (in seconds)

However, above that it also says (emphasis mine):

Metrics: Semantic Kernel emits metrics from kernel functions and AI connectors. You will be able to monitor metrics such as the kernel function execution time, the token consumption of AI connectors, etc.

From that I understand that the feature is planned but not yet implemented. It feels like token usage would be a very helpful one to get as a metric (I can see that it is already logged), since it is a common question for solutions that use LLMs.

For the moment, is it possible for me to create some kind of filter and track the metric myself? Or does this depend on internal implementation details?

I have also seen this issue: #6489

@markwallace-microsoft markwallace-microsoft added python Pull requests for the Python Semantic Kernel triage labels Dec 9, 2024
@github-actions github-actions bot changed the title New Feature: token usage in metrics Python: New Feature: token usage in metrics Dec 9, 2024
@alliscode
Member

@TaoChenOSU could you look at this, please?

@TaoChenOSU
Contributor

@Druid-of-Luhn Thank you for bringing this up!

Currently in Python, we are not tracking token usage per connector as a metric. In .NET, it depends on the specific connector implementation: for example, the OpenAI connector emits three metrics to track token consumption, while the Bedrock connector doesn't.

To answer your question about whether you can create the metrics yourself: yes. We include the token usage information (when it's available) as part of the metadata on the return object when you call a chat completion service. Depending on how you use the AI service, you can create the metrics in different places: if you are calling the service directly, record the metrics where you make the call; if you are using a kernel function, record them in a filter.
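As a rough illustration of the filter approach, the sketch below accumulates token counts from a function result's metadata. The filter signature, the `usage` metadata key, and the `prompt_tokens`/`completion_tokens` field names are assumptions for illustration, not a confirmed Semantic Kernel API; check the metadata your connector actually returns, and swap the `Counter` for a real metrics instrument (e.g. an OpenTelemetry counter).

```python
# Illustrative sketch only -- not a confirmed Semantic Kernel API.
# A function-invocation filter that reads token usage from the result's
# metadata and accumulates it in a Counter (stand-in for a metrics backend).
from collections import Counter

token_metrics = Counter()

async def token_usage_filter(context, next):
    # Let the kernel function run first so its result and metadata exist.
    await next(context)
    result = getattr(context, "result", None)
    usage = (result.metadata or {}).get("usage") if result is not None else None
    if usage:
        # Field names assumed from an OpenAI-style usage payload.
        token_metrics["prompt_tokens"] += usage.get("prompt_tokens", 0)
        token_metrics["completion_tokens"] += usage.get("completion_tokens", 0)
```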

Feel free to post further questions :)

@Druid-of-Luhn
Author

Thanks! I am only using Azure OpenAI models at this stage, so I will add the necessary calls/filters.
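For the direct-call case, a small helper like the one below could pull the counts out of a response's metadata before handing them to whatever metrics backend is in use. The `usage` key and the token field names are assumptions modeled on OpenAI-style responses, not a guaranteed Semantic Kernel contract.

```python
# Illustrative sketch only: extract token counts from a chat-completion
# result's metadata dict. Key names are assumed, not a guaranteed contract.
def extract_token_usage(metadata: dict) -> dict:
    usage = metadata.get("usage") or {}
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        # Fall back to the sum when the service omits a total.
        "total_tokens": usage.get("total_tokens", prompt + completion),
    }
```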

Are there, or will there be, plans to implement this in Python, to bring it to parity with .NET?

@TaoChenOSU
Contributor

Yes, we do have plans to reach parity: #6750
