feat: Add per model metrics #90
Conversation
Force-pushed from b7f7cb9 to 86a9bf2
Hey @VedantMahabaleshwarkar, just wondering if this is replacing #86? If so, maybe that one can be closed.
@rafvasq Yep, this PR will be replacing the old one, I'll close it out.
Force-pushed from fe24408 to 35df353
Thanks @VedantMahabaleshwarkar, and apologies for taking so long to look at this.
It looks like a good start but I think some more work is needed around how modelIds / vmodelIds are handled.
Currently, in the `logRequestMetrics` method, you added both modelId and vmodelId args, but only ever pass in one of those, and inside the method only choose the non-null one. So it may as well just take one arg.
However I think we should aim to always log the model id, and additionally log the vmodel id too whenever it's used.
Requests can be targeted directly at concrete models (modelId) and in this case it's pretty straightforward and the changes you've made should cover it.
But when they are targeted at a vmodelId, we should:
- Include a vmodelId for all the same metrics that we include modelId - this will require new changes to propagate the vmodelId with the request because currently it's "lost" below the top of the stack, once it's been resolved to a modelId.
- Also include the resolved modelId everywhere for all these same metrics. This is already done in all these places lower in the stack, but the top level logRequestMetrics in the ModelMeshApi class will only have the vModelId and so we'll need to make sure the resolved modelId is obtained in this same place and included as a label alongside the vmodelId.
For propagating the vModelId, I would suggest adding it to the Litelinks ThreadContext ... this is a ThreadLocal map which will automatically be transferred between modelmesh pods if the request is forwarded.
I hope the above makes some sense, feel free to ping if not!
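A minimal sketch of the pattern described above, assuming hypothetical names (this is a stand-in for the Litelinks `ThreadContext`, not the actual API): the vModelId rides along with the request in a per-thread context map, so lower layers that only see the resolved modelId can still attach both labels to the metrics they record.

```java
import java.util.HashMap;
import java.util.Map;

public class VModelContextSketch {
    // Stand-in for Litelinks' ThreadContext: a per-thread String->String map
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static final String VMODEL_ID_KEY = "vmodel_id";

    // Top of the stack: record the vmodel target before resolution happens
    static void setVModelId(String vModelId) {
        CONTEXT.get().put(VMODEL_ID_KEY, vModelId);
    }

    static String currentVModelId() {
        // null when the request was targeted directly at a concrete model
        return CONTEXT.get().get(VMODEL_ID_KEY);
    }

    // Lower in the stack: always label with the resolved modelId, and
    // additionally include the vModelId whenever one was used
    static Map<String, String> metricLabels(String resolvedModelId) {
        Map<String, String> labels = new HashMap<>();
        labels.put("modelId", resolvedModelId);
        String vModelId = currentVModelId();
        if (vModelId != null) {
            labels.put("vModelId", vModelId);
        }
        return labels;
    }

    public static void main(String[] args) {
        setVModelId("my-vmodel");
        Map<String, String> labels = metricLabels("my-model-v3");
        System.out.println(labels.get("modelId"));
        System.out.println(labels.get("vModelId"));
    }
}
```

Note that a plain `ThreadLocal` only survives within one process; the point of using the Litelinks `ThreadContext` instead is that it is forwarded across modelmesh pods with the request.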
Force-pushed from fac0f37 to b111f0e
@njhill Implemented all of your suggestions.
Force-pushed from 096c729 to 247910a
/test all
@VedantMahabaleshwarkar: No presubmit jobs available for kserve/modelmesh@main. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed from 247910a to 4f1130d
Thanks @VedantMahabaleshwarkar, added some more comments.
Force-pushed from e8415b8 to 5d2e8fd
@njhill addressed most of your comments and avoided passing around the […]. Other than that, @danielezonca also had a small suggestion to change the […].
While I can't speak to the use of `FastThreadLocal`, in terms of usability, I deployed the changes myself and was able to filter metrics by model using the new parameter `modelId`. I'd also recommend saving those changes to the `VModelManager` class for a separate PR if it's not directly required for this one.
FYI @ckadner
I have a few more nit-picks. Beyond those, it looks like there are a few more unresolved conversations, in addition to the `FastThreadLocal` topic.
Thanks @VedantMahabaleshwarkar @danielezonca and apologies for the delay, I was out last week and still catching up.
> The only thing we didn't quite get was your suggestion to use `FastThreadLocal` to store the `vModelId`. It seemed redundant given that we are already adding it to the `ThreadContext`?
I was referring to the resolved `modelId` here, not the `vModelId`. And it's just for the purpose of "passing it out" of this method so that it can be included in the metrics recorded in the calling function, in addition to the `vModelId`.
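The "passing it out" idea can be sketched as follows, with illustrative names only (not the actual model-mesh code, and using a plain `ThreadLocal` rather than Netty's `FastThreadLocal`): the method that resolves a vModelId to a concrete modelId stashes the result in a thread-local slot, so the calling frame can read it back for its metric labels without the method's return type changing.

```java
public class ResolvedModelIdSketch {
    // Slot for "passing out" the resolved modelId to the caller
    private static final ThreadLocal<String> RESOLVED_MODEL_ID = new ThreadLocal<>();

    // Callee: resolves the vmodel and stashes the concrete modelId
    // (resolution itself is faked here with a suffix)
    static void invokeVModel(String vModelId) {
        String modelId = vModelId + "-resolved"; // placeholder for real resolution
        RESOLVED_MODEL_ID.set(modelId);
        // ... dispatch the request to the concrete model ...
    }

    // Caller: after the call returns, record metrics with both ids
    static String recordRequestMetrics(String vModelId) {
        invokeVModel(vModelId);
        String modelId = RESOLVED_MODEL_ID.get();
        RESOLVED_MODEL_ID.remove(); // avoid leaking state across pooled threads
        return "modelId=" + modelId + ",vModelId=" + vModelId;
    }

    public static void main(String[] args) {
        System.out.println(recordRequestMetrics("my-vmodel"));
    }
}
```

`FastThreadLocal` would serve the same role with lower lookup overhead on Netty event-loop threads; the control flow is identical.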
Force-pushed from 6e4feaa to 000cce7
Hi Vedant, since you force-pushed the last commit, I cannot find where/whether Nick's suggestion made it into your last code changes. Are the remaining 7 unresolved conversations actually resolved? Also, you will need to update your branch once more :-)
Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Force-pushed from 000cce7 to 72e1c8e
Use ThreadLocal to store resolved modelId in vModel case
Revert/simplify a few things

Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Update per-model metric changes
@njhill pulled in your changes into this PR
Thanks @VedantMahabaleshwarkar for all of your patience with this!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: njhill, VedantMahabaleshwarkar
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`:
  - `modelmesh_cache_miss_milliseconds`
  - `modelmesh_loadmodel_milliseconds`
  - `modelmesh_unloadmodel_milliseconds`
  - `modelmesh_req_queue_delay_milliseconds`
  - `modelmesh_model_sizing_milliseconds`
  - `modelmesh_age_at_eviction_milliseconds`
- Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`:
  - `modelmesh_loading_queue_delay_milliseconds`
  - `modelmesh_loaded_model_size_bytes`
- Add `modelId` and `vModelId` param to `logRequestMetrics` in `Metrics.java`:
  - `modelmesh_invoke_model_milliseconds`
  - `modelmesh_api_request_milliseconds`

Closes red-hat-data-services#60

Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Co-authored-by: Daniele Zonca <dzonca@redhat.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Motivation
#90 introduced support for per-model prometheus metrics, but the intention was not to change the default behaviour and instead to have this as something enabled explicitly via configuration. However, it was inadvertently made the default.
Modifications
Change the default behaviour to not include modelId/vModelId prometheus metric labels. This is important because model-mesh was designed primarily for use cases with a very large and changing number of individual models, and those scenarios would result in a much greater number of individual metrics than prometheus can handle.
Result
Original behaviour restored.

Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Functionality added in #90

Signed-off-by: Nick Hill <nickhill@us.ibm.com>
PR #90 introduced support for per-model prometheus metrics, with the intention not to change the default behavior but to require the feature to be enabled explicitly via configuration. However, it was inadvertently made the default. This commit restores the original behavior by changing the default configuration to not include modelId/vModelId prometheus metric labels, because model-mesh was designed primarily for use cases with a very large and changing number of individual models, and those scenarios would result in a much greater number of individual metrics than prometheus can handle.

Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Motivation
#60
Modifications
- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`
  - Adds the `modelId` label to the following metrics: `modelmesh_cache_miss_milliseconds`, `modelmesh_loadmodel_milliseconds`, `modelmesh_unloadmodel_milliseconds`, `modelmesh_req_queue_delay_milliseconds`, `modelmesh_model_sizing_milliseconds`, `modelmesh_age_at_eviction_milliseconds`
- Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`
  - Adds the `modelId` label to the following metrics: `modelmesh_loading_queue_delay_milliseconds`, `modelmesh_loaded_model_size_bytes`
- Add `modelId` and `vModelId` parameters to `logRequestMetrics` function in `Metrics.java`
  - Adds the labels to the following metrics: `modelmesh_invoke_model_milliseconds`, `modelmesh_api_request_milliseconds`
Result
Metrics listed above have the `modelId` label attached to the emitted metric.
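For illustration, with per-model labels enabled, one of the histograms above would be scraped with the new labels attached. This is a hypothetical fragment in Prometheus exposition format; the model names, count, and any other labels on the real metric are made up:

```
modelmesh_api_request_milliseconds_count{modelId="example-model",vModelId="example-vmodel"} 17
```

Requests targeted directly at a concrete model would carry only the `modelId` label, since no vmodel is involved in that path.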