Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support individual model metrics #60

Closed
njhill opened this issue Sep 26, 2022 · 3 comments · Fixed by #90
Closed

Support individual model metrics #60

njhill opened this issue Sep 26, 2022 · 3 comments · Fixed by #90
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@njhill
Copy link
Member

njhill commented Sep 26, 2022

Model-mesh currently exposes a comprehensive set of prometheus metrics, but those associated with a particular model are not currently labeled with the model id. This was an intentional decision since in the use cases model-mesh was designed for, there's a prohibitively large number of models managed, and many of them change frequently. Prometheus guidelines assert that the cardinality of label value permutations across all metrics should be constrained.

However, there are many usecases where the number of models managed is smaller, and for those it can be very useful to monitor metrics at the model level.

We should support this as a configurable option, either globally or per-model.

Some thoughts:

  • This would not apply to all published metrics. Some of them which would be of questionable use are TBD, such as req_queue_delay_milliseconds, age_at_eviction_milliseconds.
  • We could consider less granular labels such as the model type as an (additional) intermediate option. It might be more complicated however to determine the model type in the relevant metric publishing contexts. Also, the way that the type field is currently used within modelmesh-serving would limit the utility.
  • We need to think about what to do in relation to VModels - in many cases the VModel rather than Model would be of more interest. Probably configuration can take a form similar to labels=model,vmodel,type, so that an arbitrary combo of supported labels can be chosen.
@njhill njhill added the enhancement New feature or request label Nov 17, 2022
@njhill
Copy link
Member Author

njhill commented Feb 16, 2023

@ScrapCodes is looking into this

@ckadner ckadner added this to the v0.11.0 milestone Feb 24, 2023
@ScrapCodes ScrapCodes removed their assignment Mar 8, 2023
@ScrapCodes
Copy link
Contributor

/assign @VedantMahabaleshwarkar

@kserve-oss-bot
Copy link
Collaborator

@ScrapCodes: GitHub didn't allow me to assign the following users: VedantMahabaleshwarkar.

Note that only kserve members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @VedantMahabaleshwarkar

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@VedantMahabaleshwarkar VedantMahabaleshwarkar linked a pull request May 2, 2023 that will close this issue
@ckadner ckadner modified the milestones: v0.11.0, v0.11.1 Aug 29, 2023
ckadner pushed a commit that referenced this issue Sep 6, 2023
- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`:
  - `modelmesh_cache_miss_milliseconds`
  - `modelmesh_loadmodel_milliseconds`
  - `modelmesh_unloadmodel_milliseconds`
  - `modelmesh_req_queue_delay_milliseconds`
  - `modelmesh_model_sizing_milliseconds`
  - `modelmesh_age_at_eviction_milliseconds`
- Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`:
  - `modelmesh_loading_queue_delay_milliseconds`
  - `modelmesh_loaded_model_size_bytes`
- Add `modelId` and `vModelId` param to `logRequestMetrics` in `Metrics.java`:
  - `modelmesh_invoke_model_milliseconds`
  - `modelmesh_api_request_milliseconds`

Closes #60

Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Co-authored-by: Daniele Zonca <dzonca@redhat.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
VedantMahabaleshwarkar referenced this issue in VedantMahabaleshwarkar/modelmesh Oct 11, 2023
- Add `modelId` parameter to `logTimingMetricDuration` function in `Metrics.java`:
  - `modelmesh_cache_miss_milliseconds`
  - `modelmesh_loadmodel_milliseconds`
  - `modelmesh_unloadmodel_milliseconds`
  - `modelmesh_req_queue_delay_milliseconds`
  - `modelmesh_model_sizing_milliseconds`
  - `modelmesh_age_at_eviction_milliseconds`
- Add `modelId` parameter to `logSizeEventMetric` function in `Metrics.java`:
  - `modelmesh_loading_queue_delay_milliseconds`
  - `modelmesh_loaded_model_size_bytes`
- Add `modelId` and `vModelId` param to `logRequestMetrics` in `Metrics.java`:
  - `modelmesh_invoke_model_milliseconds`
  - `modelmesh_api_request_milliseconds`

Closes red-hat-data-services#60

Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Co-authored-by: Daniele Zonca <dzonca@redhat.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
ruivieira pushed a commit to ruivieira/modelmesh that referenced this issue Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
5 participants