Pre-built grafana dashboard for modelmesh metrics #335
Comments
@njhill I'm wondering where the one we have in the
@rafvasq this one is more comprehensive and I think it should include most of what's in the other one. If there's anything important in the other one which isn't in this one then it can be added. I am not a fan of the big dials that show current values - they are nice for demos but I don't think they are very useful in practice.
#### Motivation

To port a previously-built Grafana dashboard developed for displaying various Prometheus metrics for ModelMesh and build upon it to include metrics at the deployment/runtime view.

#### Modifications

- Added `config/dashboard/` directory to host Grafana dashboard JSON
- Added `servicemonitor.yaml` to `config/prometheus/`
- Added and modified `ModelMeshMetricsDashboard.json` to work out-of-the-box
  - Added queries for deployment views to filter by serving runtime
  - Created sections for global-view and deployment-view visualizations
  - Created drop-down variables for filtering between one or many pre-existing runtimes with option to filter by custom serving runtime
- Updated `monitoring.md` doc to include guide for setting up Prometheus/Grafana for monitoring; mainly using instructions from [modelmesh-performance](https://github.com/kserve/modelmesh-performance/tree/main/docs/monitoring)

#### Result

- A functional, ready-to-use dashboard JSON
- General documentation outlining how to set up monitoring and use the JSON
- Closes #335

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
#### Motivation

To port a previously-built Grafana dashboard developed for displaying various Prometheus metrics for ModelMesh and build upon it to include metrics at the deployment/runtime view.

#### Modifications

- Added `config/dashboard/` directory to host Grafana dashboard JSON
- Added `servicemonitor.yaml` to `config/prometheus/`
- Added and modified `ModelMeshMetricsDashboard.json` to work out-of-the-box
  - Added queries for deployment views to filter by serving runtime
  - Created sections for global-view and deployment-view visualizations
  - Created drop-down variables for filtering between one or many pre-existing runtimes with option to filter by custom serving runtime
- Updated `monitoring.md` doc to include guide for setting up Prometheus/Grafana for monitoring; mainly using instructions from [modelmesh-performance](https://github.com/kserve/modelmesh-performance/tree/main/docs/monitoring)

#### Result

- A functional, ready-to-use dashboard JSON
- General documentation outlining how to set up monitoring and use the JSON
- Closes kserve#335

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: Luis Delossantos <luisgd@ibm.com>
There is a grafana dashboard we developed for displaying the various prometheus metrics exposed by modelmesh in a clear/useful manner.
However, it was created originally for "standalone" modelmesh use where there's only a single runtime / Deployment per logical modelmesh service. This means it should work well as-is with modelmesh-serving in cases where only one runtime is used, but needs to be tested and probably adjusted/generalized for multi-runtime use.
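One way the generalization might look is to have each panel query group by a runtime label and filter on a Grafana drop-down variable. The sketch below only builds the query string; the metric name `example_metric`, the label name `deployment`, and the variable name `runtime` are all assumptions for illustration and are not taken from the actual dashboard JSON.

```python
def per_runtime_query(metric: str, label: str = "deployment", variable: str = "runtime") -> str:
    """Build a PromQL expression that sums `metric` per runtime.

    `label` and `variable` are hypothetical names: the Prometheus label that
    identifies the runtime, and the Grafana dashboard variable holding the
    user's selection.
    """
    # `=~` is a regex match, which is what Grafana multi-value variables need,
    # since several selected values are expanded into a regex alternation.
    selector = f'{metric}{{{label}=~"${variable}"}}'
    return f"sum by ({label}) ({selector})"


# Hypothetical metric name, used only to show the shape of the result.
print(per_runtime_query("example_metric"))
```

With a single runtime selected this should reduce to the same series the standalone dashboard shows, so the global view and the per-runtime view can share one query template.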
Potential issues include the assumption that the available memory is a single pool available to any model; with multiple runtimes it could in reality be effectively partitioned per runtime.
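To make the memory concern concrete, here is a minimal sketch with made-up numbers. The runtime names, capacities, and the `RuntimePool` type are hypothetical and do not reflect how ModelMesh itself tracks memory; the point is only that a global "free memory" figure can look healthy while one runtime is nearly full.

```python
from dataclasses import dataclass


@dataclass
class RuntimePool:
    name: str         # hypothetical runtime name
    capacity_mb: int  # memory available for models in this runtime
    used_mb: int      # memory currently occupied by loaded models


def global_free_mb(pools):
    """Free memory if all runtimes were one shared pool (the single-pool assumption)."""
    return sum(p.capacity_mb - p.used_mb for p in pools)


def can_load(pools, runtime, model_mb):
    """Whether a model fits in the specific runtime that would have to serve it."""
    pool = next(p for p in pools if p.name == runtime)
    return pool.capacity_mb - pool.used_mb >= model_mb


# Made-up example: runtime-a is almost full, runtime-b is mostly empty.
pools = [
    RuntimePool("runtime-a", capacity_mb=4096, used_mb=4000),
    RuntimePool("runtime-b", capacity_mb=4096, used_mb=1024),
]

print(global_free_mb(pools))                 # aggregate view across both runtimes
print(can_load(pools, "runtime-a", 512))     # per-runtime view for runtime-a
print(can_load(pools, "runtime-b", 512))     # per-runtime view for runtime-b
```

In this example the aggregate view reports plenty of free memory, yet a 512 MB model that needs `runtime-a` would not fit, which is the kind of case a per-runtime panel would need to surface.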
Here is the current exported dashboard json: ModelMeshMetricsDashboard.json.gz