Pre-built grafana dashboard for modelmesh metrics #335

Closed
njhill opened this issue Feb 27, 2023 · 2 comments · Fixed by #346

Comments

@njhill
Member

njhill commented Feb 27, 2023

There is a Grafana dashboard that we developed for displaying the various Prometheus metrics exposed by modelmesh in a clear and useful manner.

However, it was created originally for "standalone" modelmesh use where there's only a single runtime / Deployment per logical modelmesh service. This means it should work well as-is with modelmesh-serving in cases where only one runtime is used, but needs to be tested and probably adjusted/generalized for multi-runtime use.

Potential issues include the fact that the available memory is assumed to be a single pool available to any model, whereas with multiple runtimes it could in reality be effectively partitioned.

Here is the current exported dashboard json: ModelMeshMetricsDashboard.json.gz
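
To illustrate the memory concern above, here is a minimal Python sketch. It is not taken from the dashboard or from modelmesh; the runtime names, byte counts, and helper functions are all hypothetical, and it simplifies by treating each runtime's free memory as one pool. It only shows how a mesh-wide total can look sufficient while the runtime that must host a given model is short on space.

```python
from collections import defaultdict


def free_memory_by_runtime(pods):
    """Sum free model memory per runtime instead of across the whole mesh."""
    totals = defaultdict(int)
    for runtime, free_bytes in pods:
        totals[runtime] += free_bytes
    return dict(totals)


def fits(model_bytes, runtime, per_runtime):
    """A model can only use memory that belongs to its own runtime."""
    return model_bytes <= per_runtime.get(runtime, 0)


# Made-up pods as (runtime name, free bytes); not real measurements.
pods = [("runtime-a", 4000), ("runtime-a", 2000), ("runtime-b", 500)]
per_runtime = free_memory_by_runtime(pods)
global_pool = sum(per_runtime.values())

print(per_runtime)
print(global_pool)
print(fits(1000, "runtime-b", per_runtime))
```

With these made-up numbers the single-pool view reports 6500 free bytes, yet a 1000-byte model that needs `runtime-b` does not fit in that runtime's 500 free bytes.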

@rafvasq
Member

rafvasq commented Feb 27, 2023

@njhill I'm wondering where the one we have in the modelmesh-performance repo fits into this. It includes some views like resource usage per runtime container. Would the resulting dashboard perhaps be a combination of the one you shared and the modelmesh-performance one?

@rafvasq rafvasq added the enhancement New feature or request label Feb 27, 2023
@njhill
Member Author

njhill commented Mar 1, 2023

@rafvasq this one is more comprehensive and I think should include most of what's in the other one. If there's anything important in the other one which isn't in this one then it can be added. I am not a fan of the big dials that show current values - they are nice for demos but I don't think very useful in practice.

@rafvasq rafvasq self-assigned this Mar 14, 2023
@ckadner ckadner added this to the v0.11.0 milestone Apr 14, 2023
kserve-oss-bot pushed a commit that referenced this issue May 27, 2023
#### Motivation
To port a previously built Grafana dashboard developed for displaying various Prometheus metrics for ModelMesh, and to build upon it to include metrics at the deployment/runtime view

#### Modifications
- Added `config/dashboard/` directory to host grafana dashboard JSON
- Added `servicemonitor.yaml` to `config/prometheus/`
- Added and modified `ModelMeshMetricsDashboard.json` to work out-of-the-box 
  - Added queries for deployment views to filter by serving runtime
  - Created sections for global-view and deployment-view visualizations
  - Created drop-down variables for filtering between one or many pre-existing runtimes with option to filter by custom serving runtime
- Updated `monitoring.md` doc to include guide for setting up Prometheus/Grafana for monitoring; mainly using instructions from [modelmesh-performance](https://github.com/kserve/modelmesh-performance/tree/main/docs/monitoring)

#### Result
- A functional, ready-to-use dashboard JSON
- General documentation outlining how to set up monitoring and use the JSON
- Closes #335 

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
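
As a rough picture of the runtime filtering described in the commit message above, the Python sketch below builds a PromQL selector from a list of selected runtimes, which is roughly what a multi-value dashboard variable expands to. The metric name `example_metric`, the label name `deployment`, and the function itself are assumptions for illustration and are not taken from `ModelMeshMetricsDashboard.json`. It also assumes runtime names contain no regex metacharacters.

```python
def runtime_selector(metric, runtimes, label="deployment"):
    """Build a PromQL selector that matches any of the given runtimes.

    An empty selection returns the bare metric, i.e. the global view.
    """
    if not runtimes:
        return metric
    # PromQL's =~ operator takes a regex; alternation matches any listed name.
    pattern = "|".join(runtimes)
    return f'{metric}{{{label}=~"{pattern}"}}'


# Hypothetical metric and runtime names, for illustration only.
global_view = runtime_selector("example_metric", [])
deployment_view = runtime_selector("example_metric", ["runtime-a", "runtime-b"])

print(global_view)
print(deployment_view)
```

Keeping the empty selection as the unfiltered metric mirrors the split between global-view and deployment-view panels: the same query shape serves both, and only the label matcher changes.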
lgdeloss pushed a commit to lgdeloss/modelmesh-serving that referenced this issue Jun 5, 2023
#### Motivation
To port a previously built Grafana dashboard developed for displaying various Prometheus metrics for ModelMesh, and to build upon it to include metrics at the deployment/runtime view

#### Modifications
- Added `config/dashboard/` directory to host grafana dashboard JSON
- Added `servicemonitor.yaml` to `config/prometheus/`
- Added and modified `ModelMeshMetricsDashboard.json` to work out-of-the-box
  - Added queries for deployment views to filter by serving runtime
  - Created sections for global-view and deployment-view visualizations
  - Created drop-down variables for filtering between one or many pre-existing runtimes with option to filter by custom serving runtime
- Updated `monitoring.md` doc to include guide for setting up Prometheus/Grafana for monitoring; mainly using instructions from [modelmesh-performance](https://github.com/kserve/modelmesh-performance/tree/main/docs/monitoring)

#### Result
- A functional, ready-to-use dashboard JSON
- General documentation outlining how to set up monitoring and use the JSON
- Closes kserve#335

Signed-off-by: Rafael Vasquez <raf.vasquez@ibm.com>
Signed-off-by: Luis Delossantos <luisgd@ibm.com>
lgdeloss pushed a commit to lgdeloss/modelmesh-serving that referenced this issue Jun 6, 2023