Priority plugin: Expose data via prometheus exporter #456

Closed
QuantumDancer opened this issue Sep 19, 2023 · 0 comments · Fixed by #471

QuantumDancer commented Sep 19, 2023

We currently export the resource usage and the updated priority of each group to a text file via a simple command:

group_mapping:
  group1:
    - "slurm_partition_group1"
  group2:
    - "slurm_partition_group2"
commands:
  - '/usr/bin/bash -c "/usr/bin/echo \"$(date --rfc-3339=sec --utc) | {resource} | {priority}\" >> {group}.txt"'

It would be nice to have this information also available as live data in our Grafana monitoring.
Therefore, this data should also be exposed via a Prometheus exporter.

Config

I envision a config like this:

prometheus:
  enable: true
  addr: 0.0.0.0
  port: 8000
  metrics:
    - ResourceUsage
    - Priority
  • enable: true or false; enables or disables the Prometheus exporter
  • addr: Address of the Prometheus exporter. In most cases, this should be 0.0.0.0, localhost, or the IP of the machine the plugin is running on
  • port: Port of the Prometheus exporter
  • metrics: A list of metrics to expose
    • ResourceUsage: corresponds to {resource} in the bash command above
    • Priority: the priority of the group
The metrics should be exported for each configured group.
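
To make the per-group export concrete, here is a minimal sketch of how the values could be rendered in the Prometheus text exposition format, with the group as a label. The metric names `priority_plugin_resource_usage` and `priority_plugin_priority`, the `group` label, and the `render_metrics` helper are assumptions for illustration; the issue only names the metrics `ResourceUsage` and `Priority`.

```python
from typing import List, Mapping

# Hypothetical metric names; the issue only specifies "ResourceUsage" and "Priority".
METRIC_NAMES = {
    "ResourceUsage": "priority_plugin_resource_usage",
    "Priority": "priority_plugin_priority",
}


def render_metrics(values: Mapping[str, Mapping[str, float]], enabled: List[str]) -> str:
    """Render per-group gauges in the Prometheus text exposition format.

    `values` maps a group name to {"ResourceUsage": ..., "Priority": ...}.
    `enabled` is the `metrics` list from the config.
    Simplification: group names are assumed to need no label escaping.
    """
    lines = []
    for metric in enabled:
        name = METRIC_NAMES[metric]
        lines.append(f"# TYPE {name} gauge")
        for group in sorted(values):
            if metric in values[group]:
                lines.append(f'{name}{{group="{group}"}} {values[group][metric]}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {"group1": {"ResourceUsage": 12.5, "Priority": 3}}
    print(render_metrics(sample, ["ResourceUsage", "Priority"]), end="")
```

Using one metric per quantity with a `group` label, rather than one metric name per group, keeps Grafana queries uniform across groups.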

Internals

Right now, the priority plugin is designed to be executed by an external scheduler, e.g. a cron job. Because the Prometheus exporter has to run continuously, we need to change the plugin so that it can run as a service. This means we need a way to call the existing functionality of the priority plugin periodically.

Add something like a frequency setting to the config that specifies how often to update the resource usage and priority.

frequency: 3600

After each update, store the resource usage and priority somewhere so that it can be used for the Prometheus exporter.
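
The update-and-store cycle described above could look roughly like the following sketch. It assumes that `frequency` is given in seconds (the issue does not state a unit) and uses hypothetical names (`MetricStore`, `run_updates`, `compute`) that stand in for the plugin's existing update logic; it is not the plugin's actual implementation.

```python
import threading
from typing import Callable, Dict

Values = Dict[str, Dict[str, float]]


class MetricStore:
    """Thread-safe holder for the latest per-group values."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: Values = {}

    def replace(self, values: Values) -> None:
        # Copy on write so callers cannot mutate the stored state.
        with self._lock:
            self._values = {g: dict(v) for g, v in values.items()}

    def snapshot(self) -> Values:
        # Copy on read so the exporter sees a consistent view.
        with self._lock:
            return {g: dict(v) for g, v in self._values.items()}


def run_updates(
    compute: Callable[[], Values],
    store: MetricStore,
    frequency: float,
    stop: threading.Event,
) -> None:
    """Call compute() every `frequency` seconds (assumed unit) until `stop` is set."""
    while not stop.is_set():
        store.replace(compute())
        # Event.wait acts as a sleep that returns early on shutdown.
        stop.wait(frequency)
```

In a service, `run_updates` would typically run in a background thread while the exporter reads `store.snapshot()` on each scrape.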

Other

Up to now the Auditor host and port are set like this in the config:

addr: "10.18.0.12"
port: 8001

To avoid confusion with the address and port settings for the Prometheus exporter, it might be better to group the Auditor settings into a separate section, i.e.:

auditor:
  addr: "10.18.0.12"
  port: 8001
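
If existing configs should keep working during such a change, the lookup could accept both layouts. The sketch below is one possible approach, not part of the proposal itself: the `auditor_endpoint` helper and the fallback to the top-level keys are assumptions, and the config is taken to be already parsed into a dictionary.

```python
from typing import Any, Mapping, Tuple


def auditor_endpoint(config: Mapping[str, Any]) -> Tuple[str, int]:
    """Return (addr, port) of the Auditor instance from a parsed config.

    Prefers the proposed nested `auditor` section and falls back to the
    current top-level `addr`/`port` keys (the fallback is an assumption).
    """
    section = config.get("auditor")
    if section is None:
        section = config
    return str(section["addr"]), int(section["port"])


if __name__ == "__main__":
    old_style = {"addr": "10.18.0.12", "port": 8001}
    new_style = {
        "auditor": {"addr": "10.18.0.12", "port": 8001},
        "prometheus": {"enable": True, "addr": "0.0.0.0", "port": 8000},
    }
    print(auditor_endpoint(old_style))
    print(auditor_endpoint(new_style))
```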