-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request] Allow excluding certain hwmon data sources #2681
Comments
Yeah I think include/exclude flags like in other collectors would be reasonable to add for this. |
Feel free to submit a PR. Have a look at the include/exclude flag implementation in other collectors for guidance |
@xen0n, if you don't want to implement this, i'm happy to take a look at this either |
Thanks for offering help! Indeed it may take a long time for me to implement this myself: for one thing I'm way too occupied already (I take on many upstream roles and that's not counting the day job), and for another thing I'm not very familiar with |
Happy to be corrected, but since the include/exclude flags are a suggested way to implement this, probably doesn't need a design. I'll take a look at making a PR |
I was initially referring to the decision one has to make regarding which kind of identifier the include/exclude patterns are going to match against, e.g. IDs like |
This issue can be closed out now. The next release will include #2699 |
Thanks very much for your contribution! However I just noticed the README changes and a corresponding CHANGELOG entry weren't added along with the PR. I'll add them shortly. |
Host operating system
Linux lily 6.3.0-next-20230505-14814-g94feb5819da7 #1 SMP PREEMPT Tue Aug 30 11:11:44 AM CST 2022 loongarch64 GNU/Linux
This behavior is not tied to architecture quirks though, even with this exotic arch.
node_exporter version:
node_exporter command line flags
/usr/sbin/node_exporter --collector.textfile.directory=/var/lib/node_exporter/
node_exporter log output
(not really applicable, current behavior is expected)
Are you running node_exporter in Docker?
No
What did you do that produced an error?
Just running
node_exporter
as a regular systemd service without any additional config.What did you expect to see?
hwmon data collection wouldn't pull from an
amdgpu
source that's runtime suspended in D3hot state, or at least could be configured to exclude this source.What did you see instead?
Each metrics pull wakes up the GPU that's only going back to sleep because there's no monitor attached, flooding the dmesg log. (Each wakeup produces 25 lines of amdgpu log, and in my case Prometheus pulls every 8~10 seconds, just enough for the GPU to go back to sleep again.) See also the finding on amd-gfx and dri-devel.
Right now without modifying sources I can only either (1) disable the hwmon collector altogether, or (2) modprobe amdgpu with
runpm=0
that's going to waste some power. If this certain use case (runningnode_exporter
on machines with GPUs but seldom used via GUI) is worthwhile to support, it could be useful to allow excluding certain hwmon sources based on name (e.g.amdgpu
) or device ID (e.g.hwmon1
).Adding awareness for devices' power states could be useful too, but as not all device classes provide such information in their sysfs entries, this might or might not be easy. PCI devices all have
power_state
though, and we could choose to skip collection if power state is not D0 i.e. fully powered on.The text was updated successfully, but these errors were encountered: