Added text collector conversion for ipmitool output. #746

Merged on Dec 1, 2017 (4 commits)


derekmarcotte
Contributor

Converts output of ipmitool sensor to prometheus format.

e.g.: ipmitool sensor | ./ipmitool > ipmitool.prom
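As a rough illustration of the kind of conversion involved (a minimal sketch, not the script from this PR): it assumes ipmitool sensor prints padded, pipe-separated columns in the order name, reading, unit. The function name ipmi_temp_to_prom and the field separator are illustrative assumptions, only temperature rows are handled, and the metric name follows the final output shown later in this thread.

```shell
# Hypothetical helper: read "ipmitool sensor"-style lines on stdin and
# print Prometheus text-format samples for temperature rows only.
ipmi_temp_to_prom() {
  awk '
    # Assumed layout: columns separated by "|" with space padding.
    BEGIN { FS = "[ ]*[|][ ]*" }

    # $1 is the sensor name, $2 the reading, $3 the unit/type field.
    $3 ~ /degrees C/ {
      printf("node_ipmi_temperature_celcius{sensor=\"%s\"} %f\n", $1, $2)
    }
  '
}
```

It would be used the same way as the script itself, for example ipmitool sensor | ipmi_temp_to_prom > ipmitool.prom.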

@SuperQ
Member

SuperQ commented Nov 28, 2017

Oh, nice. Awk script, bold choice. 😄


# $3 is type field
$3 ~ /degrees C/ {
printf("node_physical_temperature_celcius{sensor=\"%s\"} %f\n", $1, $2);
Member

I think node_ipmi_ might be a better choice here.

@SuperQ
Member

SuperQ commented Nov 28, 2017

Looks good. My only worry is that on some systems the output of ipmitool sensor isn't sorted by device class.

If the sensor output wasn't sorted, the metrics would come out interleaved, like this:

node_physical_volts{sensor="CPU Vcore"} 1.160000
node_physical_temperature_celcius{sensor="System Temp"} 38.000000
node_physical_volts{sensor="CPU DIMM"} 1.512000
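One way to avoid interleaving is to buffer the samples per metric name and print them group by group at the end. The following is a minimal sketch under assumptions, not the change made in this PR: the function name group_by_metric, the field separator, the "Volts" unit string, and the fixed output order are all illustrative, and it uses the node_ipmi_ prefix suggested in review.

```shell
# Hypothetical helper: buffer samples per metric name, then print each
# group contiguously so metric names never interleave.
group_by_metric() {
  awk '
    BEGIN {
      FS = "[ ]*[|][ ]*"     # assumed "|"-separated, space-padded columns
      # Fixed print order, because "for (k in array)" order is unspecified.
      n = split("temperature_celcius volts", order, " ")
    }

    # Append one formatted sample to the buffer for this metric name.
    function add(name) {
      buf[name] = buf[name] sprintf("node_ipmi_%s{sensor=\"%s\"} %f\n", name, $1, $2)
    }

    $3 ~ /degrees C/ { add("temperature_celcius") }
    $3 ~ /Volts/     { add("volts") }

    END {
      for (i = 1; i <= n; i++) printf("%s", buf[order[i]])
    }
  '
}
```

With the three unsorted readings above as input, both volts samples come out together after the temperature sample.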

@SuperQ
Member

SuperQ commented Nov 28, 2017

Oh, I tested on one of my systems, and it looks like the hex conversion doesn't work right:

PS Status        | 0x1        | discrete   | 0x01ff| na        | na        | na        | na        | na        | na

This returns:

node_physical_status{sensor="PS Status"} 0.000000

@derekmarcotte
Contributor Author

I don't understand why the order is important. I might be able to fix it if I do. Wouldn't it be up to the query/report to make sense of the context of the readings?

Are you able to send me your platform details? It seems echo '0x1' | awk '{ print $1+0 }' doesn't print 1 on GNU Awk 4.0.1, but I want to make sure my fix works on your platform too.

@derekmarcotte
Contributor Author

Seems --non-decimal-data is the key. I'll update the shebang. It works on both BSD and gawk.
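For comparison, hex readings can also be parsed by hand, which avoids depending on how a particular awk treats 0x-prefixed input. This is a sketch of that alternative, not the --non-decimal-data fix chosen in this PR; the names hex_to_dec and hex2dec are hypothetical.

```shell
# Hypothetical helper: print each input value as a float, converting
# 0x-prefixed hex values by hand instead of relying on awk-specific flags.
hex_to_dec() {
  awk '
    # Convert a hex string such as "0x01ff" to a number; -1 on bad input.
    function hex2dec(s,    i, c, v, n) {
      s = tolower(s)
      sub(/^0x/, "", s)
      n = 0
      for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1)
        v = index("0123456789abcdef", c) - 1
        if (v < 0) return -1
        n = n * 16 + v
      }
      return n
    }

    # Hex-looking values go through hex2dec; everything else is used as-is.
    { printf("%f\n", ($1 ~ /^0[xX][0-9a-fA-F]+$/) ? hex2dec($1) : $1) }
  '
}
```

For example, echo 0x1 | hex_to_dec prints 1.000000, which is the value expected for the PS Status reading above.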

@SuperQ
Member

SuperQ commented Nov 29, 2017

Order is important because Prometheus will drop the scrape if metric names come in out of order. It seems like the order is ok for the one system I tested, but it's a concern I have. If you look at a typical metric, we also have the # HELP and # TYPE lines as headers to each metric name.

The platform I tested on was Ubuntu 16.04, GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.4, GNU MP 6.1.0). ipmitool version 1.8.16.
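To make the ordering concern concrete, a quick check for interleaved metric names might look like the sketch below. It is only a rough approximation for illustration, not a replacement for promtool, and the function name check_grouped is hypothetical.

```shell
# Hypothetical helper: exit non-zero if a metric name reappears after a
# different metric name has been seen (i.e. samples are interleaved).
check_grouped() {
  awk '
    BEGIN { bad = 0 }

    # Skip comment lines (# HELP, # TYPE) and blank lines.
    /^#/ || /^$/ { next }

    {
      name = $0
      sub(/[{ ].*$/, "", name)   # strip labels and value, keep the name
      if (name != prev && (name in seen)) {
        print "interleaved: " name
        bad = 1
      }
      seen[name] = 1
      prev = name
    }

    END { exit bad }
  '
}
```

Something like ./ipmitool < sensor-output.txt | check_grouped could then be run against captured output from several machines; sensor-output.txt is a placeholder file name.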

delete a["metric_count"]

if (name != "status") {
printf("# TYPE %s%s gauge\n", namespace, name);
Member

This also needs to print a # HELP metric_name Something.

Contributor Author

Happy to add something. I was going to ask about this: is a comment like "Readings from the {voltage,temperature,etc} sensors via ipmitool, values are machine-dependent" valuable? (That's what I would add if left to my own devices.) I don't think I can interpret the readings much beyond that. Is it desirable to have a comment that adds no additional information?

Additionally, I'd like to mention the shebang. I've defaulted it to work with gawk, because that's the most likely target, but that shebang syntax doesn't work on BSD. It's common to patch this in the ports tree, though, and an alternate syntax is provided.

Member

"Readings from the {voltage,temperature,etc} sensors via ipmitool" should be sufficient. We don't really do anything with this data yet, but having it there is useful for future-proofing.

I think the shebang is fine.

Contributor Author

What about skipping the type on status? I feel that because there is no linear relationship between possible values, it isn't really a gauge (nor any of the other possible types, of course). Is it simply that the value goes both up and down that makes it a gauge?


printf("# HELP %s%s %s sensor reading from ipmitool\n", namespace, name, friendly[name]);

if (name != "status") {
Member

Why is this not printing the type? The status is a gauge. From a Prometheus perspective, basically anything that isn't a monotonically increasing counter is considered a gauge.

I would remove the if so that you always get a TYPE line.
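A minimal sketch of what always emitting both header lines could look like, based on the printf calls quoted in this review. The wrapper function print_metric_header and its two parameters are hypothetical; the real script builds these values internally.

```shell
# Hypothetical helper: print the HELP and TYPE headers for one metric.
#   $1 = metric name suffix (e.g. "status")
#   $2 = friendly description (e.g. "Chassis status")
print_metric_header() {
  awk -v name="$1" -v friendly="$2" '
    BEGIN {
      namespace = "node_ipmi_"
      printf("# HELP %s%s %s sensor reading from ipmitool\n", namespace, name, friendly)
      # No special case for "status": every metric gets a TYPE line.
      printf("# TYPE %s%s gauge\n", namespace, name)
    }
  '
}
```

Calling print_metric_header status "Chassis status" prints a HELP line followed by a TYPE line declaring node_ipmi_status as a gauge.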

@SuperQ
Member

SuperQ commented Dec 1, 2017

Tested on one of my servers, output looks great.

# HELP node_ipmi_temperature_celcius Temperature sensor reading from ipmitool
# TYPE node_ipmi_temperature_celcius gauge
node_ipmi_temperature_celcius{sensor="System Temp"} 37.000000
# HELP node_ipmi_volts Voltage sensor reading from ipmitool
# TYPE node_ipmi_volts gauge
node_ipmi_volts{sensor="+3.3VSB"} 3.240000
node_ipmi_volts{sensor="+5 V"} 5.088000
node_ipmi_volts{sensor="CPU Mem VTT"} 0.752000
node_ipmi_volts{sensor="CPU DIMM"} 1.512000
node_ipmi_volts{sensor="+3.3 V"} 3.288000
node_ipmi_volts{sensor="+1.8 V"} 1.840000
node_ipmi_volts{sensor="VBAT"} 3.240000
node_ipmi_volts{sensor="+1.1 V"} 1.104000
node_ipmi_volts{sensor="-12 V"} -12.580000
node_ipmi_volts{sensor="CPU Vcore"} 1.184000
node_ipmi_volts{sensor="+12 V"} 11.978000
node_ipmi_volts{sensor="HT Voltage"} 1.184000
# HELP node_ipmi_speed_rpm Fan sensor reading from ipmitool
# TYPE node_ipmi_speed_rpm gauge
node_ipmi_speed_rpm{sensor="FAN 4"} 8281.000000
node_ipmi_speed_rpm{sensor="FAN 1"} 9216.000000
node_ipmi_speed_rpm{sensor="FAN 2"} 8281.000000
node_ipmi_speed_rpm{sensor="FAN 3"} 9216.000000
# HELP node_ipmi_status Chassis status sensor reading from ipmitool
# TYPE node_ipmi_status gauge
node_ipmi_status{sensor="CPU Temp"} 0.000000
node_ipmi_status{sensor="Intrusion"} 1.000000
node_ipmi_status{sensor="PS Status"} 1.000000

It also passes promtool's check metrics mode.

Member

@SuperQ SuperQ left a comment

LGTM

@SuperQ SuperQ merged commit 1527789 into prometheus:master Dec 1, 2017
@derekmarcotte
Contributor Author

Thanks for the review! 👍

@derekmarcotte derekmarcotte deleted the dm-ipmitool branch December 1, 2017 12:53
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
* Added text collector conversion for ipmitool output.

* Sort metrics before exporting, add namespace.

* Added HELP string, tidy up a bit.

* Make status a gauge.