Add StorCli text collector example script #320

Merged: 2 commits, Dec 26, 2016

Conversation

mattbostock
Contributor

Collect metrics from the StorCLI utility on the health of MegaRAID
hardware RAID controllers and write them to stdout so that they can be
used by the textfile collector.

We parse the JSON output that StorCLI provides.

Script must be run as root or with appropriate capabilities for storcli
to access the RAID card.

Designed to run under Python 2.7, using the system Python provided with
many Linux distributions.

The metrics look like this:

    mbostock@host:~$ sudo ./storcli.py
    megaraid_status_code 0
    megaraid_controllers_count 1
    megaraid_emergency_hot_spare{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_scheduled_patrol_read{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_virtual_drives{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_drive_groups{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_virtual_drives_optimal{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_degraded{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 0
    megaraid_battery_backup_healthy{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_ports{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 8
    megaraid_failed{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 0
    megaraid_drive_groups_optimal{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_healthy{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    megaraid_physical_drives{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 24
    mbostock@host:~$

I don't code regularly in Python, so any suggestions for improvements are welcome.

@mattbostock mattbostock changed the title from "Add StorClI text collector example script" to "Add StorCli text collector example script" on Oct 4, 2016


def get_store_cli_json():
    storcli_cmd = ['/opt/MegaRAID/storcli/storcli64', 'show', 'all', 'J']
Contributor

This path needs to be configurable

Contributor Author

I left it hardcoded as I figured someone using this would likely copy and adapt it (as an example), but agree it's nicer to make it configurable. Would you prefer a commandline flag?

Member

Yes, a binary-path flag would be great.
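
For illustration, a flag like this could be wired up with `argparse` from the standard library. This is a minimal sketch, not the code that was merged: the flag name follows the `--storcli_path` option mentioned later in this thread, the default is the path hardcoded in the snippet under review, and `parse_args` and `build_storcli_cmd` are hypothetical helper names.

```python
import argparse

# Default taken from the path hardcoded in the snippet under review.
DEFAULT_STORCLI_PATH = '/opt/MegaRAID/storcli/storcli64'


def parse_args(argv=None):
    """Parse command-line flags; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description='Print MegaRAID metrics for the textfile collector.')
    parser.add_argument('--storcli_path',
                        default=DEFAULT_STORCLI_PATH,
                        help='path to the storcli binary '
                             '(default: %(default)s)')
    return parser.parse_args(argv)


def build_storcli_cmd(storcli_path):
    """Return the argument list for the command in the reviewed snippet."""
    return [storcli_path, 'show', 'all', 'J']
```

Keeping the hardcoded path as the default means existing invocations keep working, while a host with storcli installed elsewhere only needs to pass the flag.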

}

for name, value in controller_metrics.iteritems():
    print("{}{}{} {}".format(METRIC_PREFIX, name, labels, value))
Contributor

This only writes to stdout; it should take care of creating the file and replacing it atomically.

Contributor Author

This is intentional: I didn't want the script to need to know where the metrics should be written. Since that logic would likely be repeated in each script, it seems better to put it in the cronjob or whatever calls the script.

Member

I agree, I think we should have a standard interface for all of these scripts. I also would prefer stdout and not have to implement atomic writing in every script, and have a simple atomic write wrapper that can be used to put the metrics files in the correct place.
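
As a rough idea of what such a wrapper might look like, here is a minimal Python sketch. It is an assumption for illustration only, not an existing tool: `write_atomically` is a hypothetical name, and the approach (write to a temporary file in the destination directory, then rename over the target) assumes a POSIX system, where a rename within one filesystem replaces the destination atomically.

```python
import os
import tempfile


def write_atomically(data, dest_path):
    """Write data (a str) to dest_path via a temporary file and a rename."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temporary file lives in the destination directory so that the
    # rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp_file:
            tmp_file.write(data)
        # mkstemp creates the file with mode 0600. Loosening it assumes the
        # process reading the file runs as a different user.
        os.chmod(tmp_path, 0o644)
        # On POSIX, rename replaces any existing file in a single step.
        os.rename(tmp_path, dest_path)
    except Exception:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A caller could pipe a script's stdout into a small program that passes `sys.stdin.read()` and the target path to this function, so each metrics script stays unaware of where its output ends up.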

@brian-brazil
Contributor

Also, model doesn't belong as a label; the controller number is sufficient to identify the controller. I'd suggest using the machine role approach for it.

@mattbostock
Contributor Author

I hesitated to rely on the controller number as I didn't know how stable it is. Happy to remove the model label.

@brian-brazil
Contributor

If it's not stable, the model likely doesn't help you as the chances are you have identical models in a given machine.

@SuperQ
Member

SuperQ commented Oct 5, 2016

I would recommend a megaraid_controller_info metric with the controller and model labels. This allows for annotation without having the model label on every metric.
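
To make the suggestion concrete, here is a minimal sketch of how the output could be built so that `model` appears only on the info metric. The `format_metrics` function is hypothetical, and the value of `METRIC_PREFIX` is an assumption inferred from the metric names in the PR description; neither is taken from the script itself.

```python
# Assumed value, inferred from the metric names in the PR description.
METRIC_PREFIX = 'megaraid_'


def format_metrics(controller, model, controller_metrics):
    """Return exposition-format lines for one controller."""
    labels = '{{controller="{}"}}'.format(controller)
    lines = []
    # Ordinary metrics carry only the controller label.
    for name, value in sorted(controller_metrics.items()):
        lines.append('{}{}{} {}'.format(METRIC_PREFIX, name, labels, value))
    # The model appears once, on an info metric whose value is always 1.
    lines.append('{}controller_info{{controller="{}", model="{}"}} 1'.format(
        METRIC_PREFIX, controller, model))
    return lines
```

With this layout, a query that needs the model can join against the info metric, for example `megaraid_degraded * on(instance, controller) group_left(model) megaraid_controller_info` (assuming both series share the `instance` label).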

@SuperQ
Member

SuperQ commented Nov 27, 2016

Ping, any progress on finishing this?

@mattbostock
Contributor Author

Sorry for the delay, will get to this soon.

Member

@discordianfish discordianfish left a comment

Please address comments

Collect metrics from the StorCLI utility on the health of MegaRAID
hardware RAID controllers and write them to stdout so that they can be
used by the textfile collector.

We parse the JSON output that StorCLI provides.

Script must be run as root or with appropriate capabilities for storcli
to access the RAID card.

Designed to run under Python 2.7, using the system Python provided with
many Linux distributions.

The metrics look like this:

    mbostock@host:~$ sudo ./storcli.py
    megaraid_status_code 0
    megaraid_controllers_count 1
    megaraid_emergency_hot_spare{controller="0"} 1
    megaraid_scheduled_patrol_read{controller="0"} 1
    megaraid_virtual_drives{controller="0"} 1
    megaraid_drive_groups{controller="0"} 1
    megaraid_virtual_drives_optimal{controller="0"} 1
    megaraid_degraded{controller="0"} 0
    megaraid_battery_backup_healthy{controller="0"} 1
    megaraid_ports{controller="0"} 8
    megaraid_failed{controller="0"} 0
    megaraid_drive_groups_optimal{controller="0"} 1
    megaraid_healthy{controller="0"} 1
    megaraid_physical_drives{controller="0"} 24
    megaraid_controller_info{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
    mbostock@host:~$
@mattbostock
Contributor Author

@discordianfish @brian-brazil @SuperQ: I've amended to address the comments in this PR:

  • added a --storcli_path option to set the path to storcli
  • moved the model label to a megaraid_controller_info metric

I've also added --help and --version flags, and changed the logic so that the script fails more gracefully on machines where no MegaRAID cards are installed.

Any suggestions for changes welcome.

Member

@discordianfish discordianfish left a comment

Great, LGTM!

@discordianfish discordianfish merged commit ad1befe into prometheus:master Dec 26, 2016
@SuperQ SuperQ mentioned this pull request Jan 15, 2017