RAM monitoring apparently has a hole #675

Open
pboothe opened this issue May 11, 2020 · 2 comments

Comments

@pboothe
Contributor

pboothe commented May 11, 2020

The hole should be discovered and plugged.

From https://github.com/m-lab/ops-tracker/issues/1085

@stephen-soltesz
Contributor

Two possible strategies:

  • can we poll the DRAC for this information?
  • can we change kernel configuration to look in the “right” place so node-exporter reports the metrics we need?
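
For the first strategy, here is a rough sketch of what checking DRAC-reported memory health could look like once the data has been fetched. It assumes the DRAC can return a Redfish-style JSON list of memory resources, each with a `Status` object holding `Health` and `State`. The sample payload, the DIMM ids, and the function name `unhealthy_dimms` are all made up for illustration, and no network call is made.

```python
import json
from typing import List

# Hypothetical sample shaped like Redfish-style memory resources.
# The ids and values are invented for illustration, not taken from a real DRAC.
SAMPLE_PAYLOAD = json.dumps([
    {"Id": "DIMM.Socket.A1", "Status": {"Health": "OK", "State": "Enabled"}},
    {"Id": "DIMM.Socket.A2", "Status": {"Health": "Critical", "State": "Enabled"}},
    {"Id": "DIMM.Socket.B1", "Status": {"Health": None, "State": "Absent"}},
])


def unhealthy_dimms(payload: str) -> List[str]:
    """Return the ids of populated DIMMs whose reported health is not "OK"."""
    bad = []
    for dimm in json.loads(payload):
        status = dimm.get("Status") or {}
        # Skip slots that are not enabled (for example, empty sockets);
        # they carry no meaningful health reading.
        if status.get("State") != "Enabled":
            continue
        if status.get("Health") != "OK":
            bad.append(dimm.get("Id", "unknown"))
    return bad


if __name__ == "__main__":
    print(unhealthy_dimms(SAMPLE_PAYLOAD))
```

Whether the real DRAC output matches this shape would need to be checked against an actual machine before relying on it.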

@nkinkade
Contributor

nkinkade commented Jun 8, 2020

Maybe we need something like an idrac_exporter? Alternatively, though it feels like banging a round peg into a square hole, we could piggyback on the reboot-service, which already logs into every DRAC periodically. It doesn't seem like a huge stretch for it to log in, as it already does, and then also check for reported errors.
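
As a rough illustration of the exporter idea, the sketch below turns per-DIMM health readings into the Prometheus text exposition format. The metric name `drac_memory_healthy`, the label names, and the function `render_metrics` are hypothetical and not part of any existing exporter or of the reboot-service. It also assumes label values contain no quotes, backslashes, or newlines, since no escaping is done.

```python
from typing import Dict


def render_metrics(machine: str, health_by_dimm: Dict[str, bool]) -> str:
    """Render one gauge sample per DIMM in Prometheus text exposition format.

    health_by_dimm maps a DIMM id to True (healthy) or False (error reported).
    """
    lines = [
        # Hypothetical metric name; choose one that fits local naming rules.
        "# HELP drac_memory_healthy 1 if the DRAC reports the DIMM as healthy, else 0.",
        "# TYPE drac_memory_healthy gauge",
    ]
    # Sort for stable output, which makes diffs and tests predictable.
    for dimm in sorted(health_by_dimm):
        value = 1 if health_by_dimm[dimm] else 0
        lines.append(
            'drac_memory_healthy{machine="%s",dimm="%s"} %d' % (machine, dimm, value)
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics("mlab1.example", {"A1": True, "A2": False}), end="")
```

With output like this served over HTTP, an alert could fire whenever the gauge is 0 for any DIMM, regardless of whether the check runs in a standalone exporter or alongside the existing DRAC login.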
