Collect InfiniBand port state and physical state #1357

Merged (1 commit) on Nov 22, 2019

@bdrung
Contributor

bdrung commented May 28, 2019

Parsing the sysfs files for InfiniBand should be added to the procfs library (see prometheus/procfs#164). Then also collect the InfiniBand port state, the physical state, and the maximum signal transfer rate.
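
For readers unfamiliar with the sysfs layout: the port `state` and `phys_state` files typically hold a value such as `4: ACTIVE` or `5: LinkUp`. The following is a minimal sketch of splitting such a value into its numeric ID and name. The function name `parsePortState` is hypothetical, and this is not the code from prometheus/procfs or from this pull request.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortState is a hypothetical helper (not the procfs API). It splits a
// sysfs value such as "4: ACTIVE" into its numeric ID and its name.
func parsePortState(s string) (uint, string, error) {
	parts := strings.SplitN(strings.TrimSpace(s), ":", 2)
	if len(parts) != 2 {
		return 0, "", fmt.Errorf("unexpected state format: %q", s)
	}
	id, err := strconv.ParseUint(strings.TrimSpace(parts[0]), 10, 32)
	if err != nil {
		return 0, "", fmt.Errorf("invalid state ID in %q: %w", s, err)
	}
	return uint(id), strings.TrimSpace(parts[1]), nil
}

func main() {
	// Sample values as they might be read from the state and phys_state files.
	for _, raw := range []string{"4: ACTIVE\n", "5: LinkUp\n"} {
		id, name, err := parsePortState(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("id=%d name=%s\n", id, name)
	}
}
```

Exporting the numeric ID rather than the name keeps the metric value a plain number, which is why the thread later discusses documenting the ID meanings in the description line.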

bdrung added a commit to bdrung/node_exporter that referenced this pull request Jun 25, 2019
procfs v0.0.3 is a requirement for
prometheus#1357

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
SuperQ pushed a commit that referenced this pull request Jun 25, 2019
procfs v0.0.3 is a requirement for
#1357

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
bdrung added a commit to bdrung/node_exporter that referenced this pull request Jul 1, 2019
procfs v0.0.4-0.20190627154503-39e1aff1547e is a requirement for
prometheus#1357 (because procfs
v0.0.3 contained bug prometheus/procfs#187)

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
SuperQ pushed a commit that referenced this pull request Jul 1, 2019
procfs v0.0.4-0.20190627154503-39e1aff1547e is a requirement for
#1357 (because procfs
v0.0.3 contained bug prometheus/procfs#187)

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
@pgier
Contributor

pgier commented Aug 28, 2019

I think this one can be closed since it is the same as #1396

@bdrung
Contributor Author

bdrung commented Aug 28, 2019

This pull request contains two more commits that #1396 does not.

@bdrung bdrung changed the title from "Use InfiniBandClass() from procfs library and collect InfiniBand port state and physical state" to "Collect InfiniBand port state and physical state" Aug 28, 2019
@bdrung
Contributor Author

bdrung commented Aug 28, 2019

This pull request depends on #1396. I adjusted the pull request title to make the difference clearer.

@discordianfish
Member

Just merged #1396, can you rebase this?

@bdrung
Contributor Author

bdrung commented Sep 24, 2019

Rebased. Now it is a simple 6 line change.

Review thread on collector/infiniband_linux.go (outdated, resolved)
@bdrung
Contributor Author

bdrung commented Oct 15, 2019

I could add the following to the physical_state_id description:

(0: no change, 1: sleep, 2: polling, 3: disable, 4: shift, 5: link up, 6: link error recover, 7: phytest)

and to state_id:

(0: no change, 1: down, 2: init, 3: armed, 4: active, 5: act defer)

Do you like it or would it make the description line too long?
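
As an illustration only, the two mappings proposed above could also be written as lookup tables. This is a sketch with hypothetical names (`portStateNames`, `physStateNames`, `describe`); the pull request itself puts the text into the metric description line rather than into a table like this.

```go
package main

import "fmt"

// portStateNames mirrors the state_id values listed in the comment above.
var portStateNames = map[uint]string{
	0: "no change",
	1: "down",
	2: "init",
	3: "armed",
	4: "active",
	5: "act defer",
}

// physStateNames mirrors the physical_state_id values listed in the comment above.
var physStateNames = map[uint]string{
	0: "no change",
	1: "sleep",
	2: "polling",
	3: "disable",
	4: "shift",
	5: "link up",
	6: "link error recover",
	7: "phytest",
}

// describe returns the name for an ID, or a placeholder for IDs not in the table.
func describe(names map[uint]string, id uint) string {
	if name, ok := names[id]; ok {
		return name
	}
	return fmt.Sprintf("unknown (%d)", id)
}

func main() {
	fmt.Println("state_id 4:", describe(portStateNames, 4))
	fmt.Println("physical_state_id 5:", describe(physStateNames, 5))
}
```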

@discordianfish
Member

@bdrung Yeah I think that would be helpful

Collect the InfiniBand port state, the physical state, and the maximum
signal transfer rate.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
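
For context on the signal transfer rate: the sysfs `rate` file for a port commonly holds a value such as `100 Gb/sec (4X EDR)`. The sketch below converts such a value into bytes per second. The function name `parseRateBytesPerSecond` is hypothetical, the exact file format and the choice of bytes per second as the unit are assumptions, and this is not the implementation used by procfs or by this commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRateBytesPerSecond is a hypothetical helper. It assumes the input looks
// like "100 Gb/sec (4X EDR)" and converts the leading gigabit figure to bytes
// per second (1 Gb/sec = 1e9 bits/sec = 1.25e8 bytes/sec).
func parseRateBytesPerSecond(s string) (float64, error) {
	fields := strings.Fields(s)
	if len(fields) < 2 || fields[1] != "Gb/sec" {
		return 0, fmt.Errorf("unexpected rate format: %q", s)
	}
	gbit, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, fmt.Errorf("invalid rate value in %q: %w", s, err)
	}
	return gbit * 1e9 / 8, nil
}

func main() {
	rate, err := parseRateBytesPerSecond("100 Gb/sec (4X EDR)\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.0f bytes per second\n", rate)
}
```
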
@bdrung bdrung force-pushed the master branch 2 times, most recently from 2fefe92 to ad61351 October 29, 2019 09:51
@bdrung
Contributor Author

bdrung commented Oct 29, 2019

Extended the description line.

@discordianfish
Member

@SuperQ @pgier Final review and merge?

Member

@SuperQ SuperQ left a comment

LGTM.

@SuperQ SuperQ merged commit 04fbcff into prometheus:master Nov 22, 2019
SuperQ added a commit that referenced this pull request May 25, 2020
* The netdev collector CLI argument `--collector.netdev.ignored-devices` was renamed to `--collector.netdev.device-blacklist` in order to conform with the systemd collector. #1279
* The label named `state` on `node_systemd_service_restart_total` metrics was changed to `name` to better describe the metric. #1393
* Refactoring of the mdadm collector changes several metrics
    - `node_md_disks_active` is removed
    - `node_md_disks` now has a `state` label for "fail", "spare", "active" disks.
    - `node_md_is_active` is replaced by `node_md_state` with a state set of "active", "inactive", "recovering", "resync".
* Additional label `mountaddr` added to NFS device metrics to distinguish mounts from the same URL, but different IP addresses. #1417
* Metrics node_cpu_scaling_frequency_min_hrts and node_cpu_scaling_frequency_max_hrts of the cpufreq collector were renamed to node_cpu_scaling_frequency_min_hertz and node_cpu_scaling_frequency_max_hertz. #1510
* Collectors that are enabled, but are unable to find data to collect, now return 0 for `node_scrape_collector_success`.

* [CHANGE] Add `--collector.netdev.device-whitelist`. #1279
* [CHANGE] Ignore iso9660 filesystem on Linux #1355
* [CHANGE] Refactor mdadm collector #1403
* [CHANGE] Add `mountaddr` label to NFS metrics. #1417
* [CHANGE] Don't count empty collectors as success. #1613
* [FEATURE] New flag to disable default collectors #1276
* [FEATURE] Add experimental TLS support #1277, #1687, #1695
* [FEATURE] Add collector for Power Supply Class #1280
* [FEATURE] Add new schedstat collector #1389
* [FEATURE] Add FreeBSD zfs support #1394
* [FEATURE] Add uname support for Darwin and OpenBSD #1433
* [FEATURE] Add new metric node_cpu_info #1489
* [FEATURE] Add new thermal_zone collector #1425
* [FEATURE] Add new cooling_device metrics to thermal zone collector #1445
* [FEATURE] Add swap usage on darwin #1508
* [FEATURE] Add Btrfs collector #1512
* [FEATURE] Add RAPL collector #1523
* [FEATURE] Add new softnet collector #1576
* [FEATURE] Add new udp_queues collector #1503
* [FEATURE] Add basic authentication #1673
* [ENHANCEMENT] Log pid when there is a problem reading the process stats #1341
* [ENHANCEMENT] Collect InfiniBand port state and physical state #1357
* [ENHANCEMENT] Include additional XFS runtime statistics. #1423
* [ENHANCEMENT] Report non-fatal collection errors in the exporter metric. #1439
* [ENHANCEMENT] Expose IPVS firewall mark as a label #1455
* [ENHANCEMENT] Add check for systemd version before attempting to query certain metrics. #1413
* [ENHANCEMENT] Add a flag to adjust mount timeout #1486
* [ENHANCEMENT] Add new counters for flush requests in Linux 5.5 #1548
* [ENHANCEMENT] Add metrics and tests for UDP receive and send buffer errors #1534
* [ENHANCEMENT] The sockstat collector now exposes IPv6 statistics in addition to the existing IPv4 support. #1552
* [ENHANCEMENT] Add infiniband info metric #1563
* [ENHANCEMENT] Add unix socket support for supervisord collector #1592
* [ENHANCEMENT] Implement loadavg on all BSDs without cgo #1584
* [ENHANCEMENT] Add model_name and stepping to node_cpu_info metric #1617
* [ENHANCEMENT] Add `--collector.perf.cpus` to allow setting the CPU list for perf stats. #1561
* [ENHANCEMENT] Add metrics for IO errors and retries on Darwin. #1636
* [ENHANCEMENT] Add perf tracepoint collection flag #1664
* [ENHANCEMENT] ZFS: read contents of objset file #1632
* [ENHANCEMENT] Linux CPU: Cache CPU metrics to make them monotonically increasing #1711
* [BUGFIX] Read /proc/net files with a single read syscall #1380
* [BUGFIX] Renamed label `state` to `name` on `node_systemd_service_restart_total`. #1393
* [BUGFIX] Fix netdev nil reference on Darwin #1414
* [BUGFIX] Strip path.rootfs from mountpoint labels #1421
* [BUGFIX] Fix seconds reported by schedstat #1426
* [BUGFIX] Fix empty string in path.rootfs #1464
* [BUGFIX] Fix typo in cpufreq metric names #1510
* [BUGFIX] Read /proc/stat in one syscall #1538
* [BUGFIX] Fix OpenBSD cache memory information #1542
* [BUGFIX] Refactor textfile collector to avoid looping defer #1549
* [BUGFIX] Fix network speed math #1580
* [BUGFIX] collector/systemd: use regexp to extract systemd version #1647
* [BUGFIX] Fix initialization in perf collector when using multiple CPUs #1665
* [BUGFIX] Fix accidentally empty lines in meminfo_linux #1671

Signed-off-by: Ben Kochie <superq@gmail.com>
@SuperQ SuperQ mentioned this pull request May 25, 2020
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
procfs v0.0.3 is a requirement for
prometheus#1357

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
procfs v0.0.4-0.20190627154503-39e1aff1547e is a requirement for
prometheus#1357 (because procfs
v0.0.3 contained bug prometheus/procfs#187)

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
Collect the InfiniBand port state, the physical state, and the maximum
signal transfer rate.

Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
* The netdev collector CLI argument `--collector.netdev.ignored-devices` was renamed to `--collector.netdev.device-blacklist` in order to conform with the systemd collector. prometheus#1279
* The label named `state` on `node_systemd_service_restart_total` metrics was changed to `name` to better describe the metric. prometheus#1393
* Refactoring of the mdadm collector changes several metrics
    - `node_md_disks_active` is removed
    - `node_md_disks` now has a `state` label for "fail", "spare", "active" disks.
    - `node_md_is_active` is replaced by `node_md_state` with a state set of "active", "inactive", "recovering", "resync".
* Additional label `mountaddr` added to NFS device metrics to distinguish mounts from the same URL, but different IP addresses. prometheus#1417
* Metrics node_cpu_scaling_frequency_min_hrts and node_cpu_scaling_frequency_max_hrts of the cpufreq collector were renamed to node_cpu_scaling_frequency_min_hertz and node_cpu_scaling_frequency_max_hertz. prometheus#1510
* Collectors that are enabled, but are unable to find data to collect, now return 0 for `node_scrape_collector_success`.

* [CHANGE] Add `--collector.netdev.device-whitelist`. prometheus#1279
* [CHANGE] Ignore iso9660 filesystem on Linux prometheus#1355
* [CHANGE] Refactor mdadm collector prometheus#1403
* [CHANGE] Add `mountaddr` label to NFS metrics. prometheus#1417
* [CHANGE] Don't count empty collectors as success. prometheus#1613
* [FEATURE] New flag to disable default collectors prometheus#1276
* [FEATURE] Add experimental TLS support prometheus#1277, prometheus#1687, prometheus#1695
* [FEATURE] Add collector for Power Supply Class prometheus#1280
* [FEATURE] Add new schedstat collector prometheus#1389
* [FEATURE] Add FreeBSD zfs support prometheus#1394
* [FEATURE] Add uname support for Darwin and OpenBSD prometheus#1433
* [FEATURE] Add new metric node_cpu_info prometheus#1489
* [FEATURE] Add new thermal_zone collector prometheus#1425
* [FEATURE] Add new cooling_device metrics to thermal zone collector prometheus#1445
* [FEATURE] Add swap usage on darwin prometheus#1508
* [FEATURE] Add Btrfs collector prometheus#1512
* [FEATURE] Add RAPL collector prometheus#1523
* [FEATURE] Add new softnet collector prometheus#1576
* [FEATURE] Add new udp_queues collector prometheus#1503
* [FEATURE] Add basic authentication prometheus#1673
* [ENHANCEMENT] Log pid when there is a problem reading the process stats prometheus#1341
* [ENHANCEMENT] Collect InfiniBand port state and physical state prometheus#1357
* [ENHANCEMENT] Include additional XFS runtime statistics. prometheus#1423
* [ENHANCEMENT] Report non-fatal collection errors in the exporter metric. prometheus#1439
* [ENHANCEMENT] Expose IPVS firewall mark as a label prometheus#1455
* [ENHANCEMENT] Add check for systemd version before attempting to query certain metrics. prometheus#1413
* [ENHANCEMENT] Add a flag to adjust mount timeout prometheus#1486
* [ENHANCEMENT] Add new counters for flush requests in Linux 5.5 prometheus#1548
* [ENHANCEMENT] Add metrics and tests for UDP receive and send buffer errors prometheus#1534
* [ENHANCEMENT] The sockstat collector now exposes IPv6 statistics in addition to the existing IPv4 support. prometheus#1552
* [ENHANCEMENT] Add infiniband info metric prometheus#1563
* [ENHANCEMENT] Add unix socket support for supervisord collector prometheus#1592
* [ENHANCEMENT] Implement loadavg on all BSDs without cgo prometheus#1584
* [ENHANCEMENT] Add model_name and stepping to node_cpu_info metric prometheus#1617
* [ENHANCEMENT] Add `--collector.perf.cpus` to allow setting the CPU list for perf stats. prometheus#1561
* [ENHANCEMENT] Add metrics for IO errors and retries on Darwin. prometheus#1636
* [ENHANCEMENT] Add perf tracepoint collection flag prometheus#1664
* [ENHANCEMENT] ZFS: read contents of objset file prometheus#1632
* [ENHANCEMENT] Linux CPU: Cache CPU metrics to make them monotonically increasing prometheus#1711
* [BUGFIX] Read /proc/net files with a single read syscall prometheus#1380
* [BUGFIX] Renamed label `state` to `name` on `node_systemd_service_restart_total`. prometheus#1393
* [BUGFIX] Fix netdev nil reference on Darwin prometheus#1414
* [BUGFIX] Strip path.rootfs from mountpoint labels prometheus#1421
* [BUGFIX] Fix seconds reported by schedstat prometheus#1426
* [BUGFIX] Fix empty string in path.rootfs prometheus#1464
* [BUGFIX] Fix typo in cpufreq metric names prometheus#1510
* [BUGFIX] Read /proc/stat in one syscall prometheus#1538
* [BUGFIX] Fix OpenBSD cache memory information prometheus#1542
* [BUGFIX] Refactor textfile collector to avoid looping defer prometheus#1549
* [BUGFIX] Fix network speed math prometheus#1580
* [BUGFIX] collector/systemd: use regexp to extract systemd version prometheus#1647
* [BUGFIX] Fix initialization in perf collector when using multiple CPUs prometheus#1665
* [BUGFIX] Fix accidentally empty lines in meminfo_linux prometheus#1671

Signed-off-by: Ben Kochie <superq@gmail.com>

5 participants