plugins/system/disk.go diskusage queries all devices first, prunes later #440

Millnert · 2015-12-12T12:47:03Z

On systems that have NFS mounts under heavy load, the telegraf process locks up when using the system/disk plugin. Data gathering takes too long.

I have traced the issue to the code in plugins/system/disk.go, which appears to retrieve disk usage statistics for ALL filesystems first and only then prunes the retrieved data.

That's precisely the wrong order of operations for running telegraf defensively on loaded systems. I'm no Golang expert, but a quick search reveals many ways to query file system statistics for a specific mount point.
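As one illustration of a per-mount-point query, here is a minimal sketch using `syscall.Statfs` from the Go standard library. It assumes a Unix-like system, and `mountUsage` is a hypothetical helper name, not something taken from the telegraf code base.

```go
package main

import (
	"fmt"
	"syscall"
)

// mountUsage is a hypothetical helper: it queries ONE mount point and
// returns its total and free size in bytes. Unix-like systems only.
func mountUsage(path string) (total, free uint64, err error) {
	var st syscall.Statfs_t
	if err = syscall.Statfs(path, &st); err != nil {
		return 0, 0, err
	}
	// Field widths differ between platforms, so convert explicitly.
	bsize := uint64(st.Bsize)
	return uint64(st.Blocks) * bsize, uint64(st.Bfree) * bsize, nil
}

func main() {
	total, free, err := mountUsage("/")
	if err != nil {
		fmt.Println("statfs failed:", err)
		return
	}
	fmt.Printf("/: %d bytes total, %d bytes free\n", total, free)
}
```

A call like this can still block on an unresponsive NFS mount, which is why it matters that it is only issued for the mount points that were actually requested.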

So I propose instead that the algorithm be updated a bit at https://github.com/influxdb/telegraf/blob/master/plugins/system/disk.go#L30:

1. get all filesystems to a list
2. for each filesystem in the list:
2.1 if the filesystem path is in the mountpath list from config (or no such list is configured)
2.1.1 get information from this filesystem and append to a list of results with correct tags etc.
3. for each item in the list of results:
3.1 add the items to the accumulator object
4. return the accumulator object
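The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the actual plugin code: `usage`, `statFunc`, and `gatherUsage` are hypothetical names, and the per-filesystem query is passed in as a function so the filter-then-query order is easy to see.

```go
package main

import "fmt"

// usage is a hypothetical per-filesystem result; field names are illustrative.
type usage struct {
	Path  string
	Total uint64
	Used  uint64
}

// statFunc stands in for whatever call actually queries one mount point.
type statFunc func(path string) (usage, error)

// gatherUsage filters the mount list first and only then queries the
// filesystems that pass the filter. An empty wanted list means "all".
func gatherUsage(mounts, wanted []string, stat statFunc) ([]usage, error) {
	keep := make(map[string]bool, len(wanted))
	for _, w := range wanted {
		keep[w] = true
	}

	var results []usage
	for _, m := range mounts {
		if len(wanted) > 0 && !keep[m] {
			continue // skipped before any potentially blocking query
		}
		u, err := stat(m)
		if err != nil {
			return nil, fmt.Errorf("stat %s: %w", m, err)
		}
		results = append(results, u)
	}
	return results, nil
}

func main() {
	mounts := []string{"/", "/home", "/mnt/nfs"}
	queried := 0
	// fake replaces the real query and counts how often it is called.
	fake := func(path string) (usage, error) {
		queried++
		return usage{Path: path, Total: 100, Used: 40}, nil
	}

	results, err := gatherUsage(mounts, []string{"/", "/home"}, fake)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(results), "results from", queried, "queries")
}
```

With the filter applied before the query, the loaded `/mnt/nfs` mount in this example is never touched unless it is listed in the config.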

Millnert · 2015-12-12T12:56:45Z

I now noticed https://github.com/influxdb/telegraf/blob/master/plugins/system/ps.go#L70 - so perhaps it's this function that should get an optional "mountpoints" list argument instead.

sparrc · 2015-12-12T16:39:59Z

Sounds reasonable to me 👍


sparrc added the bug label Dec 12, 2015

sparrc added a commit that referenced this issue Jan 20, 2016

Filter mount points before stats are collected

f28a9a8

fixes #440

sparrc mentioned this issue Jan 20, 2016

Filter mount points before stats are collected #558

Merged

sparrc added a commit that referenced this issue Jan 20, 2016

Filter mount points before stats are collected

67bef34

fixes #440

sparrc added a commit that referenced this issue Jan 20, 2016

Filter mount points before stats are collected

fc1aa7d

fixes #440

sparrc closed this as completed in #558 Jan 20, 2016
