Metricbeat: Remove duplicated filesystems in system module #6819
// If a block device is mounted multiple times (e.g. with bind mounts),
// store it only once, and use the shorter mount point path.
if seen, found := devices[fs.DevName]; found {
This is not completely accurate: there can be different device paths that really point to the same block device, but I don't believe this is common at all. To handle this we should check for duplicates by device identifier (Major:Minor), but this is not supported in gosigar yet. Is it worth implementing this?
I would leave it as you have it for now, but potentially follow up with a change in gosigar to make it possible, if it is a case that can happen (more than once :-)). @andrewkroh WDYT?
If it will help make Metricbeat more accurate then I'm +1.
I will take a look at this in gosigar; meanwhile I think we are good enough with the checks based on device name.
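A minimal sketch of the device-name check discussed in this thread, assuming a hypothetical `mount` type with only the `DevName` and `DirName` fields (gosigar's real structure has more) and a hypothetical helper `dedupByDevice`. It is not the code from this PR, only an illustration of keeping one entry per device name and preferring the shorter mount point.

```go
package main

import "fmt"

// mount is a hypothetical stand-in for a filesystem entry; only the
// two fields mentioned in the review are modelled here.
type mount struct {
	DevName string // device path, e.g. /dev/sda1
	DirName string // mount point, e.g. /home
}

// dedupByDevice keeps one mount per device name. When a device name
// shows up again, the entry with the shorter mount point path wins.
// The order of first appearance is preserved.
func dedupByDevice(mounts []mount) []mount {
	index := make(map[string]int) // device name -> position in result
	var result []mount
	for _, m := range mounts {
		if i, found := index[m.DevName]; found {
			if len(m.DirName) < len(result[i].DirName) {
				result[i] = m
			}
			continue
		}
		index[m.DevName] = len(result)
		result = append(result, m)
	}
	return result
}

func main() {
	mounts := []mount{
		{DevName: "/dev/sda1", DirName: "/var/lib/data"},
		{DevName: "/dev/sda1", DirName: "/data"}, // same device, shorter path
		{DevName: "/dev/sdb1", DirName: "/home"},
	}
	for _, m := range dedupByDevice(mounts) {
		fmt.Printf("%s on %s\n", m.DevName, m.DirName)
	}
}
```

As noted in the thread, a check like this still misses two different device paths that point to the same block device; comparing Major:Minor identifiers would be needed for that.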
b304e19 to 9c8173b
Please don't merge this. I have just seen that gosigar obtains the list of filesystems from mtab, which doesn't include the device name for bind mounts.
I like the idea of having some defaults for `filesystem.ignore_types`. Someone could still set it to `filesystem.ignore_types: [ ]` if they want all types.
},
},
{
description: "Don't repeat devices, sortest of dir names should be used",
nit: shortest, also in all the ones below
:D
On Linux we could maybe ignore by default all types marked as
@jsoriano I'm ok with excluding
9c8173b to cf2673f
Two more changes added:
Testing it on a Linux machine, it looks much better now with the default settings.
@@ -32,6 +34,13 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
return nil, err
}

if len(config.IgnoreTypes) == 0 {
The behaviour I would have expected is that if a user sets `filesystem.ignore_types: []`, then no file systems would be ignored. But checking for a length of 0 means that the defaults also apply in this case. Could we instead check if the config is set?
If we go with the above suggestion, the reference config file should be adjusted to contain the two values.
Oh, right, checking for nil instead of zero length. I have also added a comment in the reference config.
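To illustrate the difference between the two checks: in Go a nil slice and an empty slice both have length 0, but only the nil one compares equal to `nil`. The sketch below uses a hypothetical `config` struct and a hypothetical `resolveIgnoreTypes` helper, and it assumes that config unpacking leaves `IgnoreTypes` nil when the option is absent and non-nil when it is set to `[]`. That assumption is not verified here.

```go
package main

import "fmt"

// config is a hypothetical, simplified version of the metricset config.
type config struct {
	IgnoreTypes []string
}

// resolveIgnoreTypes applies the defaults only when the option was never
// set (nil slice). An explicitly empty list is returned as is, so the
// user can opt out of the defaults.
func resolveIgnoreTypes(c config, defaults []string) []string {
	if c.IgnoreTypes == nil {
		return defaults
	}
	return c.IgnoreTypes
}

func main() {
	defaults := []string{"proc", "sysfs"} // example values only

	unset := config{}                              // option not present
	empty := config{IgnoreTypes: []string{}}       // ignore_types: []
	custom := config{IgnoreTypes: []string{"nfs"}} // ignore_types: [nfs]

	fmt.Println(len(resolveIgnoreTypes(unset, defaults)))
	fmt.Println(len(resolveIgnoreTypes(empty, defaults)))
	fmt.Println(resolveIgnoreTypes(custom, defaults))
}
```

With a `len(...) == 0` check, the `unset` and `empty` cases would be indistinguishable, which is the behaviour questioned in the review.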
cf2673f to fe6c17c
// If the device name is a directory, this is a bind mount or nullfs,
// don't count it as it'd be counting again its parent filesystem.
devFileInfo, err := os.Stat(fs.DevName)
One of the original reasons we added `ignore_types` was to avoid doing a `statfs` on autofs filesystems. Metricbeat was causing autofs mounts to never be unmounted because its continuous metric gathering was making autofs think the mount was being accessed.
Will this cause the same problem?
Good point.
Here the stat is done against the device name; in #4823 the mentioned issue seemed to appear when doing statfs on the mount point (`DirName`). So I don't think it can cause the same problem.
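A rough sketch of the kind of check being discussed, using a hypothetical helper `deviceIsDir`. It calls `os.Stat` on the device name only, never on the mount point, and treats any error as "not a directory". This is an illustration, not the code in the PR.

```go
package main

import (
	"fmt"
	"os"
)

// deviceIsDir reports whether the device name of a mount is itself a
// directory, which is what a bind mount or nullfs entry looks like.
// Device names that cannot be stat'ed are reported as not being
// directories. Simplification: relative names are resolved against
// the working directory; real code may want to accept only absolute
// paths.
func deviceIsDir(devName string) bool {
	info, err := os.Stat(devName)
	if err != nil {
		return false
	}
	return info.IsDir()
}

func main() {
	// A directory used as "device", as in a bind mount.
	fmt.Println(deviceIsDir(os.TempDir()))
	// A path that is assumed not to exist on the machine running this.
	fmt.Println(deviceIsDir("/definitely/not/a/real/device/path"))
}
```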
fe6c17c to 5427e7a
LGTM
jenkins, test it
5427e7a to a0acdd5
@jsoriano I think the failing Windows tests are related to this change.
a0acdd5 to e608ad9
Removing filtering on Windows
Something went wrong in the helper:
e608ad9 to fb6a7bd
jenkins, test again please
Mount points whose device name is the same absolute path are considered to be the same and are counted only once. The directory name used is the shorter of them. This avoids duplication of filesystems, as can happen with bind mounts in Linux (elastic#3384).
In the system module, when no filesystem type is set to be ignored, it tries to make a sane guess. On systems with a /proc/filesystems file, it ignores all devices marked as `nodev`.
fb6a7bd to 0222403
Mount points whose device name is the same absolute path are considered to be the same and are counted only once. The directory name used is the shorter of them. If the device of a filesystem is a directory, it is considered some kind of bind mount and is ignored. This avoids duplication of filesystems, as can happen with bind mounts in Linux (#3384). This mimics the behaviour of the `df` command.
There are still filesystems that can be counted twice if union filesystems are used (such as aufs or overlay), but this can be controlled with the `filesystem.ignore_types` option. Maybe we should blacklist them by default.
Update: For that, if the option is left empty, we fill the list of ignored types with all `nodev` devices in `/proc/filesystems` where this file exists.
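As a sketch of how the `nodev` defaults could be derived: `/proc/filesystems` lists one filesystem type per line, with `nodev` in the first column for types that are not backed by a block device. The hypothetical helper `nodevTypes` below parses content in that format. It works on a string so that it can be tried with sample data; reading the real file and handling its absence are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// nodevTypes returns the filesystem types flagged as "nodev" in content
// that follows the /proc/filesystems layout: an optional "nodev" column
// followed by the type name.
func nodevTypes(content string) []string {
	var types []string
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "nodev" {
			types = append(types, fields[1])
		}
	}
	return types
}

func main() {
	// Sample data in the /proc/filesystems format, not read from a host.
	sample := "nodev\tsysfs\nnodev\tproc\n\text4\nnodev\ttmpfs\n"
	fmt.Println(nodevTypes(sample))
}
```

Types without the `nodev` flag, such as `ext4` in the sample, are left out of the result and so would still be reported.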