-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auditbeat: initial_scan
action for new paths
#7954
Conversation
A bit more detail: The new logic follows Option 2 laid out in a comment in the original issue. Before the scanner is called, the Metricset checks which paths from the config are not contained in the db and are therefore assumed to be new. It passes this list to the scanner which allows it to correctly mark new dirs and files it sees as either |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to take a deeper look at this today. Thanks for opening a PR!
@@ -177,6 +179,32 @@ func (ms *MetricSet) init(reporter mb.PushReporterV2) bool { | |||
return true | |||
} | |||
|
|||
// Determine which - if any - paths have been newly added to the config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments are great. I just want to encourage you to use the godoc format (even for private methods) which is to start the sentence with the name of the method (e.g. findNewPaths finds the new paths...
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer, I fixed it here and will try to remember going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor comment, otherwise LGTM.
} | ||
|
||
// NewFileSystemScanner creates a new EventProducer instance that scans the | ||
// configured file paths. | ||
func NewFileSystemScanner(c Config) (EventProducer, error) { | ||
func NewFileSystemScanner(c Config, newP map[string]bool) (EventProducer, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the function's comment to reflect the meaning of this parameter and maybe give it a more descriptive name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There's an issue with the CLA related to the e-mail address you used for this commits. I guess it can be fixed by associating this e-mail address with your Github account |
Thanks, fixed it. Test failed in CI that didn't fail locally, I'm not sure why. Looking into it. |
Found it: When the test runs, there's a race about whether the changes are found by the scanner or by filesystem notification (both are started at the same time). The scanner usually wins, and the actions it sets is what the test was originally written for. The filesystem reports slightly different actions (e.g. It's a bit of a shame though that the reported actions are not deterministic, and maybe this is something that we can normalize in the future. I've pushed a commit, let's hope the CI is good with it. It's a non-deterministic failure, so it's a bit tricky. I tested locally with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the solution you reached here. I left a few comments.
@@ -177,6 +179,32 @@ func (ms *MetricSet) init(reporter mb.PushReporterV2) bool { | |||
return true | |||
} | |||
|
|||
// findNewPaths determines which - if any - paths have been newly added to the config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a double space between paths have
. And it should have a period to end the sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
config Config | ||
log *logp.Logger | ||
config Config | ||
newPaths map[string]bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using a map
as a set data structure you can use map[string]struct{}
. To me this is more clear because it's saying that you don't care about the value and that when you use map you will only be checking whether the key exists (rather than that the key exists and its value is true
). Also it may save a few bytes.
m := map[string]struct{}{} // Init.
m[name] = struct{} // Add.
if _, exists := m[name]; exists { // Contains.
// Exists!
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice, I like that it takes 0 bytes - done.
continue | ||
} | ||
|
||
lastEvent, err := load(ms.bucket, evalPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like nice clear solution to the problem. 👍
@@ -47,6 +48,7 @@ var actionNames = map[Action]string{ | |||
Updated: "updated", | |||
Moved: "moved", | |||
ConfigChange: "config_change", | |||
Found: "found", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this flag indicates that the event was generated due to the initial scan? If so how about we call it initial_scan
rather than found
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense - changed it.
9db5594
to
2cb655c
Compare
found
action for new pathsinitial_scan
action for new paths
Thanks @andrewkroh - all implemented. Travis is unhappy, but it doesn't look related. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
9a647ef
to
a1b5e8a
Compare
@cwurm A few notes on your commit message:
|
When `scan_at_start: true` the filesystem scanner will look through all paths and record any new or modified files and directories it finds. This commit introduces a new action `initial_scan` that the scanner uses when it scans a path for the first time. Previously, the scanner would report everything under a new path as `created`, making it impossible to distinguish between files and directories that are truly new (e.g. have been added during a restart of Auditbeat) and those that are merely new because they are under a newly added path in the configuration. Resolves elastic#7821
a1b5e8a
to
de677c8
Compare
Thanks @andrewkroh - I've changed it to follow the format and made it more descriptive. Let me know if anything is still missing. |
Introduces a new
found
action that is used by the scanner on startup (whenscan_at_start: true
) to mark files found in configured paths that it has not seen before.Addresses #7821