Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

macOS recursive file monitoring #5575

Merged
merged 5 commits into from
Nov 28, 2017

Conversation

adriansr
Copy link
Contributor

@adriansr adriansr commented Nov 14, 2017

This patch adds a new file monitoring backend to the integrity module, using the FSEvents facility from macOS.

Coalesced events

FSEvents uses a bitmask to notify for more than one event on the same path. For example created|modified|deleted. Ordering information is lost, so it's impossible to tell from
the event itself if the file was modified, then deleted and then re-created, or first created,
then modified and then deleted. The ordering in which the events are stored in Elasticsearch
seems of little importance as they usually get assigned the same timestamp (granularity of one millisecond).

This patch orders the set of actions in a single event to be meaningful depending if the file existed in the beat database and if it exists anymore at the moment of processing the event.

#5421

@adriansr adriansr force-pushed the auditbeat/recursive_macos branch from 0df4724 to 00938b9 Compare November 14, 2017 05:59
@adriansr adriansr requested a review from andrewkroh November 14, 2017 12:59
@adriansr adriansr added the in progress Pull request is currently in progress. label Nov 15, 2017
Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this out on my machine and your approach works well.

However I found that due to the diff'ing Auditbeat does against the last known state that when multiple actions are coalesced (e.g. changed and attributes_modified) that Auditbeat only sends a single event with action=updated. Since the two event are basically clones with different Actions the second event gets discarded. See hasFileChangedSinceLastEvent() for details on what happens when it diffs the events.

You have updated Action to be a bitmask such that it can represent all the changes that have occurred. So instead of pushing one event for each individual Action from fsreader I say we send one event with multiple Actions. I think the only time we need two events is when the file has been deleted and then recreated. Your logic for getting the ordering correct between these two events seems good.

Then in the events that Auditbeat outputs the action field should be changed to be an array so that we can say, for example, that a file was updated and its attributes were modified or a file was updated and deleted.

Since we are changing the action field I think diffEvents needs to be updated for consistency. Currently it only reports a single Action. Say for example that a file is updated and chmod'ed while Auditbeat is stopped. When Auditbeat restarts the scanner should generate an event with action: [updated, attributes_modified].

And I think we should make the change to always use FSEvent on darwin (for recursive and non-recursive). When in non-recursive mode the implementation will need to drop events for files that aren't in the configured dir.

// into a single notification. This might be useful to reduce load in
// some scenarios so that it deserves its own configuration flag.
Latency: 0,
Flags: fsevents.FileEvents | fsevents.WatchRoot,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should IgnoreSelf be added to prevent Auditbeat from getting into some kind of feedback loop (like if someone decided to monitor /var/lib/auditbeat where the database is maintained)?

func (r *fsreader) Start(done <-chan struct{}) (<-chan Event, error) {
r.stream.Start()
go r.consumeEvents(done)
logp.Info("%v started fsevents recursive watched", logPrefix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/watched/watch/?

@adriansr adriansr force-pushed the auditbeat/recursive_macos branch from 00938b9 to 0cb6575 Compare November 23, 2017 01:17

func (action Action) isMultiple() bool {
// checks if there is more than one bit set
return action != 0 && (action&(action-1)) != 0
Copy link
Member

@andrewkroh andrewkroh Nov 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bits.OnesCount8 function that is supposed to be optimized.

if err != nil {
return 0, errors.Wrap(err, fmt.Sprintf("failed to stat"))
}
meta, err := NewMetadata(path, info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that you need only the inode and file type I would use the os.FileInfo directly to avoid some of the extra work that NewMetadata does (like lookup username and group).

@@ -19,6 +20,10 @@ type reader struct {

// NewEventReader creates a new EventProducer backed by fsnotify.
func NewEventReader(c Config) (EventProducer, error) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop this newline.

@@ -211,7 +180,7 @@ func (e *Event) String() string {
return string(data)
}

func buildMapStr(e *Event) common.MapStr {
func buildMapStr(e *Event, existsBefore, existsNow bool) common.MapStr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could existsNow be determined by looking at e? Then you wouldn’t need the third param.

This patch adds a new file monitoring to the integrity module that
uses the FSEvents facility from macOS. Uses fsnotify's implementation
of fsevents library. Non-recursive behavior is emulated by filtering events.
@adriansr adriansr force-pushed the auditbeat/recursive_macos branch from 5706b78 to 93462ea Compare November 25, 2017 13:54
@andrewkroh
Copy link
Member

jenkins, test it

@andrewkroh
Copy link
Member

This looks pretty complete. @adriansr, should the "in progress" be removed?

@andrewkroh
Copy link
Member

andrewkroh commented Nov 27, 2017

The way Action is persisted is going to need to be updated. Or perhaps there is no use in persisting this value?

switch e.Action {
case AttributesModified:
schema.EventAddAction(b, schema.ActionAttributesModified)
case Created:
schema.EventAddAction(b, schema.ActionCreated)
case Deleted:
schema.EventAddAction(b, schema.ActionDeleted)
case Updated:
schema.EventAddAction(b, schema.ActionUpdated)
case Moved:
schema.EventAddAction(b, schema.ActionMoved)
case ConfigChange:
schema.EventAddAction(b, schema.ActionConfigChanged)
}

It could be defined as enum Action : ubyte (bit_flags) { in the schema and the compiler will generate values based on power of two.

@adriansr
Copy link
Contributor Author

Ok I updated the flatbuffers to persist the action as a bitmask, although the field is not used.

@andrewkroh andrewkroh merged commit 37bfd85 into elastic:master Nov 28, 2017
@monicasarbu monicasarbu removed the in progress Pull request is currently in progress. label Dec 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants