Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve prospector state handling #2840

Merged
merged 1 commit into from
Oct 31, 2016
Merged

Conversation

ruflin
Copy link
Contributor

@ruflin ruflin commented Oct 25, 2016

Previously each prospector was holding old states initially loaded by the registry file. This was changed to that each prospector only loads the states with a path that matches the glob pattern. This reduces the number of states handled by each prospector and reduces overlap.

One consequence of this is that in case a log file "moves" from one prospector to an other through renaming during the runtime of filebeat, no state will be found. But this behaviour is not expected and would lead to other issues already now. The expectation is that a file stays inside the prospector over its full lifetime.

In case of a filebeat restart including a config change, it can happen that some states are not managed anymore by a prospector. These "obsolete" states will stay in the registrar until a further config change when a glob pattern would again include these states. As an example:

Filebeat is started with the following config.

filebeat.prospectors:
- paths: ['*.log']
  clean_removed: true

There is a log file a.log and b.log. Both files are harvested and states for a.log and b.log are persisted. Then filebeat is stopped and the config file is modified to.

filebeat.prospectors:
- paths: ['a.log']
  clean_removed: true

Filebeat is started again. The prospector will now only handle the state for a.log. In case b.log is removed from disk, the state for b.log will stay in the registry file.

In case the config file was with clean_inactive: 5min, all TTL are reset on restart to -2 by the registry and -1 the prospector or the new clean_inactive value. Using -2 in the registrar can be useful in the future to detect which states are not managed anymore by any prospector. As all TTL are reset on restart, persisting of the TTL is not required anymore but can become useful for cleanup so the information is kept in the registry file.

Further changes:

  • Add tests for matchFile method
  • Add tests for prospector state init filtering
  • states are passed to Prospector.Init on startup instead of setting it directly in the object
  • Prospector.Init and Prospectorer.Init now return an error and filebeat exits on startup problems
  • Remove lastClean as not needed anymore
  • Have one log file for each bom file test
  • Update to new vagrant box with Golang 1.7.3

@urso
Copy link

urso commented Oct 25, 2016

How about a dummy prospector handling files not handled by any prospector. This dummy prospector can check mod-time from time to time if file was updated at all and therefore update the TTL in registry. This way a file would be kept in registry between restarts in case file is not handled by any prospector due to accidental miss-configuration.

@ruflin
Copy link
Contributor Author

ruflin commented Oct 27, 2016

After some more internal conversations I try to summarise here the different options with its pros and cons:

Option 1: Keep obsolete states forever

For states which will not be picked up by a prospector, TTL is set to -1 which means these states are kept forever:

Pro

  • Old states are kept around. In case of misconfiguration and second restart of filebeat, states will still be there
  • Config changes with obsolete states mainly happening during test, probably rare during production

Con

  • Inode reuse can happen on restart with old state, but only if old and new file match the same prospector glob
  • clean_removed and clean_inactive both don't work on obsolete states

Option 2: Keep TTL for obsolete states

Obsolete states will be removed after their TTL from the registry file.

Pro:

  • Inactive states will still be removed after restart

Con:

  • This could lead to accidental removal of files
  • clean_removed still does not apply

Option 3: hidden prospector

A hidden prospector takes care of the obsolete states.

Pro

  • clean_removed still works

Con

  • Complexity
  • Additional config needed for clean_inactive for obsolete states?

Conclusion

Currently I would opt for option 1 as it seems to me the easiest to understand and the one with the least side affects. In case in the future there are problems with inode reuse there are potential solutions like have a flag on startup to clean up obslete states. But as inode reuse is normally local to a prospector, I don't expect this to happen.

@ruflin ruflin force-pushed the ttl-improvements branch 6 times, most recently from 7f90bde to 9f9fd99 Compare October 28, 2016 13:18
@ruflin ruflin removed discuss Issue needs further discussion. in progress Pull request is currently in progress. labels Oct 28, 2016
@ruflin ruflin force-pushed the ttl-improvements branch 3 times, most recently from 82879d2 to 33e2a8c Compare October 31, 2016 11:34
Previously each prospector was holding old states initially loaded by the registry file. This was changed to that each prospector only loads the states with a path that matches the glob pattern. This reduces the number of states handled by each prospector and reduces overlap.

One consequence of this is that in case a log file "moves" from one prospector to an other through renaming during the runtime of filebeat, no state will be found. But this behaviour is not expected and would lead to other issues already now. The expectation is that a file stays inside the prospector over its full lifetime.

In case of a filebeat restart including a config change, it can happen that some states are not managed anymore by a prospector. These "obsolete" states will stay in the registrar until a further config change when a glob pattern would again include these states. As an example:

Filebeat is started with the following config.

```
filebeat.prospectors:
- paths: ['*.log']
  clean_removed: true
```

There is a log file `a.log` and `b.log`. Both files are harvested and states for `a.log` and `b.log` are persisted. Then filebeat is stopped and the config file is modified to.

```
filebeat.prospectors:
- paths: ['a.log']
  clean_removed: true
```

Filebeat is started again. The prospector will now only handle the state for `a.log`. In case `b.log` is removed from disk, the state for `b.log` will stay in the registry file.

In case the config file was with `clean_inactive: 5min`, all TTL are reset on restart to `-2` by the registry and `-1` the prospector or the new `clean_inactive` value. Using `-2` in the registrar can be useful in the future to detect which states are not managed anymore by any prospector. As all TTL are reset on restart, persisting of the TTL is not required anymore but can become useful for cleanup so the information is kept in the registry file.

Further changes:

* Add tests for matchFile method
* Add tests for prospector state init filtering
* states are passed to `Prospector.Init` on startup instead of setting it directly in the object
* `Prospector.Init` and `Prospectorer.Init` now return an error and filebeat exits on startup problems
* Remove lastClean as not needed anymore
* Have one log file for each bom file test
* Update to new vagrant box with Golang 1.7.3
@tsg
Copy link
Contributor

tsg commented Oct 31, 2016

LGTM.

@tsg tsg merged commit 470d8b1 into elastic:master Oct 31, 2016
@ruflin ruflin deleted the ttl-improvements branch November 14, 2016 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants