
Fix concurrent harvesters #2541

Merged: 2 commits into elastic:master on Sep 14, 2016
Conversation

@ruflin (Member) commented Sep 14, 2016

If newly started harvesters did not persist their first state before the next scan started, multiple harvesters could be started for the same file. This could be caused by a large number of files or by the output blocking.

The problem is solved by making the Setup step of the Harvester synchronous, so it blocks the scan. As part of this, the first state update is now done by the prospector.

The side effect of this change is that a scan now blocks when the channel is blocked, which usually means the output is not responding. While the output is unavailable, scans will not continue and new files will not be discovered.

The code can be further simplified in the future by merging create/startHarvester. This will be done in a second step to keep the backport commit to a minimum.

See also #2539

```go
ok := p.outlet.OnEvent(event)
if !ok {
    logp.Info("Prospector outlet closed")
    return
}
```

Change in logic here: this return used to stop the worker. With the return moved into updateState, the worker will not stop and may keep publishing incoming events. Instead, the worker should be dropped, report to the prospector that it needs to shut down, and drain the harvesterChan so that no harvester is blocked.

@ruflin (Member Author)

I added a bool return value to updateState so the loop can check whether the event was sent and return accordingly. The same is needed for the other callers of updateState.

@urso commented Sep 14, 2016

LGTM

@tsg tsg merged commit dc80b9c into elastic:master Sep 14, 2016
@ruflin ruflin deleted the paralallel-harvesters-fix branch September 14, 2016 17:41
@tsg tsg removed the needs_backport PR is waiting to be backported to other branches. label Sep 22, 2016