Fix race in monitoring output #8646

Merged (3 commits), Oct 19, 2018

@urso urso commented Oct 18, 2018

The action part of the bulk request used to be a constant (declared with
`var` for initialization purposes only), but since 6.4 we have had to set
the `index._type` field per monitoring type.
With the introduction of telemetry we ended up with two publisher
pipelines and two outputs, each with slightly different parameters, yet
the action part remained a shared global variable that each reporter
modified. If metrics and telemetry publish events at the same time,
modifying the global races with serializing the bulk request.

This change removes the global and creates an action item per event,
so that nothing is shared.
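
A minimal sketch of the per-event approach described above, not the code from this PR. The `reporter` type, its `docType` field, and the `actionFor` and `bulkBody` helpers are hypothetical names chosen for illustration, and plain maps stand in for whatever types libbeat actually uses. The point it shows is that each event gets a freshly allocated action item, so two reporters running concurrently never write to the same map.

```go
package main

import (
	"fmt"
	"sync"
)

// reporter is a hypothetical stand-in for one monitoring output.
// docType plays the role of the per-reporter `index._type` value.
type reporter struct {
	docType string
}

// actionFor allocates a new action item on every call. Because no map is
// reused between calls, there is nothing for concurrent reporters to race on.
func (r reporter) actionFor() map[string]interface{} {
	return map[string]interface{}{
		"index": map[string]interface{}{
			"_type": r.docType,
		},
	}
}

// bulkBody pairs every event with its own action item, giving the
// action/source alternation that a bulk request body uses.
func (r reporter) bulkBody(events []string) []interface{} {
	body := make([]interface{}, 0, 2*len(events))
	for _, ev := range events {
		body = append(body, r.actionFor(), ev)
	}
	return body
}

func main() {
	// Placeholder type names; the real values are not taken from the PR.
	reporters := []reporter{{docType: "type_a"}, {docType: "type_b"}}
	results := make([][]interface{}, len(reporters))

	var wg sync.WaitGroup
	for i, r := range reporters {
		wg.Add(1)
		go func(i int, r reporter) {
			defer wg.Done()
			// Each goroutine writes only to its own slot in results.
			results[i] = r.bulkBody([]string{"event-1", "event-2"})
		}(i, r)
	}
	wg.Wait()

	for i, body := range results {
		fmt.Println(reporters[i].docType, len(body))
	}
}
```

Running a sketch like this with `go run -race` is one way to check that no shared state is left; the cost is one small allocation per event.
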
@urso urso added the bug, libbeat, and needs_backport labels Oct 18, 2018
@urso urso requested a review from ruflin October 18, 2018 12:48
@ph ph left a comment

LGTM, good catch.

We have fixed a few issues recently with shared maps or values; I will pay more attention to that specific case in reviews. With the addition of more processors with caching in Beats, we have to take extra care.

ph commented Oct 18, 2018

`make check` is failing on that PR.

@ph ph left a comment

Failure is legit.

@ruflin ruflin left a comment

Code LGTM but probably a naming conflict?

@ruflin ruflin left a comment

WFG

@ruflin ruflin left a comment

Thanks for the changelog, forgot about it.

@urso urso merged commit 1438000 into elastic:master Oct 19, 2018
@urso urso added the v6.5.0 label and removed the needs_backport label Oct 19, 2018
urso pushed a commit to urso/beats that referenced this pull request Oct 19, 2018
(cherry picked from commit 1438000)
@urso urso added the v6.4.3 label Oct 19, 2018
urso pushed a commit to urso/beats that referenced this pull request Oct 19, 2018
(cherry picked from commit 1438000)
urso pushed a commit that referenced this pull request Oct 19, 2018
Cherry-pick of PR #8646 to 6.x branch.
urso pushed a commit that referenced this pull request Oct 19, 2018
Cherry-pick of PR #8646 to 6.4 branch.
@urso urso deleted the monitoring-race-global branch February 19, 2019 18:55
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…ic#8659)

Cherry-pick of PR elastic#8646 to 6.4 branch.