-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix heartbeat races on event updates #6950
Conversation
bd65c6f
to
67e542e
Compare
event = common.MapStr{} | ||
event.DeepUpdate(fields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this with the else if
I asked myself what the else
case in the end which is not covered here actually means and if it can happen. So event is nil and err is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A runner can return an event, but must not. It's fully optional. We have intermediate runners (resolve IP address), that do not return an event, but create a set of continuation runners. So event
can be nil.
If a runner returns an error, no followup runners are created. => We need to create an error event. On error a runner can optionally return an event. The WithFields runner enforces the creation of an event on error, such that the already known fields (like IP resolve results) an can be merged into the final event
The WithFields wrapper adds shared fields only if an event will be returned.
To be more verbose, the conditions is:
if err != nil {
if event == nil {
event = common.MapStr{}
} else {
event = event.Clone()
}
event.DeepUpdate(fields)
} else { // no error
if event != nil {
event = event.Clone() // we need to copy the event to ensure no error are returned.
event.DeepUpdate(fields)
}
}
Due to the else if
, the second else branch is equivalent to event == nil && err != nil
.
The case event == nil && err == nil
is perfectly valid -> we just don't report an event.
I added the needs_backport and review label. |
@urso Can you rebase this on master. This should fix the LS CI failure. |
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races.
67e542e
to
38c9c4a
Compare
@ruflin rebased. |
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit e92ed3e)
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit e92ed3e)
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit e92ed3e)
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit e92ed3e)
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit bc225fb)
Heartbeat has a many components, each adding fields to an existing event. As events are created per 'TaskRun', this is normally fine. But in case the TaskRunner branches of into multiple Sub-tasks, we will see races on shared event structures. This is the case with multiple ports in the http/tcp configuration or the IPAll setting (ping all known IPs of a given domain). Namespaces in an event can be potentially populated via globally shared structures. This can also lead to unwanted updates or races. This fix clones the event structure before continuing event updates, so to remove the chance of races. (cherry picked from commit bc225fb)
Resolves: #6616
Heartbeat has a many components, each adding fields to an existing
event. As events are created per 'TaskRun', this is normally fine. But
in case the TaskRunner branches of into multiple Sub-tasks, we will see
races on shared event structures. This is the case with multiple ports
in the http/tcp configuration or the IPAll setting (ping all known IPs
of a given domain).
Namespaces in an event can be potentially populated via globally shared
structures. This can also lead to unwanted updates or races.
This fix clones the event structure before continuing event updates, so
to remove the chance of races.
I did run some tests before and after the change, with the race detector enabled. No more race have been found with this change.