Skip to content

Commit

Permalink
fix: restart the merge controllers on conflict
Browse files Browse the repository at this point in the history
Fixes #3861

What this change effectively does is that it changes immediate reconcile
request to an error return, so that controller will be restarted with a
backoff.

More details:

* root cause of the update/teardown conflict is that the finalizer is
still pending on the tearing down resource
* finalizer might not be removed immediately, e.g. if the controller
which put the finalizer is itself in the crash loop
* if the merge controller queues reconcile immediately, it restarts
itself, but the finalizer is still there, so it once again goes into
reconcile loop and that goes forever until the finalizer is removed, so
instead if the controller fails, it will be restarted with exponential
backoff lowering the load on the system

Change is validated with the unit-tests reproducing the conflict.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
  • Loading branch information
smira authored and talos-bot committed Jun 30, 2021
1 parent 60d7360 commit 7f8e50d
Show file tree
Hide file tree
Showing 3 changed files with 27 additions and 12 deletions.
13 changes: 9 additions & 4 deletions internal/app/machined/pkg/controllers/network/address_merge.go
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,8 @@ func (ctrl *AddressMergeController) Run(ctx context.Context, r controller.Runtim
addresses[id] = address
}

conflictsDetected := 0

for id, address := range addresses {
address := address

Expand All @@ -97,12 +99,11 @@ func (ctrl *AddressMergeController) Run(ctx context.Context, r controller.Runtim
return nil
}); err != nil {
if state.IsPhaseConflictError(err) {
logger.Debug("conflict detected", zap.String("id", id))
// phase conflict, resource is being torn down, skip updating it and trigger reconcile
// later by failing the
conflictsDetected++

delete(addresses, id)

// trigger another reconcile
r.QueueReconcile()
} else {
return fmt.Errorf("error updating resource: %w", err)
}
Expand Down Expand Up @@ -131,5 +132,9 @@ func (ctrl *AddressMergeController) Run(ctx context.Context, r controller.Runtim
}
}
}

if conflictsDetected > 0 {
return fmt.Errorf("%d conflict(s) detected", conflictsDetected)
}
}
}
13 changes: 9 additions & 4 deletions internal/app/machined/pkg/controllers/network/link_merge.go
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ func (ctrl *LinkMergeController) Run(ctx context.Context, r controller.Runtime,
}
}

conflictsDetected := 0

for id, link := range links {
link := link

Expand All @@ -111,12 +113,11 @@ func (ctrl *LinkMergeController) Run(ctx context.Context, r controller.Runtime,
return nil
}); err != nil {
if state.IsPhaseConflictError(err) {
logger.Debug("conflict detected", zap.String("id", id))
// phase conflict, resource is being torn down, skip updating it and trigger reconcile
// later by failing the
conflictsDetected++

delete(links, id)

// trigger another reconcile
r.QueueReconcile()
} else {
return fmt.Errorf("error updating resource: %w", err)
}
Expand Down Expand Up @@ -147,5 +148,9 @@ func (ctrl *LinkMergeController) Run(ctx context.Context, r controller.Runtime,
}
}
}

if conflictsDetected > 0 {
return fmt.Errorf("%d conflict(s) detected", conflictsDetected)
}
}
}
13 changes: 9 additions & 4 deletions internal/app/machined/pkg/controllers/network/route_merge.go
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,8 @@ func (ctrl *RouteMergeController) Run(ctx context.Context, r controller.Runtime,
routes[id] = route
}

conflictsDetected := 0

for id, route := range routes {
route := route

Expand All @@ -97,12 +99,11 @@ func (ctrl *RouteMergeController) Run(ctx context.Context, r controller.Runtime,
return nil
}); err != nil {
if state.IsPhaseConflictError(err) {
logger.Debug("conflict detected", zap.String("id", id))
// phase conflict, resource is being torn down, skip updating it and trigger reconcile
// later by failing the
conflictsDetected++

delete(routes, id)

// trigger another reconcile
r.QueueReconcile()
} else {
return fmt.Errorf("error updating resource: %w", err)
}
Expand Down Expand Up @@ -131,5 +132,9 @@ func (ctrl *RouteMergeController) Run(ctx context.Context, r controller.Runtime,
}
}
}

if conflictsDetected > 0 {
return fmt.Errorf("%d conflict(s) detected", conflictsDetected)
}
}
}

0 comments on commit 7f8e50d

Please sign in to comment.