-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Include individual resource sync errors in Sync events #970
Conversation
cluster/sync.go
Outdated
Error error | ||
} | ||
|
||
type SyncErrors []SyncError |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
event/event.go
Outdated
@@ -204,6 +204,8 @@ type SyncEventMetadata struct { | |||
// policy changes, and "other" (meaning things we didn't commit | |||
// ourselves) | |||
Includes map[string]bool `json:"includes,omitempty"` | |||
// Per-resource errors | |||
Errors map[string]string `json:"errors,omitempty"` |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
cluster/manifests.go
Outdated
@@ -21,7 +21,7 @@ type Manifests interface { | |||
// spec given. | |||
UpdateDefinition(def []byte, container string, newImageID image.Ref) ([]byte, error) | |||
// Load all the resource manifests under the path given | |||
LoadManifests(paths ...string) (map[string]resource.Resource, error) | |||
LoadManifests(base, first string, rest ...string) (map[string]resource.Resource, error) |
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
This comment was marked as abuse.
This comment was marked as abuse.
Sorry, something went wrong.
93bfd7c
to
a141155
Compare
PTAL @Sambooo |
Before we had an interface representing resources, they were generally passed around as a pair of (id string, definition []byte). We can cut down on the number of representations of resources by just using `resource.Resource`. This includes returning synchronisation errors with the whole resource, rather than simply by name. Doing so means we will be able to better surface sync problems with individual resources.
Include an error report in the sync notification, with resources v. errors (with the latter coming from `kubectl` most likely). The most useful bit of information when a resource fails to sync -- more useful than the error from kubectl, even -- is the file that had a problem. Include that in the notification. Secondarily: to avoid having a long, tmpfile path in messages, make the source of resources relative to the repo.
Instead of using a map, which forces us to give a glommed-together string as the key (so that it can be JSONed), use a slice of structs with the unglommed data.
a141155
to
cdca842
Compare
Newer flux daemons (since fluxcd/flux#970) report any sync failures for individual resources. Take advantage of this by including the errors in sync notifications.
A sync can be a partial success, in that we skip lightly over individual resources failing to be applied. This means that bogus manifests don't get noticed, and the symptoms can be difficult to pin down if you are used to thinking
cluster == git
.This PR surfaces those errors by including them in the Sync event, as described in #756. (It fixes #756, in fact.)
There are other measures we could take, e.g., annotating offending commits (maybe? we may not be able to pin it down); keeping a gauge of sync failures which can be alerted on. But this is a fair start.