Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Alerting & Actions] Overwrite SOs when updating instead of partially updating #73688

Merged
merged 60 commits into from
Sep 18, 2020

Conversation

gmmorris
Copy link
Contributor

@gmmorris gmmorris commented Jul 29, 2020

Summary

This PR changes the Alerts & Actions clients to ensure they require full updates (rather than partial) to SOs and overwrites the entire document when making the update.
This is to prevent the situation where nested objects get merged instead of replaced when a user makes an update.

We also enhanced the EncryptedSavedObjectsClient to allow specified ids when overwriting an existing object.

closes #71995

Checklist

Delete any items that are not applicable to this PR.

For maintainers

@gmmorris gmmorris added Feature:Actions Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.10.0 v8.0.0 release_note:fix labels Jul 29, 2020
@gmmorris
Copy link
Contributor Author

@elasticmachine merge upstream

elasticmachine and others added 5 commits July 29, 2020 11:49
…ibana into actions/webhook-remove-header

* 'actions/webhook-remove-header' of github.com:gmmorris/kibana: (86 commits)
  [maps] rename GisMap to MapContainer and convert to TS (elastic#73690)
  [APM] docs: remove watcher documentation  (elastic#73485)
  [Maps] fix fit to data for Point to Point layer (elastic#73563)
  [Metrics UI] Fix No Data in Inventory alerts/Snapshot API (elastic#72513)
  [ML] Disabling ML if license feature is disabled (elastic#73187)
  [ML] Fixing old _xpack style es endpoint paths (elastic#73667)
  [DOCS] [Lens] 7.9 docs refresh (elastic#72301)
  [ML] DF Analytics results: ensure `View` link is only enabled when job has successfully completed (elastic#73539)
  Set timeRange to default to trigger the error message (elastic#73629)
  [ML] Functional tests - stabilize DFA navigation and index pattern handling (elastic#73660)
  [ILM] Add links to "Snapshot and Restore" from ILM "wait for snapshot policy" (elastic#72473)
  [kbn-storybook] Update Storybook to 5.3.19 (elastic#73320)
  [Metrics UI] Fix hasData call to ensure it has data not just indices (elastic#72969)
  [Uptime] Use `service.name` to link from Uptime -> APM where available (elastic#73618)
  allow others to update `URL.revokeObjectURL` property if needed (elastic#73639)
  regen docs (elastic#73650)
  [Visualize] Fix inspector download filename issue when saving in-place (elastic#72605)
  [Data] Query Input String manager (elastic#72093)
  [Security Solutions] Add tooltips (elastic#73436)
  Do not render descriptionless actions within an EuiCard (elastic#73611)
  ...
@gmmorris gmmorris marked this pull request as ready for review July 30, 2020 16:45
@gmmorris gmmorris requested review from a team as code owners July 30, 2020 16:45
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@legrego legrego requested a review from azasypkin July 30, 2020 19:44
@azasypkin
Copy link
Member

azasypkin commented Jul 31, 2020

ACK: will review today or on Monday at the latest.

* master: (54 commits)
  [ML] Migrate to React BrowserRouter and Kibana provided History. (elastic#71941)
  [Discover] Improve  saveSearch functional test handling (elastic#73626)
  [Metrics UI] Fix all threshold alert conditions disappearing due to alert prefill (elastic#73708)
  [Metrics UI] Fix alert previews of ungrouped alerts (elastic#73735)
  [SIEM] Fixes "include building block button" to operate (elastic#73900)
  [Metrics UI] Fix alert management to open without refresh (elastic#73739)
  [Security Solution][Lists] - Tests cleanup and remove unnecessary import (elastic#73865)
  [Ingest Management] main branch uses epr-snapshot. Others production (elastic#73555)
  [Canvas][tech-debt] Fix SVG not shrinking vertically properly (elastic#73867)
  [Maps] upgrade turf (elastic#73816)
  [Security Solution][Telemetry] Concurrent telemetry requests (elastic#73558)
  [Security Solution][Exceptions] - Update how nested entries are displayed in exceptions viewer (elastic#73745)
  [Security Solution][Exceptions] Adds autocomplete workaround for .text fields (elastic#73761)
  [Metrics UI] Fix previewing of No Data results (elastic#73753)
  Closes elastic#72914 by hiding anomaly detection settings links when the ml plugin is disabled. (elastic#73638)
  [Ingest Manager] Fix config selection in enrollment flyout from config list page (elastic#73833)
  [DOCS] Fixes typo in Alerting actions (elastic#73756)
  [APM] fixes linking errors to ML and Discover (elastic#73758)
  Handle promise rejections when building artifacts (elastic#73831)
  [Security Solution][Detections] Change from sha1 to sha256 (elastic#73741)
  ...
Copy link
Member

@azasypkin azasypkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a couple of comments.

@@ -200,7 +200,7 @@ export class EncryptedSavedObjectsClientWrapper implements SavedObjectsClientCon
public async update<T>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is there any reason you didn't switch ESO bulkUpdate to bulkCreate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha - so to be honest, I missed it 😬

But - this is actually a problem.
We can't seem to change the deep Partial<T> to be T in the bulkUpdate signature (a problem we didn't have with update... I think because it isn't contained inside of an Array so the downstream type is a subset of the one upstream).

This means - we can't actually change bulkUpdate to require a T instead of Partial<T> without changing ESO to not implement SavedObjectsClientContract.

Any thoughts? @legrego as well? :/

Copy link
Contributor

@mikecote mikecote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've refrained from doing this type of change to ESO before because of the optimistic concurrency control that the alerts currently depend on (passing the version into the update statement and fail if the alert changed since it was last read). I don't think it would be a good idea to let that go as we rely more and more on optimistic concurrency control for updating new attributes down the line (state, status, etc).

Most problems with AAD are planned to be fixed with custom code in this issue: #67290 (comment). It seems another case that approach would solve?

@gmmorris
Copy link
Contributor Author

gmmorris commented Aug 5, 2020

We've refrained from doing this type of change to ESO before because of the optimistic concurrency control that the alerts currently depend on (passing the version into the update statement and fail if the alert changed since it was last read). I don't think it would be a good idea to let that go as we rely more and more on optimistic concurrency control for updating new attributes down the line (state, status, etc).

I spoke to Platform about this last week and we agreed to add version to the create API before merging this.

Most problems with AAD are planned to be fixed with custom code in this issue: #67290 (comment). It seems another case that approach would solve?

This won't address the problem we have in other places where we don't know what the shape is - such as the state of an alert, the config/params of actions and alerts provided by solutions etc.

Seems to me adding version to SavedObjects and using that to address the problem both when using ESO and in places where we use SO (such as Task Manager), is a better investment, no?

@gmmorris gmmorris marked this pull request as draft August 5, 2020 10:47
* master: (115 commits)
  [Logs UI] Correct trial period duration in anomaly splash screen (elastic#74249)
  [Discover] Inline noWhiteSpace function (elastic#74331)
  [DOCS] Add Observability topic (elastic#73041)
  skip flaky suite (elastic#74327)
  [Security Solution][Detections] Fixes Severity Override not matching for Elastic Endpoint Security rule (elastic#74317)
  [Security Solution][Exceptions] - Fixes exceptions builder nested deletion issue and adds unit tests (elastic#74250)
  Fixed Alert details does not update page title and breadcrumb (elastic#74214)
  [src/dev/build] build Kibana Platform bundles from source (elastic#73591)
  [Reporting] Shorten asset path to help CLI FS Watcher (elastic#74185)
  Fix TMS not loaded in legacy maps (elastic#73570)
  [Security Solution] styling for notes' panel (elastic#74274)
  [Security Solution][Tech Debt] cleans up ts-ignore issues and some smaller linter issues  (elastic#74268)
  Make the actions plugin support generics (elastic#71439)
  [Security Solution] Keep original note creator (elastic#74203)
  [CI] Fix xpack kibana build dir in xpack visual regression script
  [CI] Fix baseline_capture job by adding parallel process number back
  [Monitoring] Ensure setup mode works on cloud but only for alerts (elastic#73127)
  [Maps] Custom color ramps should show correctly on the map for mvt layers (elastic#74169)
  [kbn/optimizer] remove unused modules (elastic#74195)
  [CI] Add pipeline task queue framework and merge workers into one (elastic#71268)
  ...
@mikecote
Copy link
Contributor

mikecote commented Aug 5, 2020

I spoke to Platform about this last week and we agreed to add version to the create API before merging this.

Seems to me adding version to SavedObjects and using that to address the problem both when using ESO and in places where we use SO (such as Task Manager), is a better investment, no?

Ah, I didn't know version actually worked with the index API, seems like it does! \o/

This approach may solve #67290 at the same time then. The issue was looking for alternatives based on the wrong assumption (that the index API doesn't support version).

@azasypkin
Copy link
Member

ACK: looking...

Copy link
Member

@azasypkin azasypkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review only: LGTM. Thanks!

* master: (71 commits)
  [Lens] Show 'No data for this field' for empty field in accordion (elastic#73772)
  Skip failing lens test
  Configure ScopedHistory consistenty regardless of URL used to mount app (elastic#75074)
  Fix returned payload by "search" usage collector (elastic#75340)
  [Security Solution] Fix missing key error (elastic#75576)
  Upgrade EUI to v27.4.1 (elastic#75240)
  Update datasets UI copy to data streams (elastic#75618)
  [Lens] Register saved object references (elastic#74523)
  [DOCS] Update links to Beats documentation (elastic#70380)
  [Enterprise Search] Convert our `public_url` route to `config_data` and collect initialAppData (elastic#75616)
  [Usage Collection Schemas] Remove Legacy entries (elastic#75652)
  [Dashboard First] Lens Originating App Breadcrumb (elastic#75470)
  Improve login UI error message. (elastic#75642)
  [Security Solution] modify circular deps checker to output images of circular deps graphs (elastic#75579)
  [Data Telemetry] Add index pattern to identify "meow" attacks (elastic#75163)
  Migrate CSP usage collector to `kibana_usage_collection` plugin (elastic#75536)
  [Console] Get ES Config from core (elastic#75406)
  [Uptime] Add delay in telemetry test (elastic#75162)
  [Lens] Use index pattern service instead saved object client (elastic#74654)
  Embeddable input (elastic#73033)
  ...
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left some comments, including two comments about comments!

@@ -103,13 +103,13 @@ export class EncryptedSavedObjectsClientWrapper implements SavedObjectsClientCon
// Saved objects with encrypted attributes should have IDs that are hard to guess especially
// since IDs are part of the AAD used during encryption, that's why we control them within this
// wrapper and don't allow consumers to specify their own IDs directly.
if (object.id) {
if (object.id && !options?.overwrite) {
throw new Error(
'Predefined IDs are not allowed for saved objects with encrypted attributes.'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this message is no longer quite correct

expect(alertsClientParams.invalidateAPIKey).not.toHaveBeenCalled();
});

test('throws when unsecuredSavedObjectsClient update fails', async () => {
unsecuredSavedObjectsClient.update.mockRejectedValueOnce(new Error('Fail'));
unsecuredSavedObjectsClient.create.mockRejectedValueOnce(new Error('Fail'));

await expect(alertsClient.updateApiKey({ id: '1' })).rejects.toThrowErrorMatchingInlineSnapshot(
`"Fail"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file being > 4K lines always cracks me up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't get me started 😆

await this.unsecuredSavedObjectsClient.update('alert', createdAlert.id, {
scheduledTaskId: scheduledTask.id,
});
await this.unsecuredSavedObjectsClient.create<RawAlert>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this means we won't be using partial updates at all w/ the alert client? In this case, we're just updating the taskId, which is not AAD, so we could do a partial update, I'm thinking. I'm asking because the new "alert status" that we're adding to the alert SO is also not AAD, so it's unfortunate we'll have to a GET on the alert (or otherwise have all the data available for it) to update that, which will happen as part of running the alert executor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also have to partially update the alert status when the alert fails to decrypt (setting error status in this case)..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, already looking at that - I think we're cool - we seem to be doing partial updates for other non-AAD things like muted. Biggest change is that we need to update the SO with the kibana system user, which is something I don't think we've had to deal with yet, but this doesn't seem like a problem either.

The alert status will both not be decrypted, and added to the non-AAD field list, so we can do a partial update, and the status can be retrieved even if decryption would have failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should encapsulate the usage of partial updates and limit it to the places we have to do it.
I definitely don't want any public APIs (such as the client) which are doing partial updates as it has unexpected results that client do no expect or know about.

It seems safer to avoid partial updates where we can and then be very explicit about it when we do.

In this case we don't need to do a partial update, as we already have all the data we need- so I'd suggest we keep it a full update.

@mikecote mikecote removed the v7.9.1 label Aug 31, 2020
* master: (223 commits)
  skip flaky suite (elastic#75724)
  [Reporting] Add functional test for Reports in non-default spaces (elastic#76053)
  [Enterprise Search] Fix various icons in dark mode (elastic#76430)
  skip flaky suite (elastic#76245)
  Add `auto` interval to histogram AggConfig (elastic#76001)
  [Resolver] generator uses setup_node_env (elastic#76422)
  [Ingest Manager] Support both zip & tar archives from Registry (elastic#76197)
  [Ingest Manager] Improve agent vs kibana version checks (elastic#76238)
  Manually building `KueryNode` for Fleet's routes (elastic#75693)
  remove dupe tinymath section (elastic#76093)
  Create APM issue template (elastic#76362)
  Delete unused file. (elastic#76386)
  [SECURITY_SOLUTION][ENDPOINT] Trusted Apps Create API (elastic#76178)
  [Detections Engine] Add Alert actions to the Timeline (elastic#73228)
  [Dashboard First] Library Notification (elastic#76122)
  [Maps] Add mvt support for ES doc sources  (elastic#75698)
  Add setHeaderActionMenu API to AppMountParameters (elastic#75422)
  [ML] Remove "Are you sure" from data frame analytics jobs (elastic#76214)
  [yarn] remove typings-tester, use @ts-expect-error (elastic#76341)
  [Reporting/CSV] Do not fail the job if scroll ID can not be cleared (elastic#76014)
  ...
* master: (293 commits)
  Fix tsvb filter ration for table (elastic#77272)
  [Security Solution] Add unit tests for Network search strategy (elastic#77416)
  [Alerting] Improves performance of the authorization filter in AlertsClient.find by skipping KQL parsing (elastic#77040)
  [Ingest Manager] Add route for package installation by upload (elastic#77044)
  [APM-UI][E2E] filter PRs from the uptime GH team (elastic#77359)
  [APM] Remove useLocation and some minor route improvements (elastic#76343)
  [Enterprise Search] Update enterpriseSearchRequestHandler to manage range of errors + add handleAPIErrors helper (elastic#77258)
  [SECURITY_SOLUTION] Task/hostname policy response ux updates (elastic#76444)
  Move remaining uses of serviceName away from urlParams (elastic#77248)
  [Lens] Move configuration popover to flyout (elastic#76046)
  [Ingest Manager] Manually build Fleet kuery with Node arguments (elastic#76589)
  skip flaky suite (elastic#59975)
  Neutral-naming in reporting plugin (elastic#77371)
  [Enterprise Search] Add UserIcon styles (elastic#77385)
  [RUM Dashboard] Added loading state to visitor breakdown pie charts (elastic#77201)
  [Ingest Manager] Fix polling for new agent action (elastic#77339)
  Remote cluster - Functional UI test to change the superuser to a test_user with limited role (elastic#77212)
  Stacked headers and navigational search (elastic#72331)
  [ML] DF Analytics creation wizard: Fixing field loading race condition (elastic#77326)
  [Monitoring] Handle no mappings found for sort and collapse fields (elastic#77099)
  ...
* master: (54 commits)
  [Grok] Fix missing error message in error toasts (elastic#77499)
  [Alerting] Exempt Alerts pre 7.10 from RBAC on their Action execution until updated (elastic#75563)
  [ML] fix type in apply_influencer_filters_action (elastic#77495)
  [Lens] create reusable component for filters and range aggregation (elastic#77453)
  fix eslint violations
  Collapse alert chart previews by default (elastic#77479)
  [ML] Fixing field caps wrapper endpoint (elastic#77546)
  showing service maps when filte by environment not defined (elastic#77483)
  [Lens] Settings panel redesign and separate settings per y axis (elastic#76373)
  log request body in new ES client (elastic#77150)
  use `navigateToUrl` to navigate to recent nav links (elastic#77446)
  Move core config service to `kbn/config` package (elastic#76874)
  [UBI] Copy license to /licenses folder (elastic#77563)
  Skip flaky Events Viewer Cypress test
  [Lens] Remove dynamic names in telemetry fields (elastic#76988)
  [Maps] Add DynamicStyleProperty#getMbPropertyName and DynamicStyleProperty#getMbPropertyValue (elastic#77366)
  [Enterprise Search] Add flag to restrict width of layout (elastic#77539)
  [Security Solutions][Cases - Timeline] Fix bug when adding a timeline to a case (elastic#76967)
  [Security Solution][Detections] Integration test for Editing a Rule (elastic#77090)
  [Ingest pipelines] Polish pipeline debugging workflow (elastic#76058)
  ...
* master:
  [RUM Dashboard] User experience metrics (elastic#77384)
  Fixing service maps API test (elastic#77586)
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@gmmorris gmmorris merged commit fd624b1 into elastic:master Sep 18, 2020
gmmorris added a commit to gmmorris/kibana that referenced this pull request Sep 18, 2020
… updating (elastic#73688)

This PR changes the Alerts & Actions clients to ensure they require full updates (rather than partial) to SOs and overwrites the entire document when making the update.
This is to prevent the situation where nested objects get _merged_ instead of replaced when a user makes an `update`.

We also enhanced the EncryptedSavedObjectsClient to allow specified `id`s when overwriting an existing object.
gmmorris added a commit that referenced this pull request Sep 18, 2020
… updating (#73688) (#77857)

This PR changes the Alerts & Actions clients to ensure they require full updates (rather than partial) to SOs and overwrites the entire document when making the update.
This is to prevent the situation where nested objects get _merged_ instead of replaced when a user makes an `update`.

We also enhanced the EncryptedSavedObjectsClient to allow specified `id`s when overwriting an existing object.
gmmorris added a commit to gmmorris/kibana that referenced this pull request Sep 22, 2020
* master: (92 commits)
  [ILM] Data tiers for 7.10 (elastic#76126)
  [ML] Transforms: Fixes styling of preview grid pagination in summary step (elastic#77789)
  [Drilldowns] Beta badge support. Mark URL Drilldown as Beta (elastic#75654)
  Re-enable session lifespan, idle timeout api integration tests and use unique names for the security test reports. (elastic#77746)
  [Alerting] renames code in alerting RBAC exemption to make it easier to maintain (elastic#77598)
  [Alerting & Actions] Overwrite SOs when updating instead of partially updating (elastic#73688)
  fixed react warning in Suspense in alert flyout (elastic#77777)
  [APM] Track usage of Gold+ features (elastic#77630)
  Visualize: Bad request when working with histogram aggregation (elastic#77684)
  remove legacy ES plugin (elastic#77703)
  [Lens] change name of custom query to filters (elastic#77725)
  skip flaky suite (elastic#76239)
  remove visual aspects of baseline job (elastic#77815)
  skip flaky suite (elastic#77835)
  Fixes typo in data recognizer text (elastic#77691)
  management/update trusted_apps jest snapshot
  [build] Use Elastic hosted UBI minimal base image (elastic#77776)
  [APM] Add transaction error rate alert (elastic#76933)
  [Security Solution] [Detections] Remove file validation on import route (elastic#77770)
  [Enterprise Search][tech debt] Add Kea logic paths for easier debugging/defaults (elastic#77698)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Actions release_note:fix Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.10.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Alerting] Updating Webhook with removed headers doesn't work as expected
7 participants