Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fleet] Add test/fix for invalid/missing ids in bulk agent reassign #94632

Merged
merged 8 commits into from
Mar 17, 2021

Conversation

jfsiii
Copy link
Contributor

@jfsiii jfsiii commented Mar 15, 2021

Problem

While working on changes for bulk reassign #90437, I found that the server has a runtime error and returns a 500 if given an invalid or missing id.

server error stack trace
   │ proc [kibana] server    log   [12:21:48.953] [error][fleet][plugins] TypeError: Cannot read property 'policy_revision_idx' of undefined
   │ proc [kibana]     at map (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/helpers.ts:15:34)
   │ proc [kibana]     at Array.map (<anonymous>)
   │ proc [kibana]     at getAgents (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/crud.ts:191:32)
   │ proc [kibana]     at runMicrotasks (<anonymous>)
   │ proc [kibana]     at processTicksAndRejections (internal/process/task_queues.js:93:5)
   │ proc [kibana]     at Object.reassignAgents (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/reassign.ts:91:9)
   │ proc [kibana]     at postBulkAgentsReassignHandler (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/routes/agent/handlers.ts:314:21)
   │ proc [kibana]     at Router.handle (/Users/jfsiii/work/kibana/src/core/server/http/router/router.ts:272:30)
   │ proc [kibana]     at handler (/Users/jfsiii/work/kibana/src/core/server/http/router/router.ts:227:11)
   │ proc [kibana]     at exports.Manager.execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
   │ proc [kibana]     at Object.internals.handler (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)
   │ proc [kibana]     at exports.execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)
   │ proc [kibana]     at Request._lifecycle (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/request.js:370:32)
   │ proc [kibana]     at Request._execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/request.js:279:9)
see test added in this PR fail on master
1)    Fleet Endpoints
       reassign agent(s)
         bulk reassign agents
           should allow to reassign multiple agents by id -- some invalid:

      Error: expected 200 "OK", got 500 "Internal Server Error"

Root cause

Debugging runtime error in searchHitToAgent found some TS type mismatches for the ES values being returned. Perhaps from one or more of the recent changes to ES client & Fleet Server. Based on test:jest and test:ftr, it appears the possible types are GetResponse or SearchResponse, instead of only an ESSearchHit.

https://github.com/elastic/kibana/pull/94632/files#diff-254d0f427979efc3b442f78762302eb28fb9c8857df68ea04f8d411e052f939cL11

While a .search result will include return matched values, a .get or .mget will return a row for each input and a found: boolean. e.g. { _id: "does-not-exist", found: false }. The error occurs when searchHitToAgent is run on a get miss instead of a search hit.

PR Changes

  • Added a test to ensure it doesn't fail if invalid or missing IDs are given
  • Moved the bulk_reassign tests to their own test section
  • Filter out any missing results before calling searchHitToAgent, to match current behavior
  • Consolidate repeated arguments into and code for getting agents into single function: and TS type
  • Rename some agent service functions to be more explicit (IMO) but behavior maintained. Same API names exported.

This moves toward the "one result (success or error) per given id" approach for #90437

Checklist

@jfsiii jfsiii marked this pull request as ready for review March 16, 2021 18:00
@jfsiii jfsiii requested a review from a team as a code owner March 16, 2021 18:00
@jfsiii jfsiii added release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team labels Mar 16, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/fleet (Team:Fleet)

})
).agents;
} else {
throw new IngestManagerError('Cannot get agents');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can surely improve this. Not sure what it should say. Something about required options?

@@ -161,34 +184,51 @@ export async function countInactiveAgents(
return res.body.hits.total.value;
}

export async function getAgent(esClient: ElasticsearchClient, agentId: string) {
export async function getAgentById(esClient: ElasticsearchClient, agentId: string) {
const agentNotFoundError = new AgentNotFoundError(`Agent ${agentId} not found`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it weird to create an error each time we call this function, I would duplicate this were we throw the error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was preferable to waste a few microseconds creating a value we might not use to ensure the exit/error values remained in sync.

Copy link
Member

@nchaulet nchaulet Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can ensure we have the same error using a function like createAgentNotFoundError for example

getAgent,
listAgents,
getAgent: getAgentById,
listAgents: getAgentsByKuery,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if the fact that we do not use the same function name internally and externally will cause some communication issue in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair question. We can update these as well, I just held off while we're iterating.

Would you like me to update them now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be done in a follow up PR, as it's probably going to take some time to be reviewed as it concern multiple team

@jfsiii
Copy link
Contributor Author

jfsiii commented Mar 16, 2021

@nchaulet anything you think should be changed before merging? Anything that could/should happen in a follow up PR?

Copy link
Member

@nchaulet nchaulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking from merging 👍

@jfsiii jfsiii enabled auto-merge (squash) March 16, 2021 23:38
@jfsiii jfsiii added the auto-backport Deprecated - use backport:version if exact versions are needed label Mar 16, 2021
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@jfsiii jfsiii merged commit 044a94a into elastic:master Mar 17, 2021
@kibanamachine
Copy link
Contributor

💔 Backport failed

❌ 7.x: Commit could not be cherrypicked due to conflicts

To backport manually, check out the target branch and run:
node scripts/backport --pr 94632

jfsiii pushed a commit that referenced this pull request Mar 17, 2021
…94632) (#94774)

## Problem
While working on changes for bulk reassign #90437, I found that the server has a runtime error and returns a 500 if given an invalid or missing id.

<details><summary>server error stack trace</summary>

```
   │ proc [kibana] server    log   [12:21:48.953] [error][fleet][plugins] TypeError: Cannot read property 'policy_revision_idx' of undefined
   │ proc [kibana]     at map (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/helpers.ts:15:34)
   │ proc [kibana]     at Array.map (<anonymous>)
   │ proc [kibana]     at getAgents (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/crud.ts:191:32)
   │ proc [kibana]     at runMicrotasks (<anonymous>)
   │ proc [kibana]     at processTicksAndRejections (internal/process/task_queues.js:93:5)
   │ proc [kibana]     at Object.reassignAgents (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/services/agents/reassign.ts:91:9)
   │ proc [kibana]     at postBulkAgentsReassignHandler (/Users/jfsiii/work/kibana/x-pack/plugins/fleet/server/routes/agent/handlers.ts:314:21)
   │ proc [kibana]     at Router.handle (/Users/jfsiii/work/kibana/src/core/server/http/router/router.ts:272:30)
   │ proc [kibana]     at handler (/Users/jfsiii/work/kibana/src/core/server/http/router/router.ts:227:11)
   │ proc [kibana]     at exports.Manager.execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
   │ proc [kibana]     at Object.internals.handler (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)
   │ proc [kibana]     at exports.execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)
   │ proc [kibana]     at Request._lifecycle (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/request.js:370:32)
   │ proc [kibana]     at Request._execute (/Users/jfsiii/work/kibana/node_modules/@hapi/hapi/lib/request.js:279:9)
```
</details>

<details><summary>see test added in this PR fail on master</summary>

```
1)    Fleet Endpoints
       reassign agent(s)
         bulk reassign agents
           should allow to reassign multiple agents by id -- some invalid:

      Error: expected 200 "OK", got 500 "Internal Server Error"
```
</details>

## Root cause
Debugging runtime error in `searchHitToAgent` found some TS type mismatches for the ES values being returned. Perhaps from one or more of the recent changes to ES client & Fleet Server. Based on `test:jest` and `test:ftr`, it appears the possible types are `GetResponse` or `SearchResponse`, instead of only an `ESSearchHit`.

https://github.com/elastic/kibana/pull/94632/files#diff-254d0f427979efc3b442f78762302eb28fb9c8857df68ea04f8d411e052f939cL11

While a `.search` result will include return matched values, a `.get` or `.mget` will return a row for each input and a `found: boolean`. e.g. `{ _id: "does-not-exist", found: false }`. The error occurs when [`searchHitToAgent`](https://github.com/jfsiii/kibana/blob/1702cf98f018c41ec0a080d829a12403168ac242/x-pack/plugins/fleet/server/services/agents/helpers.ts#L11) is run on a get miss instead of a search hit.

## PR Changes
* Added a test to ensure it doesn't fail if invalid or missing IDs are given
* Moved the `bulk_reassign` tests to their own test section
* Filter out any missing results before calling `searchHitToAgent`, to match current behavior
* Consolidate repeated arguments into and code for getting agents into single [function](https://github.com/elastic/kibana/pull/94632/files#diff-f7377ed9ad56eaa8ea188b64e957e771ccc7a7652fd1eaf44251c25b930f8448R70-R87):  and [TS type](https://github.com/elastic/kibana/pull/94632/files#diff-f7377ed9ad56eaa8ea188b64e957e771ccc7a7652fd1eaf44251c25b930f8448R61-R68)
* Rename some agent service functions to be more explicit (IMO) but behavior maintained. Same API names exported.

This moves toward the "one result (success or error) per given id" approach for #90437

### Checklist
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

# Conflicts:
#	x-pack/plugins/fleet/server/services/agents/crud.ts
#	x-pack/plugins/fleet/server/services/index.ts
jloleysens added a commit that referenced this pull request Mar 17, 2021
…-action

* 'master' of github.com:elastic/kibana: (44 commits)
  Migrate the optimizer mixin to core (#94272)
  Replace EuiCodeBlock with JsonCodeEditor in DiscoverGrid (#92442)
  [ML] Anomaly Detection: Migrate validation messages links to use docLinks. (#94568)
  [Lists][Exceptions] - Adding basic linting, i18n and storybook support (#94772)
  [Fleet] Add test/fix for invalid/missing ids in bulk agent reassign (#94632)
  [Security Solution] [Cases] Move create page components and dependencies to Cases (#94444)
  [ML] Data Frame Analytics accessibility tests: fix flaky outlier creation test (#94735)
  [Security Solutions] Remove commented out old linter rules (#94753)
  [App Search] Role mappings migration part 2 (#94461)
  [CI] Update Backport action inputs to match updated ones (#94721)
  [chore] Remove the infra team from CODEOWNERS (#94740)
  [Connectors UI] Make UI use new connector APIs (#94180)
  [ML] Use indices options in anomaly detection job wizards (#91830)
  Remove `string` as a valid registry var type value (#94174)
  [Alerts] Replaces legacy es client with the ElasticsearchClient for alerts and triggers_actions_ui plugins. (#93364)
  [Reporting-CSV Export] Re-write CSV Export using SearchSource (#88303)
  chore(NA): upgrade bazel rules nodejs to v3.2.2 (#94726)
  [APM] Settings: Update layout and update/add descriptions (#94398)
  skip flaky suite (#94666)
  [XY axis] Integrates legend color picker with the eui palette (#90589)
  ...

# Conflicts:
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/phases/cold_phase/cold_phase.tsx
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/phases/shared_fields/searchable_snapshot_field/searchable_snapshot_field.tsx
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/constants.ts
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/edit_policy.tsx
#	x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/form/configuration_context.tsx
#	x-pack/plugins/index_lifecycle_management/public/shared_imports.ts
@jfsiii jfsiii deleted the 90437-bulk-action-api-changes branch April 6, 2021 17:08
@dikshachauhan-qasource
Copy link

Hi @EricDavisX

Could you please through some of the limelight here, how to validate this.

Thanks
QAS

@EricDavisX
Copy link
Contributor

@dikshachauhan-qasource no need this time, John has confirmed that he wrote a test for the problem - we can ignore this from the manual test side.

@dikshachauhan-qasource
Copy link

Thank you @EricDavisX for the update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v7.13.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants