[Fleet] Implement agent integration health reporting #158826
The documentation issue is here: elastic/ingest-docs#209
56f8e5d
to
f29cded
Compare
FYI I've pasted screenshots of the latest WIP iteration in the PR description above, showing various parts of the integration status tree. Note that the policy response part for Elastic Defend is not part of this change, it is existing behaviour that I am showing for reference. The change affects the health status reported on integration inputs. Your input will be much appreciated on various points:
cc @jlind23
6ffa626
to
2f662c9
Compare
Thanks @jillguyonnet for the demo. Continuing from the call:
...ions/fleet/sections/agents/agent_details_page/components/agent_details/input_status_utils.ts
...ions/agents/agent_details_page/components/agent_details/agent_details_integration_inputs.tsx
...ions/agents/agent_details_page/components/agent_details/agent_details_integration_inputs.tsx
if (!agent.components) {
  return packageErrorUnits;
}
return getInputStatusFromAgent(agent.components, packagePolicy).filter(
suggestion for a shorter way with lodash:
hasPackageErrors = agent.components ? some(getInputStatusFromAgent(agent.components, packagePolicy), unit => unit.status === 'DEGRADED' || unit.status === 'FAILED') : false;
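The same check can be written without lodash, using the native `Array.prototype.some`. The following is a minimal sketch under stated assumptions: the `ErrorCheckUnit` and `ErrorCheckComponent` interfaces and the `hasErrorUnits` helper are hypothetical, simplified stand-ins rather than Kibana's actual types, and the package policy filtering done by `getInputStatusFromAgent` is left out.

```typescript
// Hypothetical, simplified shapes; the real Fleet types carry more fields.
interface ErrorCheckUnit {
  id: string;
  status: string; // e.g. 'HEALTHY', 'DEGRADED', 'FAILED'
}

interface ErrorCheckComponent {
  id: string;
  units?: ErrorCheckUnit[];
}

// Returns true when at least one unit reports DEGRADED or FAILED.
// An agent with no components yields false instead of throwing.
function hasErrorUnits(components: ErrorCheckComponent[] | undefined): boolean {
  if (!components) {
    return false;
  }
  return components.some((component) =>
    (component.units ?? []).some(
      (unit) => unit.status === 'DEGRADED' || unit.status === 'FAILED'
    )
  );
}
```

Guarding on `components` first keeps the no-components case explicit, which is the same condition the ternary in the suggestion handles.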
Thanks for this @juliaElastic - there actually was some leftover unnecessary code there which I've removed.
Pinging @elastic/fleet (Team:Fleet)
  agentComponents: FleetServerAgentComponent[],
  packagePolicy: PackagePolicy
): FleetServerAgentComponentUnit[] => {
  const re = new RegExp(`(${packagePolicy.id}|${packagePolicy.package?.name})`);
I think the regex on package name might return invalid results in case we have a package name that includes the name of another. We could use a stricter regex like ^name/.
@elastic/elastic-agent-control-plane Could you suggest what would be the best way to identify component units that belong to a package/package policy?
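To illustrate the concern, here is a minimal sketch comparing an unanchored match on the package name with a stricter one in the spirit of `^name/`. The helper names and the unit ids are hypothetical and chosen only for illustration; whether real unit ids start with the package name is not confirmed in this thread.

```typescript
// Escape characters that have special meaning in a regular expression,
// so a package name is always matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Loose match: true if the package name appears anywhere in the unit id.
function matchesLoosely(unitId: string, packageName: string): boolean {
  return new RegExp(escapeRegExp(packageName)).test(unitId);
}

// Stricter match: the unit id must start with the package name followed by "/".
function matchesStrictly(unitId: string, packageName: string): boolean {
  return new RegExp(`^${escapeRegExp(packageName)}/`).test(unitId);
}

// Hypothetical unit ids, for illustration only.
const systemUnitId = 'system/metrics-default';
const otherUnitId = 'system_audit/file-default';

// The loose match also accepts the unit of the other, longer-named package.
const looseFalsePositive = matchesLoosely(otherUnitId, 'system');
// The strict match rejects it while still accepting the intended unit.
const strictRejectsOther = !matchesStrictly(otherUnitId, 'system');
const strictAcceptsSystem = matchesStrictly(systemUnitId, 'system');
```

Escaping the interpolated value also guards against names or ids that contain regex metacharacters.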
The Elastic Agent itself only understands inputs, not packages. Any information that would allow tying an input back to a package is injected by Fleet.
Looks like you have meta.package and package_policy_id to work with. The input ID has to be unique in the agent policy and looks like it is templated from something like $inputType-$package-$packagePolicyID, so that might be a good hint for what to use if you want a unique value (you can probably find the code that generates this ID faster than I can :) )
- id: system/metrics-system-4f510cb9-2f4e-4b81-8a19-9969abe1c924
  name: system-1
  revision: 1
  type: system/metrics
  use_output: default
  meta:
    package:
      name: system
      version: 1.31.1
  data_stream:
    namespace: default
  package_policy_id: 4f510cb9-2f4e-4b81-8a19-9969abe1c924
Thanks for taking a look at this @juliaElastic - this is indeed one of the critical pieces of logic.
Interestingly, the yaml @cmacknz pasted above has a FullAgentPolicy type, which we don't have access to without an extra API call. While it would be easier (allowing us to simply retrieve units by input id), I'm not sure it's worth it. I have made a change where we only match the unit id against the package policy id; this means we only retrieve inputs, not outputs. Since we only surface the state of inputs, this should be sufficient for the present requirements.
Please let me know your thoughts.
I agree we should keep it simple and match on the package policy id, which works for inputs. We can come back to this later when we add support for outputs.
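The approach agreed on in this thread, matching the unit id against the package policy id, could look roughly like the sketch below. The `AgentUnit` and `AgentComponent` interfaces, the `getUnitsForPackagePolicy` helper, and the sample unit ids are hypothetical simplifications, not the code merged in this PR.

```typescript
// Hypothetical, simplified shapes; the real Fleet types carry more fields.
interface AgentUnit {
  id: string;
  type: 'input' | 'output';
  status: string;
}

interface AgentComponent {
  id: string;
  units?: AgentUnit[];
}

// Collects every unit whose id contains the given package policy id.
// Assumption taken from the discussion: input ids embed the package policy id,
// output ids do not, so only inputs are returned.
function getUnitsForPackagePolicy(
  components: AgentComponent[],
  packagePolicyId: string
): AgentUnit[] {
  const matched: AgentUnit[] = [];
  for (const component of components) {
    for (const unit of component.units ?? []) {
      if (unit.id.includes(packagePolicyId)) {
        matched.push(unit);
      }
    }
  }
  return matched;
}

// Usage with made-up unit ids built around the policy id from the YAML above.
const policyId = '4f510cb9-2f4e-4b81-8a19-9969abe1c924';
const sampleComponents: AgentComponent[] = [
  {
    id: 'system/metrics-default',
    units: [
      { id: `system/metrics-system-${policyId}`, type: 'input', status: 'FAILED' },
      { id: 'system/metrics-default', type: 'output', status: 'HEALTHY' },
    ],
  },
  { id: 'component-without-units' },
];
const sampleMatches = getUnitsForPackagePolicy(sampleComponents, policyId);
```

A plain substring check avoids building a regular expression from the id altogether, so no escaping is needed.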
Looks good overall, asked a question about the regex used to find the package units.
@zombieFox @karenzone we didn't get any feedback from the demo Jill gave, so I take it as an approval and we will move forward with the current state of this PR.
081b9a9
to
f413296
Compare
@juliaElastic I've pushed some changes addressing your review 🙏
LGTM 🚀
💚 Build Succeeded
## Summary

Fix a bug in the health reporting UI change in #158826 where the page breaks when the agent has no components.

Closes #159975

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] This renders correctly on smaller devices using a responsive layout. (You can test this [in your browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser compatibility](https://www.elastic.co/support/matrix#matrix_browsers)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Summary
Implement agent integration health reporting in Fleet UI.
Closes #154634
Screenshots
These screenshots were taken with an error (invalid config) on the system integration (metrics).
Before
The error affecting the system integration is not visible in the UI. To find it, the user would need to inspect the agent JSON or run the elastic-agent status command.
After
The error affecting the system integration is surfaced in the UI:
For reference, the following screenshots show existing behaviour with the Elastic Defend integration (errors in the policy response):
Steps to reproduce
Checklist
Delete any items that are not applicable to this PR.