Reduce agent logs by default #4633
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
This pull request does not have a backport label. Could you fix it @pchila? 🙏
NOTE:
At this stage the PR addresses @cmacknz's suggestion and removes some unnecessary logs. I'm not sure whether we want to stop here for this PR or go deeper; that depends on measuring how much data we already save with it as is, and on the general 8.14 timeframe. I will change the "Closes" to "Relates".
7931c95 to cc97fae
I agree here: we shouldn't try to close #4252 with this PR. Instead, let's get as many improvements as possible into 8.14 and continue in later versions with whatever we don't have time for in that timeframe.
cc97fae to 2af6f43
Quality Gate passed
I enrolled an agent built from this branch and, as expected, observed the 30s metrics logs in the local agent logs but not in Fleet.
@cmacknz any objections to backporting this to 8.14?
No objections to backporting.
* set intermediate verification error logs to debug
* Drop non-zero metrics periodic logs in monitoring config
* Add script for elastic-agent logs and metrics disk size comparison
* changelog
(cherry picked from commit 584713c)
Co-authored-by: Paolo Chilà <paolo.chila@elastic.co>
What does this PR do?
This PR drops the "Non-zero metrics after 30s..." logs before they get sent to Elasticsearch.
Why is it important?
The goal is to reduce the number of elastic-agent monitoring events ingested into Elasticsearch, and with it the index disk size.
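As a rough illustration of the drop described under "What does this PR do?", the filtering can be pictured as a predicate applied to each log event before shipping. This is a minimal Python sketch, not the agent's actual monitoring configuration: the names `DROPPED_PREFIX`, `should_drop` and `filter_events` are hypothetical, and the prefix is assumed from the log line quoted above.

```python
from typing import Iterable, Iterator

# Assumed prefix, taken from the log line quoted in the PR description.
DROPPED_PREFIX = "Non-zero metrics"


def should_drop(event: dict) -> bool:
    """Return True for the periodic metrics log events that are filtered out."""
    message = event.get("message", "")
    return isinstance(message, str) and message.startswith(DROPPED_PREFIX)


def filter_events(events: Iterable[dict]) -> Iterator[dict]:
    """Yield only the events that would still be sent to Elasticsearch."""
    return (event for event in events if not should_drop(event))
```

The events stay in the local agent logs; only the copy shipped for monitoring is filtered, which matches the review comment about seeing them locally but not in Fleet.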
Checklist
[ ] I have made corresponding changes to the documentation
[ ] I have made corresponding change to the default configuration files
[ ] I have added tests that prove my fix is effective or that my feature works
./changelog/fragments using the changelog tool
[ ] I have added an integration test or an E2E test
Author's Checklist
How to test this PR locally
Run an elastic agent with and without this change on a deployment, noting the start time and elastic-agent id of each run.
Using the included ES script, extract the documents ingested during those runs and store them in dedicated indices.
In my tests I ran 2 elastic agents managed by Fleet with a default policy including the system integration, first without this change and then with it. I then sliced the first 10 minutes of logs and metrics from startup using the included script.
This is a screenshot from the Index management page where we can see the disk size and document count for each index:
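The extraction step can be sketched as a request body for the Elasticsearch `_reindex` API that copies one agent's documents from a fixed time window into a dedicated index. This is not the script included in the PR: the helper name `build_reindex_body`, the index names in the usage note, and the choice of `elastic_agent.id` and `@timestamp` as filter fields are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_reindex_body(source_index: str, dest_index: str, agent_id: str,
                       start: datetime, minutes: int = 10) -> dict:
    """Build a _reindex body selecting one agent's documents in a time window."""
    end = start + timedelta(minutes=minutes)
    return {
        "source": {
            "index": source_index,
            "query": {
                "bool": {
                    "filter": [
                        # Assumed field holding the elastic-agent id.
                        {"term": {"elastic_agent.id": agent_id}},
                        # Half-open window [start, end) from agent startup.
                        {"range": {"@timestamp": {
                            "gte": start.isoformat(),
                            "lt": end.isoformat(),
                        }}},
                    ]
                }
            },
        },
        "dest": {"index": dest_index},
    }
```

For example, `build_reindex_body("logs-elastic_agent.filebeat-default", "baseline-filebeat", "<agent id>", datetime(2024, 1, 1, tzinfo=timezone.utc))` would produce a body to send with `POST _reindex`; all of these values are placeholders.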
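Given the two disk sizes read from that page for the same index, the relative reduction is the difference divided by the baseline size. A minimal sketch, with a hypothetical helper name and no measured values:

```python
def size_reduction_percent(before_bytes: int, after_bytes: int) -> float:
    """Return the relative disk size reduction as a percentage of the baseline."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return (before_bytes - after_bytes) / before_bytes * 100.0
```

With made-up sizes of 1000 bytes before and 820 bytes after, this gives a reduction of about 18%.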
The measured impact on *logs-elastic_agent.filebeat* and *logs-elastic_agent.metricbeat* is:
*logs-elastic_agent.filebeat* with a disk size reduction of ~18%
*logs-elastic_agent.metricbeat* with a disk size reduction of ~16%
Related issues