-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not close filestream harvester if an unexpected error is returned when close.on_state_change.* is enabled #26411
Do not close filestream harvester if an unexpected error is returned when close.on_state_change.* is enabled #26411
Conversation
Pinging @elastic/agent (Team:Agent) |
This pull request doesn't have a |
This pull request is now in conflicts. Could you fix it? 🙏
|
…am input with close_* options
7973b7b
to
8ab0707
Compare
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
Trends 🧪💚 Flaky test reportTests succeeded. Expand to view the summary
Test stats 🧪
|
This pull request is now in conflicts. Could you fix it? 🙏
|
return true | ||
} | ||
|
||
// If an unexpected error happens we keep the reader open hoping once everything will go back to normal. | ||
f.log.Errorf("Unexpected error reading from %s; error: %s", f.file.Name(), statErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one still might produce a false-positive log message. The shouldBeClosed
method is called by a separate go-routine independent from the reader go-routine. When the reader go-routine shuts down (cancelled event) while shouldBeClosed
is run we will see an error telling us that f
is invalid because it was already closed.
In order to prevent the false-positive message, we either need a mutex or some ref-counting on f
+ a wrapper on the reader that allows us to unblock
a pending read call if the file gets closed async.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to solve the race in a separate PR.
…when close.on_state_change.* is enabled (#26411) ## What does this PR do? This PR returns early if `close.on_state_change.removed` is enabled and the opened file no longer exists. Otherwise, it logs an error message and keeps the reader running. ## Why is it important? Previously, a message has been logged on error level and the reader has been stopped if the `Stat` call returned an error. However, it was not correct because if `close.on_state_change.renamed` was enabled the reader would have been closed if the file had been removed. Now the reader is not stopped. (cherry picked from commit 1106449)
…when close.on_state_change.* is enabled (#26411) ## What does this PR do? This PR returns early if `close.on_state_change.removed` is enabled and the opened file no longer exists. Otherwise, it logs an error message and keeps the reader running. ## Why is it important? Previously, a message has been logged on error level and the reader has been stopped if the `Stat` call returned an error. However, it was not correct because if `close.on_state_change.renamed` was enabled the reader would have been closed if the file had been removed. Now the reader is not stopped. (cherry picked from commit 1106449)
…when close.on_state_change.* is enabled (#26411) (#26477) ## What does this PR do? This PR returns early if `close.on_state_change.removed` is enabled and the opened file no longer exists. Otherwise, it logs an error message and keeps the reader running. ## Why is it important? Previously, a message has been logged on error level and the reader has been stopped if the `Stat` call returned an error. However, it was not correct because if `close.on_state_change.renamed` was enabled the reader would have been closed if the file had been removed. Now the reader is not stopped. (cherry picked from commit 1106449) Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
…when close.on_state_change.* is enabled (#26411) (#26476) ## What does this PR do? This PR returns early if `close.on_state_change.removed` is enabled and the opened file no longer exists. Otherwise, it logs an error message and keeps the reader running. ## Why is it important? Previously, a message has been logged on error level and the reader has been stopped if the `Stat` call returned an error. However, it was not correct because if `close.on_state_change.renamed` was enabled the reader would have been closed if the file had been removed. Now the reader is not stopped. (cherry picked from commit 1106449) Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
* master: (32 commits) [Metricbeat] Change Account ID to Project ID in `gcp.billing` module (elastic#26412) update libbeat fields.ecs.yml file and ecsVersion to 1.10.0 (elastic#26121) [Filebeat] Update AWS ELB ingest pipeline (elastic#26441) [FIlebeat] add strict_date_optional_time_nanos date format to PanOS module (elastic#26158) Fix the irregular and typo on prometheus module. (elastic#25726) [Filebeat] Parse additonal debug data fields for Okta module (elastic#25818) fix: update MSSQL Server linux image's Docker registry (elastic#26440) Update indexing.go godocs (elastic#26408) Do not close filestream harvester if an unexpected error is returned when close.on_state_change.* is enabled (elastic#26411) Add support for copytruncate method when rotating input logs with an external tool in `filestream` input (elastic#23457) Allow fields with ip_range datatype (elastic#26444) Add Anomali ThreatStream support to threatintel module (elastic#26350) fix: use the right param type (elastic#26469) [Automation] Update elastic stack version to 8.0.0-7640093f for testing (elastic#26460) Set SM Filebeat modules as GA (elastic#26226) Fix rfc5464 date parsing in the syslog input (elastic#26419) Add linked account information into billing metricset (elastic#26285) [Filebeat] Update HA Proxy log grok patterns (elastic#25835) disable metricbeat logstash test_node_stats (elastic#26436) chore: pass BEAT_VERSION when running E2E tests (elastic#26291) ...
What does this PR do?
This PR returns early if
close.on_state_change.removed
is enabled and the opened file no longer exists. Otherwise, it logs an error message and keeps the reader running.Why is it important?
Previously, a message has been logged on error level and the reader has been stopped if the
Stat
call returned an error. However, it was not correct because ifclose.on_state_change.renamed
was enabled the reader would have been closed if the file had been removed. Now the reader is not stopped.Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files- [ ] I have added tests that prove my fix is effective or that my feature worksCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.