Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherry-pick #21048 to 7.9: Filebeat: Fix random error on harvester close #21056

Merged
merged 1 commit into from
Sep 14, 2020

Conversation

adriansr
Copy link
Contributor

@adriansr adriansr commented Sep 10, 2020

Cherry-pick of PR #21048 to 7.9 branch. Original message:

This fixes a race condition when a harvester is closed at the same time that its source file size is being calculated.

Why is it important?

Random Error updating file size errors have been reported. Other than that it looks like the bug has no other effect.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • Make sure the new synchronization is safe

How to test this PR locally

It's easy to trigger this error by removing the Ticker in monitorFileSize and busy loop calling h.updateCurrentSize(), with a configuration that uses close_eof and many files.

Logs

Error updating file size: GetFileInformationByHandleEx [...path...]: The handle is invalid.; File: [...path...]

This fixes a race condition when a harvester is closed at the same time
that its source file size is being calculated.

Before this fix, `Error updating file size` errors are randomly printed, because
the file handle becomes closed inside the os.Stat call.

(cherry picked from commit af6222d)
@adriansr adriansr added [zube]: In Review backport Team:Services (Deprecated) Label for the former Integrations-Services team Team:SIEM labels Sep 10, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 10, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/siem (Team:SIEM)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 10, 2020
@adriansr adriansr requested a review from kvch September 10, 2020 14:40
@elasticmachine
Copy link
Collaborator

💔 Tests Failed

Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #21056 opened]

  • Start Time: 2020-09-10T14:41:40.686+0000

  • Duration: 56 min 5 sec

Test stats 🧪

Test Results
Failed 1
Passed 5539
Skipped 824
Total 6364

Test errors

Expand to view the tests failures

  • Name: Build and Test / Filebeat Mac OS X / test_input_filter_includefields – test_input.Test

    • Age: 1
    • Duration: 10.196
    • Error Details: Timeout waiting for 'cond' to be true. Waited 10 seconds.

Steps errors

Expand to view the steps failures

  • Name: Mage build unitTest

    • Description: mage build unitTest

    • Duration: 8 min 13 sec

    • Start Time: 2020-09-10T15:05:28.498+0000

    • log

  • Name: Recursively delete the current directory from the workspace

    • Description: script returned exit code 1

    • Duration: 0 min 14 sec

    • Start Time: 2020-09-10T15:13:57.505+0000

    • log

Log output

Expand to view the last 100 lines of log output

[2020-09-10T15:36:29.562Z]  Go version:        go1.13.10
[2020-09-10T15:36:29.562Z]  Git commit:        48a66213fe
[2020-09-10T15:36:29.562Z]  Built:             Mon Jun 22 15:45:36 2020
[2020-09-10T15:36:29.562Z]  OS/Arch:           linux/amd64
[2020-09-10T15:36:29.562Z]  Experimental:      false
[2020-09-10T15:36:29.562Z] 
[2020-09-10T15:36:29.562Z] Server: Docker Engine - Community
[2020-09-10T15:36:29.562Z]  Engine:
[2020-09-10T15:36:29.562Z]   Version:          19.03.12
[2020-09-10T15:36:29.562Z]   API version:      1.40 (minimum version 1.12)
[2020-09-10T15:36:29.562Z]   Go version:       go1.13.10
[2020-09-10T15:36:29.562Z]   Git commit:       48a66213fe
[2020-09-10T15:36:29.562Z]   Built:            Mon Jun 22 15:44:07 2020
[2020-09-10T15:36:29.562Z]   OS/Arch:          linux/amd64
[2020-09-10T15:36:29.562Z]   Experimental:     false
[2020-09-10T15:36:29.562Z]  containerd:
[2020-09-10T15:36:29.562Z]   Version:          1.2.13
[2020-09-10T15:36:29.562Z]   GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
[2020-09-10T15:36:29.562Z]  runc:
[2020-09-10T15:36:29.562Z]   Version:          1.0.0-rc10
[2020-09-10T15:36:29.562Z]   GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
[2020-09-10T15:36:29.562Z]  docker-init:
[2020-09-10T15:36:29.562Z]   Version:          0.18.0
[2020-09-10T15:36:29.562Z]   GitCommit:        fec3683
[2020-09-10T15:36:29.562Z] Unable to find image 'alpine:3.4' locally
[2020-09-10T15:36:30.135Z] 3.4: Pulling from library/alpine
[2020-09-10T15:36:30.135Z] c1e54eec4b57: Pulling fs layer
[2020-09-10T15:36:30.397Z] c1e54eec4b57: Download complete
[2020-09-10T15:36:30.662Z] c1e54eec4b57: Pull complete
[2020-09-10T15:36:30.662Z] Digest: sha256:b733d4a32c4da6a00a84df2ca32791bb03df95400243648d8c539e7b4cce329c
[2020-09-10T15:36:30.662Z] Status: Downloaded newer image for alpine:3.4
[2020-09-10T15:36:32.368Z] + python .ci/scripts/pre_archive_test.py
[2020-09-10T15:36:33.312Z] Copy ./filebeat/build into build/filebeat/build
[2020-09-10T15:36:33.338Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/build
[2020-09-10T15:36:33.391Z] Recording test results
[2020-09-10T15:36:34.962Z] Stashed 4 file(s)
[2020-09-10T15:36:34.984Z] Archiving artifacts
[2020-09-10T15:36:35.615Z] + python .ci/scripts/search_system_tests.py
[2020-09-10T15:36:35.644Z] [INFO] system-tests='build/filebeat/build/system-tests'. If no empty then let's create a tarball
[2020-09-10T15:36:36.015Z] + tar --version
[2020-09-10T15:36:36.393Z] + tar --exclude=filebeat--system-tests-linux.tgz -czf filebeat--system-tests-linux.tgz build/filebeat/build/system-tests
[2020-09-10T15:36:41.712Z] Archiving artifacts
[2020-09-10T15:36:57.569Z] [INFO] unstashV2: JOB_GCS_BUCKET is set. bucket param got precedency instead.
[2020-09-10T15:36:57.607Z] [INFO] unstashV2: JOB_GCS_CREDENTIALS is set. credentialsId param got precedency instead.
[2020-09-10T15:36:57.714Z] [Google Cloud Storage Plugin] Found 1 files to download from pattern: gs://beats-ci-temp/Beats/beats/PR-21056-1/source/source.tgz
[2020-09-10T15:36:57.734Z] [Google Cloud Storage Plugin] Downloading: Beats/beats/PR-21056-1/source/source.tgz to local path: /var/lib/jenkins/workspace/Beats_beats_PR-21056/source.tgz
[2020-09-10T15:37:07.707Z] + tar --version
[2020-09-10T15:37:08.034Z] + tar -xpf source.tgz
[2020-09-10T15:37:20.668Z] + rm source.tgz
[2020-09-10T15:37:20.694Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats
[2020-09-10T15:37:20.745Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Lint
[2020-09-10T15:37:20.911Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-x-pack-Mac-OS-X
[2020-09-10T15:37:21.081Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-Mac-OS-X
[2020-09-10T15:37:21.249Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-Windows
[2020-09-10T15:37:21.415Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-x-pack-Windows
[2020-09-10T15:37:21.584Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-x-pack
[2020-09-10T15:37:21.751Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats/Filebeat-oss
[2020-09-10T15:37:22.297Z] + cat
[2020-09-10T15:37:22.297Z] + /usr/local/bin/runbld ./runbld-script --job-name elastic+beats+pull-request
[2020-09-10T15:37:22.297Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-09-10T15:37:30.472Z] runbld>>> runbld started
[2020-09-10T15:37:30.472Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-09-10T15:37:31.862Z] runbld>>> The following profiles matched the job 'elastic+beats+pull-request' in order of occurrence in the config (last value wins).
[2020-09-10T15:37:31.862Z] runbld>>> Matches in the system config:
[2020-09-10T15:37:31.862Z] runbld>>> - Matched ^elastic\+beats
[2020-09-10T15:37:31.862Z] runbld>>> - Matched ^elastic\+beats\+pull-request
[2020-09-10T15:37:33.251Z] runbld>>> Debug logging enabled.
[2020-09-10T15:37:33.251Z] runbld>>> Storing result
[2020-09-10T15:37:33.251Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-09-10T15:37:33.251Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20200910153732-AF131C11
[2020-09-10T15:37:33.251Z] runbld>>> Adding system facts.
[2020-09-10T15:37:34.200Z] runbld>>> Adding vcs info for the latest commit:  ad6223251ea913ce359cd39eb958811b36b9b6aa
[2020-09-10T15:37:34.201Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-09-10T15:37:34.463Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-09-10T15:37:34.463Z] Processing JUnit reports with runbld...
[2020-09-10T15:37:34.463Z] + echo 'Processing JUnit reports with runbld...'
[2020-09-10T15:37:35.037Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-09-10T15:37:35.037Z] runbld>>> DURATION: 23ms
[2020-09-10T15:37:35.037Z] runbld>>> STDOUT: 40 bytes
[2020-09-10T15:37:35.037Z] runbld>>> STDERR: 49 bytes
[2020-09-10T15:37:35.037Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-09-10T15:37:35.037Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-21056
[2020-09-10T15:37:35.983Z] runbld>>> Storing build metadata: 
[2020-09-10T15:37:35.983Z] runbld>>> Adding test report.
[2020-09-10T15:37:35.983Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-21056/src/github.com/elastic/beats
[2020-09-10T15:37:36.557Z] runbld>>> Found 16 test output files
[2020-09-10T15:37:38.482Z] runbld>>> Test output logs contained: Errors: 1 Failures: 0 Tests: 6364 Skipped: 798
[2020-09-10T15:37:38.482Z] runbld>>> Storing result
[2020-09-10T15:37:38.482Z] runbld>>> FAILURES: 1
[2020-09-10T15:37:38.746Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-09-10T15:37:38.746Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1597739501209/t/20200910153732-AF131C11
[2020-09-10T15:37:38.746Z] runbld>>> Email notification disabled by environment variable.
[2020-09-10T15:37:38.746Z] runbld>>> Slack notification disabled by environment variable.
[2020-09-10T15:37:44.300Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-21056
[2020-09-10T15:37:44.493Z] [INFO] getVaultSecret: Getting secrets
[2020-09-10T15:37:44.590Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-09-10T15:37:45.736Z] + chmod 755 generate-build-data.sh
[2020-09-10T15:37:45.736Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21056/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21056/runs/1 FAILURE 3364773
[2020-09-10T15:37:45.737Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21056/runs/1/steps/?limit=10000 -o steps-info.json
[2020-09-10T15:37:46.648Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-21056/runs/1/tests/?status=FAILED -o tests-errors.json

@adriansr adriansr merged commit 31d9b2a into elastic:7.9 Sep 14, 2020
@zube zube bot removed the [zube]: Done label Dec 14, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…c#21056)

This fixes a race condition when a harvester is closed at the same time
that its source file size is being calculated.

Before this fix, `Error updating file size` errors are randomly printed, because
the file handle becomes closed inside the os.Stat call.

(cherry picked from commit 918f17a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Team:Services (Deprecated) Label for the former Integrations-Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants