Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filestream sometimes does not update registry offset when file is truncated #38070

Closed
belimawr opened this issue Feb 20, 2024 · 3 comments · Fixed by #38416
Closed

Filestream sometimes does not update registry offset when file is truncated #38070

belimawr opened this issue Feb 20, 2024 · 3 comments · Fixed by #38416
Assignees
Labels
bug Team:Elastic-Agent Label for the Agent team

Comments

@belimawr
Copy link
Contributor

belimawr commented Feb 20, 2024

Description

In some situations when a file is truncated Filebeat will detect the truncation, close the reader but it will not update the registry.

This happens in two situations:

  1. If Filebeat is stopped, the file is truncated then when Filebeat starts up the offset is not reset to zero and it keeps increasing with the new data read. This causes Filebeat to always detect the file as truncated on every restart and to re-ingest all data.
  2. If the file is truncated while Filebeat is running and Filebeat is stopped after the reader is closed but before a new harvester/prospector are started.

Steps to reproduce case 1

  1. Create a log file with some content, let's say 100 lines

  2. Start Filebeat with the following configuration

    filebeat.yml

    filebeat.inputs:
      - type: filestream
        id: filestream-input-id
        enabled: true
        paths:
          - /tmp/test.log
    
    output:
      file:
        enabled: true
        codec.json:
          pretty: false
        path: ${path.home}
        filename: "output"
        rotate_on_startup: false
    
    queue.mem:
      flush:
        timeout: 1s
        min_events: 32
    
    filebeat.registry.flush: 1s
    
    logging:
      level: debug
      selectors:
        - crawler
        - file_watcher
        - input
        - input.filestream
        - input.harvester
        - registrar
        - scanner
        - service
      metrics:
        enabled: false

  3. Look at the offset in the registry and TTL. The offset should match the file size and the TTL should be 1800000000000.

  4. Wait until the whole file is ingested

  5. Stop Filebeat

  6. Truncate the file

  7. Add some data, let's say 5 lines. It should result in a smaller file than the original

  8. Restart Filebeat

  9. Assert:

    1. The new content is ingested
    2. The offset in the registry is larger than the file size.

Logs

Logs from the second run where we can see the file truncation being detected

{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/v2/compat.(*runner).Start.func1","file.name":"compat/compat.go","file.line":121},"message":"Input 'filestream' starting","service.name":"filebeat","id":"filestream-input-id","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"metric_registry","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/monitoring/inputmon.NewInputRegistry","file.name":"inputmon/input.go","file.line":63},"message":"registering","service.name":"filebeat","input_type":"filestream","id":"filestream-input-id","key":"filestream-input-id","uuid":"4175231d-a5c3-451b-b054-6b316ef1fba5","ecs.version":"1.6.0"}                                                                                                                                                                                                                                                 
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).Run","file.name":"filestream/prospector.go","file.line":133},"message":"Starting prospector","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":120},"message":"Start next scan","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"file_watcher","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).watch","file.name":"filestream/fswatch.go","file.line":224},"message":"File scan complete","service.name":"filebeat","total":1,"written":0,"truncated":0,"renamed":0,"removed":0,"created":1,"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":179},"message":"A new file /tmp/test.log has been found","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"create","source_name":"native::65140-34","os_id":"65140-34","new_path":"/tmp/test.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile","file.name":"filestream/input.go","file.line":275},"message":"File was truncated. Reading file from offset 0. Path=/tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).open","file.name":"filestream/input.go","file.line":185},"message":"newLogFileReader with config.MaxBytes:10485760","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T16:59:02.093+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).Read","file.name":"filestream/filestream.go","file.line":131},"message":"End of file reached: /tmp/test.log; Backoff now.","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::65140-34","path":"/tmp/test.log","state-id":"native::65140-34","ecs.version":"1.6.0"}

File size before truncation and adding new data

ls -li /tmp/test.log 
65140 -rw-r--r-- 1 tiago tiago 1997 Feb 20 16:58 /tmp/test.log

Final file size

ls -li /tmp/test.log 
65140 -rw-r--r-- 1 tiago tiago 331 Feb 20 16:58 /tmp/test.log

Last 3 entries from the registry

filestream::filestream-input-id::native::65140-34	1861	1800000000000
filestream::filestream-input-id::native::65140-34	1861	1800000000000
filestream::filestream-input-id::native::65140-34	2192	1800000000000

Expected behaviour

The registry is correctly updated to the actual file size/offset read once file truncation is detected.

Steps to reproduce case 2

This case can use the same configuration as case 1.

  1. Create a log file with some content, keep adding content regularly, let's say every second.
  2. Start Filebeat. You can set prospector.scanner.check_interval: 30s to have more time between truncating the file and Filebeat picking it up again
  3. Truncate the file. truncate -s 0 /tmp/test.log on Linux
  4. Wait for the following log messages
    {"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*logFile).handleEOF","file.name":"filestream/filestream.go","file.line":259},"message":"File was truncated as offset (773) > size (34): /tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).readFromSource","file.name":"filestream/input.go","file.line":334},"message":"File was truncated, nothing to read. Path='/tmp/test.log'","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
    {"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1","file.name":"input-logfile/harvester.go","file.line":247},"message":"Stopped harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","ecs.version":"1.6.0"}
    {"log.level":"debug","@timestamp":"2024-02-20T19:40:17.703+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).Run.func1","file.name":"filestream/input.go","file.line":151},"message":"Closing reader of filestream","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}
    
  5. Stop Filebeat
  6. Assert that the registry was not updated with an offset of 0
  7. Make sure some data has been added to the file since the truncation
  8. Restart Filebeat
  9. Assert:
    1. Filebeat detects the truncation
    2. The new content is read and sent to the output
    3. The offset in the registry is bigger than the file size.
    4. Adding more data to the file will make the offset in the registry increase.

Logs

After restarting Filebeat it detects the truncation

{"log.level":"debug","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).onFSEvent","file.name":"filestream/prospector.go","file.line":179},"message":"A new file /tmp/test.log has been found","service.name":"filebeat","id":"filestream-input-id","prospector":"file_prospector","operation":"create","source_name":"native::69162-34","os_id":"69162-34","new_path":"/tmp/test.log","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*defaultHarvesterGroup).Start","file.name":"input-logfile/harvester.go","file.line":139},"message":"Starting harvester for file","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-02-20T19:43:06.309+0100","log.logger":"input.filestream","log.origin":{"function":"github.com/elastic/beats/v7/filebeat/input/filestream.(*filestream).openFile","file.name":"filestream/input.go","file.line":275},"message":"File was truncated. Reading file from offset 0. Path=/tmp/test.log","service.name":"filebeat","id":"filestream-input-id","source_file":"filestream::filestream-input-id::native::69162-34","path":"/tmp/test.log","state-id":"native::69162-34","ecs.version":"1.6.0"}

ls -li shows the moment the file was truncated, you can see the size going down and then starting to grow again

69162 -rw-r--r-- 1 tiago tiago 637 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 671 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 705 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 739 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 773 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 807 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 34 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 68 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 102 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 136 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 170 Feb 20 19:40 /tmp/test.log
69162 -rw-r--r-- 1 tiago tiago 204 Feb 20 19:40 /tmp/test.log

The last few entries in the registry do not have an offset of 0 nor it ever decreases. The registry is read using tail -F data/registry/filebeat/log.json|jq -r '[.k, .v.cursor.offset, .v.ttl] | @tsv'

filestream::filestream-input-id::native::69162-34       569     1800000000000
filestream::filestream-input-id::native::69162-34       637     1800000000000
filestream::filestream-input-id::native::69162-34       705     1800000000000
filestream::filestream-input-id::native::69162-34       773     1800000000000
filestream::filestream-input-id::native::69162-34       773     1800000000000        
filestream::filestream-input-id::native::69162-34       1147    1800000000000

Expected behaviour

The registry is correctly updated to the actual file size/offset read once file truncation is detected.

@belimawr belimawr added bug Team:Elastic-Agent Label for the Agent team labels Feb 20, 2024
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@belimawr belimawr changed the title Filestream does not update registry offset when file is truncated while Filebeat is offline Filestream does not update registry offset when file is truncated Feb 20, 2024
@belimawr belimawr changed the title Filestream does not update registry offset when file is truncated Filestream sometimes does not update registry offset when file is truncated Feb 20, 2024
@belimawr belimawr self-assigned this Feb 22, 2024
@belimawr
Copy link
Contributor Author

Cross posting a possible wokaround: #36541 (comment)

@redbaron4
Copy link

Another workaround is to temporarily switch to deprecated (but otherwise fine) log input because it does not suffer from this problem as far as I can tell.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants