
Filebeat 5.0.0-alpha4 losing messages #2041

Closed
tsg opened this issue Jul 15, 2016 · 3 comments

tsg commented Jul 15, 2016

In a test done by @robin13, it seems that filebeat can drop lines when it is stopped with ^C during processing. The test does the following:

  • a setup with filebeat and logstash with minimal configs (see below)
  • start filebeat to read a file with 10k lines
  • in the middle of filebeat processing, kill it with ^C
  • restart filebeat, wait for it to process the whole file
  • look at the output written by LS; sometimes it has fewer than 10k lines

Config files:

filebeat:
  prospectors:
    -
      paths:
        - /home/rclarke/elastic/cases/90209/source.json
      input_type: log

output:
  logstash:
    hosts: ["localhost:17046"]

and LS:

input {
    beats {
        port => 17046
        codec => json
    }
}


output {
    file {
        codec => json_lines
        path => "/home/rclarke/elastic/cases/90209/output.json"
    }
    stdout {
        codec => dots
    }
}

robin13 commented Jul 15, 2016

Attached are more simplified tests with sample source and a bash script to run the test multiple times.

I ran the test against logstash 5.0.0-alpha3, 10 times for each of the following settings, with these results:

  • filebeat 1.2.3 with SIGKILL or SIGHUP
    • 3 runs produced the correct number of lines; the rest all had 6-15% duplicates.
  • filebeat 5.0.0-alpha4 and SIGKILL
    • 7 runs produced the correct number of lines; the rest had 10-70% of events lost.
  • filebeat 5.0.0-alpha4 and SIGHUP
    • 1 run produced the correct number of lines; the rest had 6-20% duplicates.

Surely an orderly shutdown with SIGHUP should allow filebeat to wait for any ACK from logstash (within some reasonable timeout) before stopping the process?
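As an aside (a hedged note, not verified against alpha4): later filebeat 5.x releases added a `filebeat.shutdown_timeout` setting that makes filebeat wait, up to the given duration, for the publisher to finish sending and ACKing in-flight events before exiting. Something like:

```yaml
filebeat:
  # Assumed option from later 5.x releases; availability in alpha4 unverified.
  shutdown_timeout: 5s
```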

test.tar.gz

Note: for the script to work you must start logstash with the -r restart option. The line in the script that appends a comment to the logstash configuration causes logstash to reload the configuration and flush the lines. The Logstash file output should flush everything to file after 5 seconds, but does not seem to be honouring that at the moment...
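To quantify loss versus duplication after a run, one option is to diff the sorted source against the sorted output. This is only a sketch: the file names are assumptions, and it presumes every source line is unique (e.g. it carries a counter):

```shell
#!/bin/bash
# count_lost_and_duplicated SRC OUT
# Prints how many events were lost and how many were duplicated between
# a source file and the Logstash file output. Assumes unique source lines.
count_lost_and_duplicated() {
    local src=$1 out=$2
    # Lines in the source that never made it to the output -> lost events.
    local lost
    lost=$(comm -23 <(sort "$src") <(sort "$out") | wc -l | tr -d ' ')
    # Output lines beyond one copy per unique line -> duplicated events.
    local dupes
    dupes=$(( $(wc -l < "$out") - $(sort -u "$out" | wc -l) ))
    echo "lost=$lost duplicated=$dupes"
}
```

For example, `count_lost_and_duplicated source.json output.json` prints `lost=0 duplicated=0` only when the output matches the source exactly.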

urso commented Jul 15, 2016

Modified test script (it compares file lengths right after filebeat stops):

#!/bin/bash

SIGNAL=SIGTERM

FILEBEAT=../filebeat
ARGS="-path.home ./ -c ./filebeat.yml"
RUN="$FILEBEAT $ARGS"

for i in {1..10}
do
    echo "Test $i"
    # Start from a clean slate: remove the registry and previous output.
    rm -rf ./data/registry
    rm -rf .filebeat
    rm -rf ./output.txt
    $RUN &
    pid=$!
    sleep 1
    echo "Killing filebeat: $pid"
    kill -s $SIGNAL $pid
    wait $pid
    sleep 1

    # Cause logstash to restart (via -r), which flushes all remaining
    # events to the file output.
    echo "#" `date` >> logstash.conf
    sleep 5

    # Compare the registry offset against the actual output length.
    offset=$(jq '.[0].offset' data/registry)
    echo "output lines: $(wc -l < output.txt)"
    echo "input lines (offset=$offset): $(head -c $offset source.txt | wc -l)"

    sleep 5
done

I had to increase the test file by a factor of 5 when testing locally.
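For reference, a quick way to generate a larger source file of unique JSON lines (the line format here is an assumption; the original sample is in test.tar.gz):

```shell
#!/bin/bash
# Generate N unique JSON lines so lost/duplicated events can be counted
# exactly. N=50000 is an assumption (5x a presumed 10k-line sample).
N=${1:-50000}
seq 1 "$N" | awk '{printf "{\"message\": \"line %d\"}\n", $1}' > source.txt
echo "wrote $(wc -l < source.txt) lines to source.txt"
```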

urso commented Jul 18, 2016

The fix has been merged. Closing.

urso closed this as completed Jul 18, 2016