Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a Debug reader to detect Null Bytes from the io.Reader #9210

Merged
merged 6 commits into from
Nov 26, 2018

Conversation

ph
Copy link
Contributor

@ph ph commented Nov 22, 2018

When you are reading logs on a network volume like NFS or ZFS it is
possible that the underlying filesystem return null bytes instead of
returning concrete data, its not currently possible to detect that in all
scenario unless events are eventually send to ES and you can inspect
them and see \u0000 chars in the messages.

This is a small proposal to add a Debug Reader which should only by used
for debugging purpose it allow to log if any null bytes are present in
the streams of bytes and will log surround values.

It accepts an io.Reader as the source of data, a buffer size, a
predicate to check the value of a byte and how much detection invokation it should do
before disabling the check.

Enable it with either of the following selectors: "*" or "detect_null_bytes"

@ph ph added discuss Issue needs further discussion. review Filebeat Filebeat libbeat labels Nov 22, 2018
@ph ph assigned urso Nov 22, 2018
@ph ph requested a review from urso November 22, 2018 16:55
@ph
Copy link
Contributor Author

ph commented Nov 22, 2018

@urso This is briefly what we discussed yesterday about null-bytes over network volume or the issue that we are still investigating at https://discuss.elastic.co/t/filbeat-still-ooms/156211/25

It should be pretty low impact even in production, if we detect too much null-bytes the reader will just disable itself.

But in the log at info level we will see that we have matched some bytes + surrounding data.

Questions

Should we log it at info level or should we enforce a special selector?
We could be at risk of exposing data.

@ph ph force-pushed the debug/large-buffer-issue-copy branch from 6377408 to 009e940 Compare November 22, 2018 17:09
@ph
Copy link
Contributor Author

ph commented Nov 22, 2018

Error is legit under windows, looking into it.

filebeat/input/log/config.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
@ph ph force-pushed the debug/large-buffer-issue-copy branch 2 times, most recently from 65ad07f to 452fa40 Compare November 23, 2018 17:27
@ph
Copy link
Contributor Author

ph commented Nov 23, 2018

@urso I think I have addressed all the issues in the PR, the detect_null_bytes can be turned on by using a selector, "*" or "detect_null_bytes".

libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
libbeat/reader/debug/debug.go Outdated Show resolved Hide resolved
@ph
Copy link
Contributor Author

ph commented Nov 23, 2018

@urso updated with the latest comments.

@ph
Copy link
Contributor Author

ph commented Nov 23, 2018

Oh windows, I presume I am not correctly writing the null bytes on windows.

@urso
Copy link

urso commented Nov 23, 2018

LGTM

ph added 6 commits November 26, 2018 10:10
When you are reading logs on a network volume like NFS or ZFS it is
possible that the underlying filesystem return null bytes instead of
returning concrete data, its not currently possible to detect that in all
scenario unless events are eventually send to ES and you can inspect
them and see \u0000 chars in the messages.

This is a small proposal to add a Debug Reader which should only by used
for debugging purpose it allow to log if any null bytes are present in
the streams of bytes and will log surround values.

It accepts an io.Reader as the source of data, a buffer size, a
predicate to check the value of a byte and how much detection invokation it should do
before disabling the check.

Enable it with either of the following selectors: "*" or "detect_null_bytes"
@ph ph force-pushed the debug/large-buffer-issue-copy branch from 3449f3f to 253ff2a Compare November 26, 2018 15:11
@ph
Copy link
Contributor Author

ph commented Nov 26, 2018

I've fixed the windows issue and rebase the content of this PR, when its green I will merge it.
I will backport it to the 6.x branch, I think its a low risk and will improve our debugging options in some circumstances.

@ph ph added the needs_backport PR is waiting to be backported to other branches. label Nov 26, 2018
@ph ph merged commit 96c924a into elastic:master Nov 26, 2018
ph added a commit to ph/beats that referenced this pull request Nov 26, 2018
…c#9210)

When you are reading logs on a network volume like NFS or ZFS it is
possible that the underlying filesystem return null bytes instead of
returning concrete data, its not currently possible to detect that in all
scenario unless events are eventually send to ES and you can inspect
them and see \u0000 chars in the messages.

This is a small proposal to add a Debug Reader which should only by used
for debugging purpose it allow to log if any null bytes are present in
the streams of bytes and will log surround values.

It accepts an io.Reader as the source of data, a buffer size, a
predicate to check the value of a byte and how much detection invokation it should do
before disabling the check.

Enable it with either of the following selectors: "*" or "detect_null_bytes"

(cherry picked from commit 96c924a)
@ph ph added v6.6.0 and removed needs_backport PR is waiting to be backported to other branches. labels Nov 26, 2018
ph added a commit that referenced this pull request Nov 26, 2018
…from the io.Reader (#9236)

* Adding a Debug reader to detect Null Bytes from the io.Reader (#9210)

When you are reading logs on a network volume like NFS or ZFS it is
possible that the underlying filesystem return null bytes instead of
returning concrete data, its not currently possible to detect that in all
scenario unless events are eventually send to ES and you can inspect
them and see \u0000 chars in the messages.

This is a small proposal to add a Debug Reader which should only by used
for debugging purpose it allow to log if any null bytes are present in
the streams of bytes and will log surround values.

It accepts an io.Reader as the source of data, a buffer size, a
predicate to check the value of a byte and how much detection invokation it should do
before disabling the check.

Enable it with either of the following selectors: "*" or "detect_null_bytes"

(cherry picked from commit 96c924a)
@toddferg
Copy link
Contributor

@ph and @urso is there a way to resolve the null byte issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs further discussion. Filebeat Filebeat libbeat review v6.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants