-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce casting between []byte and string in CRI log processing #8424
Conversation
jareksm
commented
Sep 25, 2018
- Reduce casting from []byte to string and []byte again.
- If criflags are set to true, default to CRI log processing.
- Refactor CRI and json-file detection.
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening this, it's looking good 🎉
Do you have any benchmarks? I'm wondering about the performance improvement
l := len(message.Content) | ||
if l > 0 && message.Content[l] == '\n' || message.Content[l] == '\r' { | ||
message.Content = message.Content[:l-1] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this condition is not doing the same thing as TrimRightFunc, as it would not trim \r\n
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on a 100% linux systems with no \r. Is that for windows? In this case the more optimal solution would be to determine if filebeat is running on windows or POSIX system and then trim the line correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was more defensive code, I guess we can live with this and see if the problem is there for other implementations
@@ -144,7 +144,7 @@ func (p *DockerJSONReader) parseDockerJSONLog(message *reader.Message, msg *logL | |||
} | |||
|
|||
func (p *DockerJSONReader) parseLine(message *reader.Message, msg *logLine) error { | |||
if strings.HasPrefix(string(message.Content), "{") { | |||
if !p.criflags || len(message.Content) > 0 && message.Content[0] == '{' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
criflags
is used to enable parsing CRI flags within the CRI format, mainly intended for backwards compatibility, but this can still be CRI with it set to false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption here is, if criflags are enabled, we are dealing with CRI log. Otherwise check the first byte if it is '{' to determine the log format. Although a much better solution would be to detect log format at the start of the file and process it as either json-file or CRI log without doing this test for every line in the log.
As for your initial question, I don't have any proper benchmarks comparing old code to new code. I do have micro benchmarks comparing specific changes. I didn't benchmark []byte to string and then to []byte casting, since I'm dealing with very large volume of multi-part messages, that's 2 extra memory allocations per every entry and it adds up very quickly. It saves on memory and GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to avoid overloading this parameter and use something different to force the expected format. I plan to enable CRI flags parsing in our default manifests, as most users will benefit from them, but it should still work with docker logs.
I suggest we add a new force_format: docker-json | cri
parameter. It can also be done in a following PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot better option then making assumptions. I will update this code in the new few days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@exekias After thinking this through, I've kept the current way of auto-detecting. But I've added force_cri_logs option to force CRI log parsing. Also I've moved new line removal to OS specific files, as such the correct function will be chosen at compile time.
* Refactor CRI and json-file detection
303e346
to
a1cb2cd
Compare
* Move new line removal to OS specific functions. * Add option to force CRI log parsing, otherwise autodetect log type.
* Fixed new line removal * Added unit tests to check Forced CRI flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating this, this is almost ready to go 🎉
Can you please add a CHANGELOG.asciidoc entry?, also add the new parameter to filebeat/docs/inputs/input-docker.asciidoc
jenkins, test this please |
I'll do it in my side, as I want to push this to 6.5 👍 |
Thank you so much for your contribution @jareksm 🎉 🌮 ! Keep them coming 😄 |
…ic#8424) * Reduce casting between []byte and string in CRI log processing * Refactor CRI and json-file detection * Improve handling of new lines in linux/windows CRI logs. * Move new line removal to OS specific functions. * Add option to force CRI log parsing, otherwise autodetect log type. * Fixed unit tests and new line removal * Fixed new line removal * Added unit tests to check Forced CRI flag (cherry picked from commit 3f7d6a6)
* Fix build for all OS after elastic#8424 * Apply PR comments
… CRI log processing (#8718) * Reduce casting between []byte and string in CRI log processing (#8424) * Reduce casting between []byte and string in CRI log processing * Refactor CRI and json-file detection * Improve handling of new lines in linux/windows CRI logs. * Move new line removal to OS specific functions. * Add option to force CRI log parsing, otherwise autodetect log type. * Fixed unit tests and new line removal * Fixed new line removal * Added unit tests to check Forced CRI flag (cherry picked from commit 3f7d6a6) * Add CHANGELOG and docs for CRI force flag (#8699) (cherry picked from commit 06ec3b9) * Fix conflicting paths * Fix build for all OS after #8424 (#8717) * Fix build for all OS after #8424 * Apply PR comments