-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parser_json: Add stream_buffer_size config param #2381
parser_json: Add stream_buffer_size config param #2381
Conversation
See fluent/fluentd#2381 In addition to documenting the parameter, update the `in_exec` and `out_exec_filter` articles to provide a workaround for users encountering this problem. This note exists in the v0.12 version of the `in_exec` page but was absent in the v1.0 docs, so add the whole note. Closes fluent#624 Signed-off-by: Ben Wheatley <contact@benwh.com>
See fluent/fluentd#2381 In addition to documenting the parameter, update the `in_exec` and `out_exec_filter` articles to provide a workaround for users encountering this problem. This note exists in the v0.12 version of the `in_exec` page but was absent in the v1.0 docs, so add the whole note. Closes fluent#624 Signed-off-by: Ben Wheatley <contact@benwh.com>
@@ -95,6 +95,7 @@ class ExecFilterOutput < Output | |||
COMPAT_PARSE_PARAMS = { | |||
'out_format' => '@type', | |||
'out_keys' => 'keys', | |||
'out_stream_buffer_size' => 'stream_buffer_size', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason that this was added was to shave another 30s off the tests
test/plugin/test_parser_json.rb
Outdated
@@ -111,4 +111,29 @@ def test_parse_with_keep_time_key_without_time_format(data) | |||
assert_equal text, record['time'] | |||
end | |||
end | |||
|
|||
data('yajl' => 'yajl') | |||
def test_yajl_parse_io_with_buffer_smaller_than_input(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering adding a test case for the inverse of this (input smaller than buffer) but didn't do this because it seemed like it would be quite awkward to implement - due to needing to assert on the parser not emitting an event - and probably not provide much value.
Happy to reconsider this if anyone has a suggestion on how this could be done neatly though!
lib/fluent/plugin/parser_json.rb
Outdated
# The Yajl library defines a default buffer size of 8092 when parsing | ||
# from IO streams, so maintain this for backwards-compatibility. | ||
# https://www.rubydoc.info/github/brianmario/yajl-ruby/Yajl%2FParser:parse | ||
config_param :stream_buffer_size, :integer, default: 8092 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add :desc
parameter to explain the parameter. This is only for yajl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use desc
method, not :desc
parameter, it should be put on top of config_param
.
Currently, desc 'Set the buffer size that Yajl will use when parsing streaming input'
is for time_tupe
, not stream_buffer_size
.
See https://docs.fluentd.org/v1.0/articles/api-plugin-base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that was a silly mistake, thanks for the pointer! Fixed.
test/plugin/test_parser_json.rb
Outdated
@@ -111,4 +111,29 @@ def test_parse_with_keep_time_key_without_time_format(data) | |||
assert_equal text, record['time'] | |||
end | |||
end | |||
|
|||
data('yajl' => 'yajl') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use data
for only 1 pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, sorted :)
e6ef491
to
09ae8bc
Compare
Hi @repeatedly, those issues are fixed so could you take another look when you get some time? Thanks! |
09ae8bc
to
9f9a631
Compare
lib/fluent/plugin/parser_json.rb
Outdated
@@ -30,6 +30,12 @@ class JSONParser < Parser | |||
desc 'Set JSON parser' | |||
config_param :json_parser, :enum, list: [:oj, :yajl, :json], default: :oj | |||
|
|||
# The Yajl library defines a default buffer size of 8092 when parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nits] It may be better a default buffer size is 8192 because of brianmario/yajl-ruby@9d4cca0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely not a nit; thanks for picking up on this! I did think that 8092 bytes was a bit of a strange size for a buffer, but didn't look into it 🤦♂️.
Fixed and I'll update the docs PR too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM although I left a few nits comments.
) | ||
text = "100" | ||
|
||
waiting(5) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need? this test suit seems not to block since L134.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's not actually required for the test to work, I put it there to protect against any regression in the code being tested (or the test scaffolding) that might cause the test to fail, because if it does then it will block forever which would be quite annoying when it runs in CI, if nothing else.
9f9a631
to
34567ee
Compare
Allow configuration of the size of the buffer that Yajl uses when parsing streaming input. The advantage of this is that when using `out_exec_filter`, and parsing as JSON, it's now possible to configure this plugin to avoid having to wait for 8192 bytes of data to be parsed before events are emitted. Configuration in the `out_exec_filter` tests has been modified to use this parameter, as it shaves 60 seconds off the test run time. Signed-off-by: Ben Wheatley <contact@benwh.com>
34567ee
to
811390d
Compare
See fluent/fluentd#2381 In addition to documenting the parameter, update the `in_exec` and `out_exec_filter` articles to provide a workaround for users encountering this problem. This note exists in the v0.12 version of the `in_exec` page but was absent in the v1.0 docs, so add the whole note. Closes fluent#624 Signed-off-by: Ben Wheatley <contact@benwh.com>
Thx! |
Which issue(s) this PR fixes:
Fixes #609
What this PR does / why we need it:
Allow configuration of the size of the buffer that Yajl uses when parsing streaming input.
The advantage of this is that when using
out_exec_filter
, and parsingas JSON, it's now possible to configure this plugin to avoid having to
wait for 8192 bytes of data to be parsed before events are emitted.
Docs Changes: fluent/fluentd-docs#631
Release Note:
Use PR title.