[chore][pkg/stanza/fileconsumer] Emit logs in batches #36276
Closed
Description
Modifies the File consumer to emit logs in batches as opposed to sending each log individually through the Stanza pipeline and on to the Log Emitter.
This was achieved via the following incremental changes, with the goal of making code reviews easier (multiple smaller changesets):
1. Collect tokens in batches in the `Reader::ReadToEnd` method in File consumer, but still call the `emit` function for each token individually.
2. Change the `emit.Callback` function signature to accept a slice of tokens and emit tokens in batches from the `Reader`. At this point, the batches are still split into individual tokens inside the `emit` function, because the Stanza operators can only process one entry at a time.
3. Add a `ProcessBatch` method to Stanza operators and use it in the `emit` function. At this point, the batch of tokens is translated to a batch of entries and passed to Log Emitter as a whole. The batch is still split in the Log Emitter, which calls `consumeFunc` for each entry in a loop.
4. Call `consumeFunc` on the whole batch of entries.
Note that this is currently a draft, requesting initial feedback. I haven't yet implemented the `ProcessBatch` method for all Stanza operators, as I'd like to first get feedback on its definition. Specifically, should the function accept a `[]entry.Entry` or `[]*entry.Entry`?
Link to tracking issue
Testing
No changes in tests. The goal is for the functionality to not change and for performance to not decrease.
Documentation
These are internal changes, no user documentation needs changing.