This repository has been archived by the owner on May 25, 2022. It is now read-only.

recombine: do not combine if is_first_entry never matched #415

Closed
sumo-drosiek opened this issue Feb 25, 2022 · 3 comments · Fixed by #416

Comments

@sumo-drosiek
Member

If is_first_entry does not match any log line, the operator recombines the whole file into a single log.
I would expect the operator to flush every line separately until is_first_entry matches for the first time.

configuration:

receivers:
  filelog/containers:
    include:
    - "*.log"
    operators:
    - type: regex_parser
      regex: '^(?P<timestamp>[^\s]+) (?P<stream>\w+) (?P<logtag>\w) (?P<message>.*)'
    - combine_field: message
      is_first_entry: $$body.message matches "^\\d{4}-\\d{1,2}-\\d{1,2}.\\d{2}:\\d{2}:\\d{2}.*"
      type: recombine
    start_at: beginning

logs:

2016-10-06T00:17:09.669794202Z stdout F non-matching line
2016-10-06T00:17:09.669794202Z stdout F another non-matching line
2016-10-06T00:17:09.669794202Z stdout F 2016-10-06T00:17:09.669794202Z matching line
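To make it easier to see which of these lines is_first_entry matches, here is a rough Python sketch that applies the two regular expressions from the configuration above to the sample lines. It only models the patterns themselves; the names LINE, FIRST_ENTRY, parse_message and is_first_entry are made up for illustration and are not part of the collector.

```python
import re

# Pattern copied from the regex_parser operator in the configuration.
LINE = re.compile(
    r"^(?P<timestamp>[^\s]+) (?P<stream>\w+) (?P<logtag>\w) (?P<message>.*)"
)
# Pattern copied from the is_first_entry expression (YAML escaping removed).
FIRST_ENTRY = re.compile(r"^\d{4}-\d{1,2}-\d{1,2}.\d{2}:\d{2}:\d{2}.*")

LOGS = [
    "2016-10-06T00:17:09.669794202Z stdout F non-matching line",
    "2016-10-06T00:17:09.669794202Z stdout F another non-matching line",
    "2016-10-06T00:17:09.669794202Z stdout F 2016-10-06T00:17:09.669794202Z matching line",
]


def parse_message(line):
    """Return the 'message' capture group, or None if the line does not parse."""
    match = LINE.match(line)
    return match.group("message") if match else None


def is_first_entry(message):
    """Hypothetical stand-in for the is_first_entry expression."""
    return FIRST_ENTRY.match(message) is not None


for line in LOGS:
    message = parse_message(line)
    print(is_first_entry(message), repr(message))
```

Only the third line has a message that starts with a timestamp, so it is the only one for which is_first_entry is true.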

output (shortened):

otelcol_1  | LogRecord #0
otelcol_1  | Body: {
otelcol_1  |      -> message: STRING(non-matching line
otelcol_1  | another non-matching line)
otelcol_1  | }
otelcol_1  | 2022-02-25T11:18:37.837Z   DEBUG   loggingexporter/logging_exporter.go:79  ResourceLog #0
otelcol_1  | Body: {
otelcol_1  |      -> message: STRING(2016-10-06T00:17:09.669794202Z matching line)
otelcol_1  | }

I would rather expect "non-matching line" and "another non-matching line" to be emitted as separate logs, as they aren't a continuation of any previous log.
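The difference between the observed and the expected behaviour can be sketched in a few lines of Python. This is a simplified model written for this issue, not the operator's actual implementation; the recombine function and its flush_until_first_match flag are hypothetical names.

```python
import re

# Pattern copied from the is_first_entry expression (YAML escaping removed).
FIRST_ENTRY = re.compile(r"^\d{4}-\d{1,2}-\d{1,2}.\d{2}:\d{2}:\d{2}.*")


def recombine(messages, flush_until_first_match):
    """Simplified model of recombine driven by is_first_entry.

    flush_until_first_match=False models the observed behaviour.
    flush_until_first_match=True models the behaviour requested here:
    lines seen before the first match are emitted one by one.
    """
    out = []
    batch = []
    seen_first = False
    for msg in messages:
        if FIRST_ENTRY.match(msg):
            # A new entry starts: flush whatever was collected so far.
            if batch:
                out.append("\n".join(batch))
            batch = [msg]
            seen_first = True
        elif flush_until_first_match and not seen_first:
            # Nothing to continue yet, so emit the line on its own.
            out.append(msg)
        else:
            batch.append(msg)
    if batch:
        out.append("\n".join(batch))
    return out


messages = [
    "non-matching line",
    "another non-matching line",
    "2016-10-06T00:17:09.669794202Z matching line",
]

observed = recombine(messages, flush_until_first_match=False)
expected = recombine(messages, flush_until_first_match=True)
print(len(observed), len(expected))
```

In this model the observed mode yields two logs (the two non-matching lines joined, then the matching line), while the requested mode yields three. If no line ever matches, the observed mode collapses the whole input into one log.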

sumo-drosiek pushed a commit to SumoLogic/sumologic-kubernetes-collection that referenced this issue Feb 25, 2022
issue: open-telemetry/opentelemetry-log-collection#415
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
@perk-sumo

What we have now actually seems like good behaviour to me 🤔
Why would you want those messages treated as separate logs?

@sumo-drosiek
Member Author

This is an issue in the k8s world, or whenever you don't have full control over customers' applications. If someone adds a file that uses a different logging convention, it will end up as one big log.

I believe is_first_entry should behave as I described.
is_last_entry should stay as it is.

@sumo-drosiek
Member Author

Fluent-bit issue about this behavior: fluent/fluent-bit#2585
