Nested JSON map parsing not working #1409

Closed
rafaelmagu opened this issue Jun 28, 2019 · 10 comments

Comments

@rafaelmagu

Bug Report

Describe the bug
Nested JSON maps in a Kubernetes service's stdout log are not parsed by Fluent Bit 1.2.0.

To Reproduce
I'm using the Helm chart for Fluent Bit. Here is the fluent-bit-config ConfigMap:

Name:         fluent-bit-config
Namespace:    platform
Labels:       app=fluent-bit
              chart=fluent-bit-1.9.0
              heritage=Tiller
              release=fluent-bit
Annotations:  <none>

Data
====
fluent-bit-output.conf:
----

[OUTPUT]
    Name          forward
    Match         *
    Host          <obfuscated>
    Port          24224
    Retry_Limit False
    Shared_Key    <obfuscated>

    Self_Hostname fluent-bit
    tls           On
    tls.verify    Off
fluent-bit-service.conf:
----
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf
    Parsers_File parsers_custom.conf
fluent-bit.conf:
----
@INCLUDE fluent-bit-service.conf
@INCLUDE fluent-bit-input.conf

[FILTER]
    Name lua
    Match kube.*
    script /fluent-bit/etc/helpers.lua
    call process

@INCLUDE fluent-bit-filter.conf

[FILTER]
    Name         modify
    Match        *
    Add          log_source kubernetes
    Add          environment staging
    Remove_regex ^kubernetes\.annotations\..*

@INCLUDE fluent-bit-output.conf
parsers.conf:
----
[PARSER]
    Name        first_line
    Format      regex
    Regex       ^{"log":"(?!\\u0009)(?<log>\S(?:(\\")|[^"]){9}(?:(\\")|[^"])*)"
fluent-bit-filter.conf:
----
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

fluent-bit-input.conf:
----
[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Parser           docker
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On
    Multiline        On
    Multiline_Flush  5
    Parser_Firstline first_line
Events:  <none>
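
For readers unfamiliar with the `first_line` regex in parsers.conf above, here is a rough Python port that shows which raw lines it accepts. This is only a sketch: the sample lines and the helper name `starts_new_record` are made up for illustration, and Python's `re` needs `(?P<log>...)` where the original uses `(?<log>...)`. The pattern requires a `log` value that does not begin with the literal escape `\u0009` (an encoded tab), starts with a non-whitespace character, and is at least ten characters long, counting an escaped quote as one.

```python
import re

# Python port of the first_line regex; only the named-group syntax differs
# ((?P<log>...) instead of (?<log>...)), and the leading brace is escaped.
FIRST_LINE = re.compile(
    r'^\{"log":"(?!\\u0009)(?P<log>\S(?:(\\")|[^"]){9}(?:(\\")|[^"])*)"'
)


def starts_new_record(line):
    """Return True if the raw line matches the first_line pattern."""
    return FIRST_LINE.match(line) is not None


# Hypothetical sample lines, shaped like Docker json-file entries.
first = r'{"log":"{\"status_code\":200,\"method\":\"GET\"}\n","stream":"stdout"}'
continuation = r'{"log":"\u0009at com.example.Handler.run(Handler.java:42)\n","stream":"stderr"}'
short = r'{"log":"ok\n","stream":"stdout"}'

for name, line in [("first", first), ("continuation", continuation), ("short", short)]:
    print(name, starts_new_record(line))
```

Lines whose `log` value is shorter than ten characters are rejected as well, so very short log entries would be folded into the preceding record under this pattern.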

Original log message from container stdout:

{"@timestamp":"2019-06-28T00:01:23.500Z","@version":"1","message":"172.22.10.250 - - [2019-06-28T00:01:23.500Z] \"GET /healthcheck HTTP/1.1\" 200 -1","method":"GET","protocol":"HTTP/1.1","status_code":200,"requested_url":"GET /healthcheck HTTP/1.1","requested_uri":"/healthcheck","remote_host":"172.22.10.250","remote_user":"-","content_length":-1,"elapsed_time":1}

Object as displayed in Kibana:

{\"@timestamp\":\"2019-06-28T00:01:23.500Z\",\"@version\":\"1\",\"message\":\"172.22.10.250 - - [2019-06-28T00:01:23.500Z] \\\"GET /healthcheck HTTP/1.1\\\" 200 -1\",\"method\":\"GET\",\"protocol\":\"HTTP/1.1\",\"status_code\":200,\"requested_url\":\"GET /healthcheck HTTP/1.1\",\"requested_uri\":\"/healthcheck\",\"remote_host\":\"172.22.10.250\",\"remote_user\":\"-\",\"content_length\":-1,\"elapsed_time\":1}

Expected behavior
I expected the nested JSON map to be parsed correctly, and the keys available as indexed fields in Elasticsearch.
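
To make the expected result concrete, here is a minimal Python sketch of the transformation I have in mind. It is not Fluent Bit's implementation: `merge_log` is a hypothetical helper, the record is a shortened version of the one above, and dropping the original `log` key after merging is a choice made for this sketch only.

```python
import json


def merge_log(record, key="log"):
    """Return a copy of record with the JSON map stored under key lifted to the top level.

    If the value is not a string holding a JSON object, the record is returned unchanged.
    """
    raw = record.get(key)
    if not isinstance(raw, str):
        return dict(record)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return dict(record)
    if not isinstance(parsed, dict):
        return dict(record)
    # Keep every other field, then add the keys from the parsed map.
    merged = {k: v for k, v in record.items() if k != key}
    merged.update(parsed)
    return merged


# Shortened record, encoded the way a Docker json-file line would carry it.
inner = {"method": "GET", "status_code": 200, "requested_uri": "/healthcheck"}
docker_line = json.dumps({"log": json.dumps(inner) + "\n", "stream": "stdout"})

record = json.loads(docker_line)
merged = merge_log(record)
print(merged)
```

After the merge, `method`, `status_code` and `requested_uri` sit next to `stream` as separate keys, which is the shape Elasticsearch needs in order to index them as fields.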

Your Environment

  • Version used: 1.2.0
  • Configuration: Fluent Bit collectors running as DaemonSet shipping logs to a remote Fluentd listener/aggregator.
  • Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.12 on AWS EKS
  • Operating System and version: Ubuntu 18.04
  • Filters and plugins: lua, modify, kubernetes

Additional context
This was mostly working with edsiper/fluent-bit-1.2-next:5, except that one field wasn't fully parsed (note the extra backslashes):

172.22.10.250 - - [2019-06-28T00:01:23.500Z] \"GET /healthcheck HTTP/1.1\" 200 -1

This has been described in #1278 (comment)

I'm using a custom Docker image built on top of 1.2.0:

FROM fluent/fluent-bit:1.2.0

ADD JSON.lua /fluent-bit/etc/JSON.lua
ADD helpers.lua /fluent-bit/etc/helpers.lua

The files mentioned are detailed here: #337 (comment)

@rafaelmagu
Author

I tried to roll back to edsiper/fluent-bit-1.2-next:5 and that didn't work. Rolling back to fluent/fluent-bit:1.1.2 works better, although the message key is still not parsed.

@edsiper
Member

edsiper commented Jun 28, 2019 via email

@edsiper
Member

edsiper commented Jul 3, 2019

Closing this ticket. The solution is to use a Decoder in your parser, applied to the message key.
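
As an illustration of what decoding the message key is meant to achieve, the extra backslashes in the field from the original report can be removed with one pass of JSON string unescaping. This is plain Python, not Fluent Bit's decoder, and `unescape_once` is a hypothetical helper name.

```python
import json


def unescape_once(value):
    """Remove one level of JSON string escaping from value.

    The input is returned unchanged if it is not valid JSON string content,
    for example when it contains an unescaped double quote.
    """
    try:
        return json.loads('"' + value + '"')
    except json.JSONDecodeError:
        return value


# The partially parsed field quoted in the report, backslashes included.
message = r'172.22.10.250 - - [2019-06-28T00:01:23.500Z] \"GET /healthcheck HTTP/1.1\" 200 -1'
cleaned = unescape_once(message)
print(cleaned)
```

The fallback matters because a value that has already been unescaped contains bare quotes and must pass through untouched.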

@rafaelmagu
Author

Sorry @edsiper, but this isn't resolved. While I can add the decoder to the json parser to parse the message field, the enclosing map isn't being parsed: the whole log field is left unparsed.

I will retrace my steps again to confirm one more time.

@rafaelmagu
Author

rafaelmagu commented Jul 5, 2019

Re-tried with fluent/fluent-bit:1.2.0 and now the log field of the event isn't being parsed at all:

[Screenshot not preserved]

@servo1x

servo1x commented Jul 30, 2019

Hi @rafaelmagu, is this working better with 1.2.2?

@rafaelmagu
Author

Hi @servo1x, it does not work for me. I have settled on the fluent/fluent-bit:1.1.2 image instead.

@ZhangSIming-blyq

Does this issue have a solution? I'm seeing the same problem.

@rimantoro

Hi,

I have this issue with the 1.3 Docker image. This is from the Fluent Bit container log:

stdout

[0] lucent_svc.local: [1578975835.000000000, {"source"=>"stderr", "log"=>"{"level":"info","ts":1578975835.8050892,"caller":"middleware/echo-middleware.go:222","msg":"success","log":{"error":{"Key":"","Type":26,"Integer":0,"String":"","Interface":null},"host":"0.0.0.0:8098","human_latency":"244.7046ms","id":"WBqZZtNemtoAyeEI6xPIQw96hn7aew4T","latency":244704600,"method":"POST","path":"/api/v1/login","protocol":"HTTP/1.1","query_params":"","remote_ip":"172.18.0.1","request":{"identifier":"xxxxxxx","password":"xxxxxxx"},"response":{"message":"success","payload":{"expires_in":60,"token":"myjwttoken"},"rc":"00","sign":"mysignature"},"status":200,"time":1578975835805046300,"uri":"/api/v1/login","user_agent":"insomnia/7.0.6"}}", "container_id"=>"1fb2bc8dbf7338842b226686937a4663ac509827e52c6ead418844c84b5ecc80", "container_name"=>"/api.lucent.loc"}]

The value for log ("log" => ".....) is an escaped string rather than structured JSON. I'm using Elasticsearch and cannot map it as proper JSON.
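
To show the difference I mean, here is a small Python sketch (not Fluent Bit code; `decode_field` is a hypothetical helper and the record is a trimmed version of the one above). Before decoding, `log` is a single string; after decoding it is a nested map whose inner keys can be addressed individually.

```python
import json


def decode_field(record, key="log"):
    """Return a copy of record where the string at key is replaced by the map it encodes.

    Values that are not strings, or that do not hold a JSON object, are left as they are.
    """
    out = dict(record)
    value = record.get(key)
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            return out
        if isinstance(parsed, dict):
            out[key] = parsed
    return out


# Trimmed record: the application log itself contains a nested "log" map.
app_log = {"level": "info", "msg": "success", "log": {"status": 200, "path": "/api/v1/login"}}
record = {
    "source": "stderr",
    "log": json.dumps(app_log),
    "container_name": "/api.lucent.loc",
}

decoded = decode_field(record)
print(type(record["log"]).__name__, "->", type(decoded["log"]).__name__)
print(decoded["log"]["log"]["path"])
```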

This is my parsers.conf

[PARSER]
    Name        docker
    Format      json
    # Command      |  Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As   json    log

@edsiper
Member

edsiper commented Jan 14, 2020

If you are using Fluent Bit >= 1.2 you have to get rid of the decoder:

https://docs.fluentbit.io/manual/installation/upgrade_notes#fluent-bit-v-1-2
