After upgrade Fluentd complains about some chunk is not in gzip format #1522
Comments
We are also seeing this on 2.0.5 using the helm chart.
Do you have any guesses as to what the reason might be?
@Aaron-ML, @serhatcetinkaya I don't have a specific idea for now, so I want to check whether the problematic files are really created after cleaning up the environment.
Hello @sumo-drosiek, the issue happened again. I did:
fluent@collection-sumologic-1:/$ ls -laR /fluentd/
/fluentd/:
total 8
drwxr-xr-x 1 fluent fluent 20 Apr 13 23:39 .
drwxr-xr-x 1 root root 43 Apr 13 23:39 ..
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 buffer
drwxrwsrwx 3 root fluent 4096 Apr 11 08:47 etc
drwxr-xr-x 2 fluent fluent 6 Jan 6 13:06 log
drwxr-xr-x 2 fluent fluent 6 Jan 6 13:06 plugins
/fluentd/buffer:
total 144
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 .
drwxr-xr-x 1 fluent fluent 20 Apr 13 23:39 ..
drwxrwsr-x 2 fluent fluent 36864 Apr 14 07:04 logs.containers
drwxrwsr-x 2 fluent fluent 24576 Apr 14 07:04 logs.default
drwxrwsr-x 2 fluent fluent 24576 Mar 30 09:07 logs.kubelet
drwxrwsr-x 2 fluent fluent 4096 Apr 14 07:04 logs.systemd
drwxrws--- 2 root fluent 16384 Mar 22 09:12 lost+found
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.apiserver
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.container
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.control_plane
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.controller
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.default
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.kubelet
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.node
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.scheduler
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 metrics.state
/fluentd/buffer/logs.containers:
total 64
drwxrwsr-x 2 fluent fluent 36864 Apr 14 07:04 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
-rw-r--r-- 1 fluent fluent 1852 Apr 14 07:04 buffer.b5bfe95cf7e20838aed9014d09980aba8.log
-rw-r--r-- 1 fluent fluent 219 Apr 14 07:04 buffer.b5bfe95cf7e20838aed9014d09980aba8.log.meta
-rw-r--r-- 1 fluent fluent 537 Apr 14 07:04 buffer.b5bfe95cf7e98edfbbb12d566b002de1a.log
-rw-r--r-- 1 fluent fluent 224 Apr 14 07:04 buffer.b5bfe95cf7e98edfbbb12d566b002de1a.log.meta
-rw-r--r-- 1 fluent fluent 1222 Apr 14 07:04 buffer.q5bfe95cca51f52703231ea1ccef32d66.log
-rw-r--r-- 1 fluent fluent 219 Apr 14 07:04 buffer.q5bfe95cca51f52703231ea1ccef32d66.log.meta
/fluentd/buffer/logs.default:
total 44
drwxrwsr-x 2 fluent fluent 24576 Apr 14 07:04 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
-rw-r--r-- 1 fluent fluent 0 Apr 13 23:39 buffer.b5bfe325e05a7dd74410c7aa98830ed7d.log
-rw-r--r-- 1 fluent fluent 1478 Apr 14 07:04 buffer.b5bfe95cf7e6b122d55dbd64c4eb27e24.log
-rw-r--r-- 1 fluent fluent 108 Apr 14 07:04 buffer.b5bfe95cf7e6b122d55dbd64c4eb27e24.log.meta
-rw-r--r-- 1 fluent fluent 874 Apr 14 07:04 buffer.b5bfe95d06b2b1b66c5ab634826f9cfb4.log
-rw-r--r-- 1 fluent fluent 86 Apr 14 07:04 buffer.b5bfe95d06b2b1b66c5ab634826f9cfb4.log.meta
/fluentd/buffer/logs.kubelet:
total 28
drwxrwsr-x 2 fluent fluent 24576 Mar 30 09:07 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/logs.systemd:
total 8
drwxrwsr-x 2 fluent fluent 4096 Apr 14 07:04 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/lost+found:
total 20
drwxrws--- 2 root fluent 16384 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.apiserver:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.container:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.control_plane:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.controller:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.default:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.kubelet:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.node:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.scheduler:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/buffer/metrics.state:
total 8
drwxrwsr-x 2 fluent fluent 4096 Mar 22 09:12 .
drwxrwsr-x 16 root fluent 4096 Mar 22 09:12 ..
/fluentd/etc:
total 8
drwxrwsrwx 3 root fluent 4096 Apr 11 08:47 .
drwxr-xr-x 1 fluent fluent 20 Apr 13 23:39 ..
drwxr-sr-x 2 root fluent 4096 Apr 11 08:47 ..2021_04_11_08_47_00.046418225
lrwxrwxrwx 1 root root 31 Apr 11 08:47 ..data -> ..2021_04_11_08_47_00.046418225
lrwxrwxrwx 1 root root 25 Apr 11 08:47 buffer.output.conf -> ..data/buffer.output.conf
lrwxrwxrwx 1 root root 18 Apr 11 08:47 common.conf -> ..data/common.conf
lrwxrwxrwx 1 root root 18 Apr 11 08:47 fluent.conf -> ..data/fluent.conf
lrwxrwxrwx 1 root root 16 Apr 11 08:47 logs.conf -> ..data/logs.conf
lrwxrwxrwx 1 root root 44 Apr 11 08:47 logs.enhance.k8s.metadata.filter.conf -> ..data/logs.enhance.k8s.metadata.filter.conf
lrwxrwxrwx 1 root root 43 Apr 11 08:47 logs.kubernetes.metadata.filter.conf -> ..data/logs.kubernetes.metadata.filter.conf
lrwxrwxrwx 1 root root 44 Apr 11 08:47 logs.kubernetes.sumologic.filter.conf -> ..data/logs.kubernetes.sumologic.filter.conf
lrwxrwxrwx 1 root root 23 Apr 11 08:47 logs.output.conf -> ..data/logs.output.conf
lrwxrwxrwx 1 root root 34 Apr 11 08:47 logs.source.containers.conf -> ..data/logs.source.containers.conf
lrwxrwxrwx 1 root root 31 Apr 11 08:47 logs.source.default.conf -> ..data/logs.source.default.conf
lrwxrwxrwx 1 root root 32 Apr 11 08:47 logs.source.sys-logs.conf -> ..data/logs.source.sys-logs.conf
lrwxrwxrwx 1 root root 31 Apr 11 08:47 logs.source.systemd.conf -> ..data/logs.source.systemd.conf
lrwxrwxrwx 1 root root 19 Apr 11 08:47 metrics.conf -> ..data/metrics.conf
lrwxrwxrwx 1 root root 26 Apr 11 08:47 metrics.output.conf -> ..data/metrics.output.conf
/fluentd/etc/..2021_04_11_08_47_00.046418225:
total 68
drwxr-sr-x 2 root fluent 4096 Apr 11 08:47 .
drwxrwsrwx 3 root fluent 4096 Apr 11 08:47 ..
-rw-r--r-- 1 root fluent 215 Apr 11 08:47 buffer.output.conf
-rw-r--r-- 1 root fluent 371 Apr 11 08:47 common.conf
-rw-r--r-- 1 root fluent 61 Apr 11 08:47 fluent.conf
-rw-r--r-- 1 root fluent 202 Apr 11 08:47 logs.conf
-rw-r--r-- 1 root fluent 246 Apr 11 08:47 logs.enhance.k8s.metadata.filter.conf
-rw-r--r-- 1 root fluent 321 Apr 11 08:47 logs.kubernetes.metadata.filter.conf
-rw-r--r-- 1 root fluent 262 Apr 11 08:47 logs.kubernetes.sumologic.filter.conf
-rw-r--r-- 1 root fluent 220 Apr 11 08:47 logs.output.conf
-rw-r--r-- 1 root fluent 2574 Apr 11 08:47 logs.source.containers.conf
-rw-r--r-- 1 root fluent 698 Apr 11 08:47 logs.source.default.conf
-rw-r--r-- 1 root fluent 1004 Apr 11 08:47 logs.source.sys-logs.conf
-rw-r--r-- 1 root fluent 1637 Apr 11 08:47 logs.source.systemd.conf
-rw-r--r-- 1 root fluent 4501 Apr 11 08:47 metrics.conf
-rw-r--r-- 1 root fluent 122 Apr 11 08:47 metrics.output.conf
/fluentd/log:
total 0
drwxr-xr-x 2 fluent fluent 6 Jan 6 13:06 .
drwxr-xr-x 1 fluent fluent 20 Apr 13 23:39 ..
/fluentd/plugins:
total 0
drwxr-xr-x 2 fluent fluent 6 Jan 6 13:06 .
drwxr-xr-x 1 fluent fluent 20 Apr 13 23:39 ..
Also, a similar but more compact output:
fluent@collection-sumologic-1:/$ du -a /fluentd/
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.kubernetes.metadata.filter.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/buffer.output.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.output.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/common.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.enhance.k8s.metadata.filter.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/fluent.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/metrics.output.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.source.sys-logs.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.source.default.conf
8 /fluentd/etc/..2021_04_11_08_47_00.046418225/metrics.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.source.containers.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.kubernetes.sumologic.filter.conf
4 /fluentd/etc/..2021_04_11_08_47_00.046418225/logs.source.systemd.conf
64 /fluentd/etc/..2021_04_11_08_47_00.046418225
0 /fluentd/etc/logs.kubernetes.sumologic.filter.conf
0 /fluentd/etc/logs.source.systemd.conf
0 /fluentd/etc/buffer.output.conf
0 /fluentd/etc/logs.output.conf
0 /fluentd/etc/common.conf
0 /fluentd/etc/logs.conf
0 /fluentd/etc/logs.kubernetes.metadata.filter.conf
0 /fluentd/etc/logs.enhance.k8s.metadata.filter.conf
0 /fluentd/etc/fluent.conf
0 /fluentd/etc/logs.source.default.conf
0 /fluentd/etc/metrics.conf
0 /fluentd/etc/logs.source.containers.conf
0 /fluentd/etc/metrics.output.conf
0 /fluentd/etc/logs.source.sys-logs.conf
0 /fluentd/etc/..data
68 /fluentd/etc
0 /fluentd/log
0 /fluentd/plugins
4 /fluentd/buffer/metrics.node
4 /fluentd/buffer/metrics.container
16 /fluentd/buffer/lost+found
4 /fluentd/buffer/metrics.scheduler
4 /fluentd/buffer/logs.containers/buffer.b5bfe95b99251f986f3f32ac056069812.log.meta
4 /fluentd/buffer/logs.containers/buffer.b5bfe95b99251f986f3f32ac056069812.log
4 /fluentd/buffer/logs.containers/buffer.q5bfe95b6b5ed23c0a2c269fcd13f785d.log
4 /fluentd/buffer/logs.containers/buffer.q5bfe95b6b5ed23c0a2c269fcd13f785d.log.meta
52 /fluentd/buffer/logs.containers
4 /fluentd/buffer/metrics.kubelet
4 /fluentd/buffer/metrics.state
4 /fluentd/buffer/logs.systemd
24 /fluentd/buffer/logs.kubelet
4 /fluentd/buffer/metrics.controller
4 /fluentd/buffer/metrics.apiserver
4 /fluentd/buffer/metrics.control_plane
4 /fluentd/buffer/logs.default/buffer.b5bfe95b678969fcf27c8a8e7a3249509.log.meta
4 /fluentd/buffer/logs.default/buffer.q5bfe95b43968588be4b848764fb81485.log
4 /fluentd/buffer/logs.default/buffer.b5bfe95b678969fcf27c8a8e7a3249509.log
4 /fluentd/buffer/logs.default/buffer.q5bfe95b43968588be4b848764fb81485.log.meta
0 /fluentd/buffer/logs.default/buffer.b5bfe325e05a7dd74410c7aa98830ed7d.log
40 /fluentd/buffer/logs.default
4 /fluentd/buffer/metrics.default
176 /fluentd/buffer
244 /fluentd/
I am also attaching logs from the container if you wish to take a look.
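One detail visible in the listings above: buffer.b5bfe325e05a7dd74410c7aa98830ed7d.log under logs.default is 0 bytes and has no .meta companion. Whether that file is related to the gzip error is not established in this thread. As a rough sketch only, a scan like the following could flag such files for inspection. The function name `suspicious_chunks` is made up for illustration, and the `buffer.*.log` / `.log.meta` naming is taken from the listing.

```python
from pathlib import Path


def suspicious_chunks(buffer_root):
    """Return (path, reason) pairs for chunk files that are empty or lack a .meta file."""
    findings = []
    # "buffer.*.log" matches chunk files only; names ending in ".log.meta" are skipped.
    for chunk in sorted(Path(buffer_root).rglob("buffer.*.log")):
        if chunk.stat().st_size == 0:
            findings.append((chunk, "empty chunk file"))
        if not chunk.with_name(chunk.name + ".meta").exists():
            findings.append((chunk, "missing .meta file"))
    return findings
```

Inside the container, calling `suspicious_chunks("/fluentd/buffer")` and printing the result would list candidates. A flagged file is only a hint, since a chunk that is being written at that moment may legitimately look incomplete.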
@serhatcetinkaya, one more question: how long after the restart did the issue occur?
@sumo-drosiek We have the application deployed in 10 different Kubernetes clusters, and so far there doesn't seem to be any pattern among these issues. Some of them hit the issue hours after a restart, some of them days after.
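For lining up chunk files with restart times, the file names themselves may help. The sketch below rests on two assumptions that are not confirmed anywhere in this thread: that the letter after `buffer.` marks the chunk state (`b` for staged, `q` for queued), and that the upper 64 bits of the 32-character chunk ID hold the creation time in microseconds shifted left by 12 bits. The helper name `chunk_info` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed layout: buffer.<state letter><32 hex chars>.log, optionally followed by .meta
CHUNK_NAME = re.compile(r"^buffer\.([bq])([0-9a-f]{32})\.log(?:\.meta)?$")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_info(filename):
    """Return (state letter, assumed creation time in UTC) for a chunk file name."""
    match = CHUNK_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a chunk file name: {filename}")
    state, chunk_id = match.groups()
    # Assumption: upper 64 bits = (microseconds since epoch << 12) | random bits.
    micros = int(chunk_id[:16], 16) >> 12
    return state, EPOCH + timedelta(microseconds=micros)
```

Under that assumption, `chunk_info("buffer.b5bfe95cf7e20838aed9014d09980aba8.log")` decodes to a time at 07:04 UTC on 2021-04-14, which agrees with the modification time shown for that file in the listing.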
@serhatcetinkaya In the meantime, I will try to reproduce the issue.
Thank you @sumo-drosiek, I am beginning to update the image on our side 👍
We got hit by this as well:
@djsly, @serhatcetinkaya In case you would like to help us with the issue, we prepared a fluentd image based on
We could also move forward with the investigation if you could provide us with the stacktrace of the exception from fluentd, which we understand is not easy due to overflow and file rotation. To disable suppression of stacktraces by fluentd you can add
Note: This can generate a lot of additional logs, which can exceed limits in Sumo and even make the cluster unhealthy. By design, warn logs shouldn't be forwarded to Sumo, but please be aware that this can affect overall collection behavior.
To prevent processing of fluentd logs, please add the following filter to the fluent-bit configuration in
Hi @sumo-drosiek, after making this change we haven't seen the error in any of our environments since April 19. I was going to update the discussion earlier, but I wanted to wait until the end of the month to make sure it really solved the problem. Just to let you know, we are fine now. I will monitor things for a couple more days and update this discussion again. Thanks for the help.
@serhatcetinkaya Thank you for the update. @djsly Just to make sure: did you follow the troubleshooting guide?
I will close this ticket since the issue is resolved for us.
Describe the bug
We recently migrated to the new structure of the Sumo Logic collection in Kubernetes. After the migration, we started seeing an error in fluentd saying that some chunk is not in gzip format. I checked the troubleshooting guide and restarted every pod while deleting and recreating their PVCs. The problem is still there. Can you help us find the cause?
Logs
Command used to install/upgrade Collection
We are non-helm users.
Configuration
To Reproduce
We are not able to reproduce.
Expected behavior
Not to see this error.
Environment (please complete the following information):
helm ls -n sumologic: public.ecr.aws/sumologic/kubernetes-fluentd:1.12.0-sumo-1
kubectl version: v1.18.9-eks-d1db3c
Anything else do we need to know
I ran kubectl exec into the fluentd container and found the following for the problematic files. In this example, buffer.b5bde2b7225ad70f7d1eeb473af979652.log and buffer.b5bde2d0693d05a4d7dd397e388bbac72.log are problematic:
Also, I am not sure if this is a coincidence, but we always see problematic chunks under the /fluentd/buffer/logs.default directory.
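For anyone wanting to repeat this kind of check, here is a minimal sketch. It assumes the buffer in question is configured for gzip compression, so that every non-empty chunk should begin with the gzip magic bytes 0x1f 0x8b. If compression is not enabled for a buffer, every chunk in it would be flagged. The function name `non_gzip_chunks` is made up for illustration.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def non_gzip_chunks(buffer_dir):
    """Return non-empty chunk files whose first two bytes are not the gzip magic number."""
    bad = []
    # "buffer.*.log" matches chunk files only; ".log.meta" files are skipped.
    for chunk in sorted(Path(buffer_dir).glob("buffer.*.log")):
        with chunk.open("rb") as handle:
            head = handle.read(2)
        if head and head != GZIP_MAGIC:
            bad.append(chunk)
    return bad
```

Running `non_gzip_chunks("/fluentd/buffer/logs.default")` inside the container would list candidate files. This only inspects the header, so a chunk that starts correctly but is damaged further in would not be caught.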