Fix filebeat elasticsearch module ingest timezone #13367
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
ok to test
I think it doesn't fix the issue for plain text logs :(
@pragkent I was testing this branch with some of the test files we have and I have seen that for this log line:
The timestamp with timezone
But it should be:
I think that the same happens for any log without timestamp. Something we could evaluate is to have different date processors for JSON and plain text files, as there are different pipelines for each of them.
I'm confused about the behavior of the date processor. I checked the source code and found it will use the last pattern that matches the time string instead of the first one. This is the reason why the timezone is messed up. The json log works fine just because timezone is included in the log line. If we remove the |
This can be an option, but then I guess that we should ignore errors in this processor because timestamps parsed in the previous In any case as the data format in JSON and plain text logs are different, what do you think about moving date parsing to the specific pipelines, |
Yes, I think you are right.
I've moved these out to specific pipeline. When the log is in JSON format, timezone is already included, so I deleted Please have a look. |
Thanks!
@@ -294,6 +295,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Update PAN-OS fileset to use the ECS NAT fields. {issue}13320[13320] {pull}13330[13330]
- Add fields to the Zeek DNS fileset for ECS DNS. {issue}13320[13320] {pull}13324[13324]
- Add container image in Kubernetes metadata {pull}13356[13356] {issue}12688[12688]
- Add timezone information to apache error fileset. {issue}12772[12772] {pull}13304[13304]
Should this line be removed from this PR?
Oops, my fault, I did the merge, let me fix it.
Oh I see, I wanted to move the changelog entry 🙂
"formats": [
  "yyyy-MM-dd'T'HH:mm:ss,SSS"
],
"ignore_failure": true
Now that this processor and the next one are both acting on the same field, perhaps we could add an "if": "ctx.event.timezone == null" property to this processor for a bit of extra clarity?
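A minimal sketch of what that suggestion might look like, written as a standalone pipeline body so it is valid JSON. Only the `formats`, `ignore_failure` and `if` values come from this thread; the `field` and `target_field` names are assumptions for illustration and may not match the actual fileset.

```json
{
  "description": "Sketch only: field names are assumed, not taken from the PR",
  "processors": [
    {
      "date": {
        "if": "ctx.event.timezone == null",
        "field": "elasticsearch.server.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss,SSS"
        ],
        "ignore_failure": true
      }
    }
  ]
}
```

With the condition in place, this processor would be skipped whenever `event.timezone` is present, leaving the timezone-aware processor as the only one that writes the timestamp.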
I think it's better not to add "if": "ctx.event.timezone == null", because:
- other modules don't have "if": "ctx.event.timezone == null" added, so we'd better keep this convention
- in most cases, event.timezone was not set. Without "if": "ctx.event.timezone == null", the second processor looks like an optional path instead of an alternative path
I think @ycombinator is right, we could add this "if", as only one of the processors is needed.
> - other modules don't have "if": "ctx.event.timezone == null" added, so we'd better keep this convention
In previous versions both processors were meant to be executed, one to parse the date, and the next one to apply the timezone. We (well, you @pragkent 🙂 ) found this approach is not correct in many cases, so now we are duplicating the date processor: the first one only parses the date, and the second one parses the date with a timezone if available.
> - in most cases, event.timezone was not set. Without "if": "ctx.event.timezone == null", the second processor looks like an optional path instead of an alternative path

Actually we are setting event.timezone by default since ~7.2 in all filebeat events, so in most cases both date processors are executed even if only the last one is needed.
That said, as both @pragkent and I have tested it quite thoroughly, and it solves an existing issue, I'd go ahead and merge this change as is, and have a follow-up PR to review the conditions in the pipelines we have recently changed to fix this same issue. This way we keep a common convention for this.
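For readers following along, the two-processor layout described in this comment can be sketched roughly as below. This is an illustration, not the merged pipeline: the `field` and `target_field` names and the `if` condition on the second processor are assumptions, while the format string and `ignore_failure` come from the diff under review.

```json
{
  "description": "Sketch only: field names and the second processor's condition are assumed",
  "processors": [
    {
      "date": {
        "field": "elasticsearch.server.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss,SSS"
        ],
        "ignore_failure": true
      }
    },
    {
      "date": {
        "if": "ctx.event.timezone != null",
        "field": "elasticsearch.server.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "yyyy-MM-dd'T'HH:mm:ss,SSS"
        ],
        "timezone": "{{ event.timezone }}",
        "ignore_failure": true
      }
    }
  ]
}
```

In this sketch, when `event.timezone` is set both processors run and the second one overwrites `@timestamp`, which is the redundancy the proposed follow-up PR would address by guarding the first processor as well.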
LGTM.
(cherry picked from commit 31e9a1e)
…astic#13494) (cherry picked from commit 871ce17) Co-authored-by: Kent Wang <pragkent@gmail.com>
…astic#13495) (cherry picked from commit 871ce17) Co-authored-by: Kent Wang <pragkent@gmail.com>
This pull request fixes ingest timezone parsing for the elasticsearch module, just as #13308 fixes it for the system module.