Consider timezone when calculate timekey #2054
The Unix timestamp of local midnight is not divisible by 86400 in a non-UTC local time such as JST; only the timestamp of UTC midnight is. Therefore the offset from UTC must be taken into account when calculating the timekey in local time.
For example, at `2018-07-04 01:23:23 +0900`, with timekey 86400 and path template `/log/%Y%m%d.log`:
The previous version expands the path template to `/log/20180703.log`.
This version expands the path template to `/log/20180704.log`.
Fix fluent#1986
Signed-off-by: Kenji Okimoto <okimoto@clear-code.com>
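The arithmetic in the example above can be checked with a small sketch. The helper names `utc_timekey` and `local_timekey` are illustrative and not part of Fluentd, and the offset-aware formula is an assumption about how the offset is applied, since the full patch is not shown here.

```ruby
# Illustrative helpers only; these names are not part of Fluentd.
TIMEKEY = 86_400        # one day, in seconds
JST_OFFSET = 9 * 3600   # +09:00, in seconds

# Truncate to the start of the UTC-aligned window (previous behaviour).
def utc_timekey(time_int, timekey)
  time_int - (time_int % timekey)
end

# Truncate to the start of the window aligned to a fixed UTC offset
# (assumed formula for the new behaviour).
def local_timekey(time_int, timekey, offset)
  time_int - ((time_int + offset) % timekey)
end

t = Time.new(2018, 7, 4, 1, 23, 23, "+09:00").to_i
fmt = "/log/%Y%m%d.log"

old_key = utc_timekey(t, TIMEKEY)
new_key = local_timekey(t, TIMEKEY, JST_OFFSET)

puts Time.at(old_key).localtime("+09:00").strftime(fmt)  # /log/20180703.log
puts Time.at(new_key).localtime("+09:00").strftime(fmt)  # /log/20180704.log
```

The UTC-aligned key lands on 2018-07-03 09:00 JST, which still formats as the previous day, while the offset-aware key lands exactly on JST midnight.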
The test failed. Could you check it?
append true
<buffer>
  timekey_use_utc false
  timekey_zone Asia/Tokyo
Could you use a value like "+0900"?
I'm not sure why, but the test fails on Windows.
Just commit the fix :)
lib/fluent/plugin/output.rb
Outdated
if @buffer_config.timekey_use_utc
  (time_int - (time_int % @buffer_config.timekey)).to_i
else
  offset = Time.at(time_int).utc_offset
Shouldn't this use the `timekey_zone` value instead of `utc_offset`?
Maybe yes. How about the following function?
diff --git a/lib/fluent/timezone.rb b/lib/fluent/timezone.rb
index bb6fa868..0e79dd03 100644
--- a/lib/fluent/timezone.rb
+++ b/lib/fluent/timezone.rb
@@ -139,5 +139,17 @@ module Fluent
       return nil
     end
+
+    def self.utc_offset(time, timezone)
+      return 0 if timezone.nil?
+
+      case timezone
+      when NUMERIC_PATTERN
+        Time.zone_offset(timezone)
+      when NAME_PATTERN
+        tz = TZInfo::Timezone.get(timezone)
+        tz.period_for_utc(time).utc_total_offset
+      end
+    end
   end
 end
looks good
lib/fluent/plugin/output.rb
Outdated
@@ -825,6 +822,16 @@ def metadata(tag, time, record)
        end
      end

      def calculate_timekey(time)
        time_int = time.to_i
        if @buffer_config.timekey_use_utc
Accessing an object member is slower than accessing an instance variable.
Setting `@timekey_use_utc = @buffer_config.timekey_use_utc` in `configure` is better.
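A minimal sketch of this suggestion, assuming a simplified output class: `BufferConfig` and `SketchOutput` are hypothetical stand-ins rather than Fluentd's real classes, and only the UTC branch is filled in.

```ruby
# Hypothetical stand-ins for illustration; not Fluentd's real classes.
BufferConfig = Struct.new(:timekey, :timekey_use_utc, :timekey_zone)

class SketchOutput
  def configure(buffer_config)
    @buffer_config = buffer_config
    # Copy the settings used on the hot path into instance variables once.
    @timekey = buffer_config.timekey
    @timekey_use_utc = buffer_config.timekey_use_utc
  end

  # Called for every event, so it reads only instance variables.
  def calculate_timekey(time)
    time_int = time.to_i
    if @timekey_use_utc
      time_int - (time_int % @timekey)
    else
      # The non-UTC branch is left out of this sketch.
      raise NotImplementedError, "local-time branch omitted in this sketch"
    end
  end
end

out = SketchOutput.new
out.configure(BufferConfig.new(86_400, true, nil))
puts out.calculate_timekey(Time.at(1530635003))  # 1530576000
```

The per-event path then avoids a method call on the config object for each setting it reads.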
This can improve performance. Signed-off-by: Kenji Okimoto <okimoto@clear-code.com>
…one` Signed-off-by: Kenji Okimoto <okimoto@clear-code.com>
lib/fluent/plugin/output.rb
Outdated
if @timekey_use_utc
  (time_int - (time_int % @timekey)).to_i
else
  offset = Fluent::Timezone.utc_offset(time, @timekey_zone)
Checking the `@timekey_zone` format every time is slow.
This code is called on each emit, so we need to avoid the extra cost.
Caching the result is better.
Hmm, we should consider DST. A naive cache does not work with DST.
What should we do?
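To illustrate the DST concern, here is a sketch in which the two offsets for a New York-like zone are hard-coded assumptions (-05:00 in winter, -04:00 in summer) rather than looked up from tz data. `day_bucket` is a hypothetical helper, and the formula is the assumed offset-aware truncation.

```ruby
# Hard-coded offsets, assumed for illustration (no tz database lookup).
EST_OFFSET = -5 * 3600  # winter offset, in seconds
EDT_OFFSET = -4 * 3600  # summer offset, in seconds
DAY = 86_400

# Hypothetical helper: compute the daily timekey with the given offset,
# then format it in the offset that actually applies in summer.
def day_bucket(time_int, offset)
  key = time_int - ((time_int + offset) % DAY)
  Time.at(key).localtime("-04:00").strftime("%Y%m%d")
end

# A summer event shortly after local midnight.
event = Time.new(2018, 7, 4, 0, 30, 0, "-04:00").to_i

puts day_bucket(event, EDT_OFFSET)  # 20180704 (offset valid for the event)
puts day_bucket(event, EST_OFFSET)  # 20180703 (stale offset cached in winter)
```

With the stale winter offset, an event in the first hour after local midnight falls into the previous day's bucket, which is why a one-time cache of the offset is not enough for named zones.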
I see.
One way is to generate the `calculate_timekey` method in `configure`.
For a numeric pattern, the offset value can be embedded directly.
For a name pattern, embed the tz data lookup in the method body.
How about this?
@@ -557,6 +557,42 @@ def parse_system(text)
      check_gzipped_result(path, formatted_lines * 3)
    end

    test 'append when JST' do
Could you add more tests with different times and timezones, e.g. `timekey_zone` is `+09:00` and the actual time is `+02:00`?
OK, I will add more tests.
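The requested scenario can be sketched as follows, assuming the timekey depends only on the event's epoch value and the `timekey_zone` offset. `zone_timekey` is a hypothetical helper, not the test that was added to the patch.

```ruby
ZONE_OFFSET = 9 * 3600  # timekey_zone "+09:00", in seconds
DAILY = 86_400

# Assumed formula: align the window boundary to ZONE_OFFSET.
def zone_timekey(time)
  t = time.to_i
  t - ((t + ZONE_OFFSET) % DAILY)
end

# The event carries a +02:00 offset, but only its epoch value matters.
event = Time.new(2018, 7, 3, 20, 0, 0, "+02:00")
key = zone_timekey(event)

puts event.strftime("%Y%m%d")                             # 20180703
puts Time.at(key).localtime("+09:00").strftime("%Y%m%d")  # 20180704
```

The event is still on July 3 in its own +02:00 offset, but it is already 03:00 on July 4 in +09:00, so it belongs to the July 4 bucket.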
For `NUMERIC_PATTERN`, just return a static value. For `NAME_PATTERN`, return a lambda that calculates the offset later. This version is 5-10% faster than the previous version. Signed-off-by: Kenji Okimoto <okimoto@clear-code.com>
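The idea in this commit message could look roughly like the sketch below. It is not the actual Fluentd code: `NUMERIC_ZONE` is a hypothetical pattern that may differ from Fluentd's `NUMERIC_PATTERN`, and `named_lookup` is an injected, hypothetical callable standing in for a tz database lookup.

```ruby
require "time"  # provides Time.zone_offset

# Hypothetical pattern; Fluentd's own NUMERIC_PATTERN may differ.
NUMERIC_ZONE = /\A[+-]\d\d:?\d\d\z/

# Returns an Integer for fixed offsets, or a lambda for named zones.
# `named_lookup` is a hypothetical callable: (zone, time) -> seconds.
def offset_for(timezone, named_lookup)
  return 0 if timezone.nil?

  if NUMERIC_ZONE.match?(timezone)
    Time.zone_offset(timezone)  # fixed, resolved once
  else
    ->(time) { named_lookup.call(timezone, time) }  # resolved per event
  end
end

# Uses the static value directly, or calls the lambda for each event.
def timekey_for(time, timekey, offset)
  time_int = time.to_i
  seconds = offset.respond_to?(:call) ? offset.call(time) : offset
  time_int - ((time_int + seconds) % timekey)
end

fixed = offset_for("+0900", nil)
named = offset_for("Asia/Tokyo", ->(_zone, _time) { 9 * 3600 })
event = Time.at(1530635003)

puts timekey_for(event, 86_400, fixed)  # 1530630000
puts timekey_for(event, 86_400, named)  # 1530630000
```

A fixed offset never changes, so it can be resolved once at configure time, while a named zone is resolved per event so that DST transitions are still respected.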
Thanks!
I released v1.2.4.rc1 for testing this patch.