-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make week number parsing ISO8601 compliant for weekstart=1 #81
Conversation
To least surprise any users that rely on week parsing, we should probably modify Taskwarrior to pass its Anyone that wants Monday weeks will probably prefer the new behavior. I would have favored just always using weekstart = 1, but wanted to leave a path available for people that want Sunday weeks, so I made it work the old way if weekstart was set to 0. For example, GothenburgBitFactory/timewarrior#23 shows that at least some people want this. |
Also, once this is reviewed, I can make patches that update documentation for this (maybe change in behavior should go in NEWS) and change the libshared submodule peg in both projects, if desired. |
er, I had tested this with |
it looks like Timewarrior has duplicates of some functions from libshared, i.e. its @lauft do you know why Timewarrior has its own copy despite already using libshared as a submodule? |
@smemsh I will have to look into the code again for the exact details, but I think the reason was that I needed a dedicated function to parse date ranges. Another idea was to transfer all parsing from |
@lauft ok, well my changes update only:
Of these, I compared with a script and see only the following differences:
It looks like the second one is not substantive. The first one, looks like it came from 7dffb13, sole change, commit message:
Your original commit which brought these in was GothenburgBitFactory/timewarrior@63f23001. So if all these functions are the same, it would be a shame to duplicate it, but I can do that until such time as either of these things you mention comes to fruition (it's already duplicated, so it wouldn't be a regression per se). I can't really implement either of your ideas myself unfortunately, as I am only familiar with C, not C++ and it's a bit beyond what I could do easily. Those issues aside, what do you think of my proposed changes? |
@smemsh I think it is good to support ISO weeks properly, also with respect to GothenburgBitFactory/timewarrior#23 (as you pointed out). Could your changes also cover weirder cases like |
No, my PR is limited in scope. It fixes only week number parsing, and expects weekstart to be either 0 (old behavior), or 1 (new ISO8601-compliant behavior). It does not change I left the old algorithm in place if either Taskwarrior or Timewarrior wants to pass in Accommodating GothenburgBitFactory/timewarrior#23 would require more than just passing in weekstart, it would require changing all the relative week shortcuts, which currently are Monday-based. As I already use Monday myself, that's beyond what I wanted to attempt. I will duplicate my |
addcf34
to
67482e9
Compare
Whilst working on a week parsing issue, while reviewing the Zeller's Congruence implementation that determines day of week, some questions came up, such as: - in all the literature I could find, the algorithms were using a term like "(13 * (m + 1)) / 5", not "(13 * m - 1) / 5". This made me confused (see for example https://en.wikipedia.org/wiki/Zeller%27s_congruence) - the return for weekday number in the algorithm uses 0 = Saturday, which does not appear to be corrected for (libshared code presumes 0 = Sunday). Rather than figure out the correctness myself, this patch replaces the algorithm with one specified in RFC3339, which should be canonical, and which is 0-Sunday based. It passed all the same tests. Probably more tests should be added.
…== 1 This patch makes the parsing of week numbers in dates ISO-8601 compliant in the case that Datetime::weekstart == 1, while the existing behavior remains available if Datetime::weekstart == 0. The previous code parsed week numbers (given as "yyyy-Www") into dates starting on Sunday. Although the code had a "Datetime::weekstart" variable, and this value was initialized to 1 (which is Monday) in libshared, nonetheless week specifications would be parsed into calendar days starting on Sunday. Furthermore, week days in such given weeks ('d' in "yyyy-Www-d") used 0-6 for Sunday-Monday, while ISO8601 specifies 1-7 Monday-Sunday. Therefore, yyyy-Www-0 was treated as valid (and should not be), while yyyy-Www-7 was invalid (and should be valid). Note that neither Taskwarrior nor Timewarrior ever set the value of Datetime::weekstart. Taskwarrior has an rc.weekstart, but this is only used for "task calendar" output, not for parsing dates. The patch does the following: - Initialize "Datetime" instances with a weekday value from Datetime::weekstart. This helps the case when weekday is not supplied, it won't default to zero and fail validation (when weekstart is '1'). Note that mktime() is usually used in the code to convert populated "struct tm" broken-down times into epoch-times, and this operation does not use tm.tm_wday for input, only as an output field, recomputed as a normalized value, so it appears to be safe to initialize it with a 1 (which we might wonder about since .tm_wday is supposed to be 0-6 Sunday based). - Use the already-existing Datetime::parse_weekday() to parse the 'ww' in "yyyy-Www" input dates (the function was not used by any caller previously; guessing it may have been intended for eventual use in order to respect weekstart(?)) - Teach parse_weekday() about weekstart. Previously this assumed 1-7, which is the reason I'm guessing this was intended to be used for ISO weeks eventually. Now it can also use 0-6 if weekstart 0. - Teach Datetime::validate to respect Datetime::weekstart also. Previously only 0-6 Sunday-Monday was considered valid. - Default the weekday to Datetime::weekstart if not supplied, ie for "yyyy-Www-X" if the "-X" is not supplied, as recognized when Datetime::standaloneDateEnabled = true, which is the case for (1) tests, (2) timewarrior, but NOT for taskwarrior at this time (both the standalone 'calc' and 'task calc' (ie Context.cpp) set this to false). - Implement the complete ISO week calculation algorithm in Datetime::resolve() if weekstart is '1', and keeps the existing algorithm if weekstart is '0'. This will allow Taskwarrior and Timewarrior to offer the option of the old behavior for those who want to use Sunday-based weeks and ignore ISO week calculations (such as 2023-01-01 being in ISO week 2022-W52).
We make a loop over all the testParse() calls, running them twice: first with weekstart 0, then with weekstart 1. We have to offset by one day the time_t variables given for comparison (last parameter to testParse()), corresponding to the first day in week, for weeks given without a day offset. So local1, utc1, f_local1, f_utc1 are tweaked by one day for the second loop (weekstart 1). Likewise, absolute dates like 2013-12, which test against these offset values, need to be de-offset depending on which loop iteration so there is an inline adjustment to the time_t value passed for those specific 4 test calls (2013-12 and 9850-12, with and without the '-'). Patch is best viewed with "diff -w".
Since ISO8601 defines weeks as the first one with 4 days in it, and they are always whole weeks, during certain years they differ from their Gregorian equivalents at year-boundaries. We add tests for some of these years, when weekstart = 1 (ISO8601 algorithm in effect), to ensure proper resolution.
67482e9
to
d43bedc
Compare
…== 1 This is a duplicate of the changes from commit with same subject line in GothenburgBitFactory/libshared#81 and also duplicates the commit message below. Timewarrior has copies of many functions from libshared's Datetime class in its own DatetimeParser class, and until such time as these classes are integrated, we have to maintain copies of these functions here, and the changes need to be duplicated. See discussion in aforementioned PR. The one difference with the patch over there is, this one is using the public Datetime::dayOfWeek rather than its own implementation. The copy in Timewarrior already was doing this before, but it's worth noting it's the only difference with the corresponding patch in libshared PR 81. * * * This patch makes the parsing of week numbers in dates ISO-8601 compliant in the case that Datetime::weekstart == 1, while the existing behavior remains available if Datetime::weekstart == 0. The previous code parsed week numbers (given as "yyyy-Www") into dates starting on Sunday. Although the code had a "Datetime::weekstart" variable, and this value was initialized to 1 (which is Monday) in libshared, nonetheless week specifications would be parsed into calendar days starting on Sunday. Furthermore, week days in such given weeks ('d' in "yyyy-Www-d") used 0-6 for Sunday-Monday, while ISO8601 specifies 1-7 Monday-Sunday. Therefore, yyyy-Www-0 was treated as valid (and should not be), while yyyy-Www-7 was invalid (and should be valid). Note that neither Taskwarrior nor Timewarrior ever set the value of Datetime::weekstart. Taskwarrior has an rc.weekstart, but this is only used for "task calendar" output, not for parsing dates. The patch does the following: - Initialize "Datetime" instances with a weekday value from Datetime::weekstart. This helps the case when weekday is not supplied, it won't default to zero and fail validation (when weekstart is '1'). Note that mktime() is usually used in the code to convert populated "struct tm" broken-down times into epoch-times, and this operation does not use tm.tm_wday for input, only as an output field, recomputed as a normalized value, so it appears to be safe to initialize it with a 1 (which we might wonder about since .tm_wday is supposed to be 0-6 Sunday based). - Use the already-existing Datetime::parse_weekday() to parse the 'ww' in "yyyy-Www" input dates (the function was not used by any caller previously; guessing it may have been intended for eventual use in order to respect weekstart(?)) - Teach parse_weekday() about weekstart. Previously this assumed 1-7, which is the reason I'm guessing this was intended to be used for ISO weeks eventually. Now it can also use 0-6 if weekstart 0. - Teach Datetime::validate to respect Datetime::weekstart also. Previously only 0-6 Sunday-Monday was considered valid. - Default the weekday to Datetime::weekstart if not supplied, ie for "yyyy-Www-X" if the "-X" is not supplied, as recognized when Datetime::standaloneDateEnabled = true, which is the case for (1) tests, (2) timewarrior, but NOT for taskwarrior at this time (both the standalone 'calc' and 'task calc' (ie Context.cpp) set this to false). - Implement the complete ISO week calculation algorithm in Datetime::resolve() if weekstart is '1', and keeps the existing algorithm if weekstart is '0'. This will allow Taskwarrior and Timewarrior to offer the option of the old behavior for those who want to use Sunday-based weeks and ignore ISO week calculations (such as 2023-01-01 being in ISO week 2022-W52). Signed-off-by: Scott Mcdermott <scott@smemsh.net>
…== 1 This is a duplicate of the changes from commit with same subject line in GothenburgBitFactory/libshared#81 and also duplicates the commit message below. Timewarrior has copies of many functions from libshared's Datetime class in its own DatetimeParser class, and until such time as these classes are integrated, we have to maintain copies of these functions here, and the changes need to be duplicated. See discussion in aforementioned PR. The one difference with the patch over there is, this one is using the public Datetime::dayOfWeek() and Datetime::daysInYear() methods from libshared, rather than its own implementation. The copy in Timewarrior already was doing this before, but it's worth noting it's the only difference with the corresponding patch in libshared PR 81, and only amounts to a change in the namespace qualifier. Copied commit message from libshared follows. * * * This patch makes the parsing of week numbers in dates ISO-8601 compliant in the case that Datetime::weekstart == 1, while the existing behavior remains available if Datetime::weekstart == 0. The previous code parsed week numbers (given as "yyyy-Www") into dates starting on Sunday. Although the code had a "Datetime::weekstart" variable, and this value was initialized to 1 (which is Monday) in libshared, nonetheless week specifications would be parsed into calendar days starting on Sunday. Furthermore, week days in such given weeks ('d' in "yyyy-Www-d") used 0-6 for Sunday-Monday, while ISO8601 specifies 1-7 Monday-Sunday. Therefore, yyyy-Www-0 was treated as valid (and should not be), while yyyy-Www-7 was invalid (and should be valid). Note that neither Taskwarrior nor Timewarrior ever set the value of Datetime::weekstart. Taskwarrior has an rc.weekstart, but this is only used for "task calendar" output, not for parsing dates. The patch does the following: - Initialize "Datetime" instances with a weekday value from Datetime::weekstart. This helps the case when weekday is not supplied, it won't default to zero and fail validation (when weekstart is '1'). Note that mktime() is usually used in the code to convert populated "struct tm" broken-down times into epoch-times, and this operation does not use tm.tm_wday for input, only as an output field, recomputed as a normalized value, so it appears to be safe to initialize it with a 1 (which we might wonder about since .tm_wday is supposed to be 0-6 Sunday based). - Use the already-existing Datetime::parse_weekday() to parse the 'ww' in "yyyy-Www" input dates (the function was not used by any caller previously; guessing it may have been intended for eventual use in order to respect weekstart(?)) - Teach parse_weekday() about weekstart. Previously this assumed 1-7, which is the reason I'm guessing this was intended to be used for ISO weeks eventually. Now it can also use 0-6 if weekstart 0. - Teach Datetime::validate to respect Datetime::weekstart also. Previously only 0-6 Sunday-Monday was considered valid. - Default the weekday to Datetime::weekstart if not supplied, ie for "yyyy-Www-X" if the "-X" is not supplied, as recognized when Datetime::standaloneDateEnabled = true, which is the case for (1) tests, (2) timewarrior, but NOT for taskwarrior at this time (both the standalone 'calc' and 'task calc' (ie Context.cpp) set this to false). - Implement the complete ISO week calculation algorithm in Datetime::resolve() if weekstart is '1', and keeps the existing algorithm if weekstart is '0'. This will allow Taskwarrior and Timewarrior to offer the option of the old behavior for those who want to use Sunday-based weeks and ignore ISO week calculations (such as 2023-01-01 being in ISO week 2022-W52). Signed-off-by: Scott Mcdermott <scott@smemsh.net>
@lauft this is now implemented in GothenburgBitFactory/timewarrior#633. I would like to get this one merged first please, so I can update the libshared peg over there in that PR before it gets merged. In addition to keeping code parity between the duplicated methods, the timewarrior code uses this library's Once these are done, I will go over to Taskwarrior and see what they want to do wrt maybe passing their Furthermore I will commit to monitoring both this repo and timewarrior for changes to the duplicated functions I touched, and keep them in sync until they get merged following your earlier comments. I have a script developed to automate comparison, and am subscribed to all updates in both repos. Please apply, thanks. |
@smemsh Merged, thanks a lot for your contribution. 👍🏻 |
Brings in new dayOfWeek() code from GothenburgBitFactory/libshared#81 Signed-off-by: Scott Mcdermott <scott@smemsh.net>
…== 1 This is a duplicate of the changes from commit with same subject line in GothenburgBitFactory/libshared#81 and also duplicates the commit message below. Timewarrior has copies of many functions from libshared's Datetime class in its own DatetimeParser class, and until such time as these classes are integrated, we have to maintain copies of these functions here, and the changes need to be duplicated. See discussion in aforementioned PR. The one difference with the patch over there is, this one is using the public Datetime::dayOfWeek() and Datetime::daysInYear() methods from libshared, rather than its own implementation. The copy in Timewarrior already was doing this before, but it's worth noting it's the only difference with the corresponding patch in libshared PR 81, and only amounts to a change in the namespace qualifier. Copied commit message from libshared follows. * * * This patch makes the parsing of week numbers in dates ISO-8601 compliant in the case that Datetime::weekstart == 1, while the existing behavior remains available if Datetime::weekstart == 0. The previous code parsed week numbers (given as "yyyy-Www") into dates starting on Sunday. Although the code had a "Datetime::weekstart" variable, and this value was initialized to 1 (which is Monday) in libshared, nonetheless week specifications would be parsed into calendar days starting on Sunday. Furthermore, week days in such given weeks ('d' in "yyyy-Www-d") used 0-6 for Sunday-Monday, while ISO8601 specifies 1-7 Monday-Sunday. Therefore, yyyy-Www-0 was treated as valid (and should not be), while yyyy-Www-7 was invalid (and should be valid). Note that neither Taskwarrior nor Timewarrior ever set the value of Datetime::weekstart. Taskwarrior has an rc.weekstart, but this is only used for "task calendar" output, not for parsing dates. The patch does the following: - Initialize "Datetime" instances with a weekday value from Datetime::weekstart. This helps the case when weekday is not supplied, it won't default to zero and fail validation (when weekstart is '1'). Note that mktime() is usually used in the code to convert populated "struct tm" broken-down times into epoch-times, and this operation does not use tm.tm_wday for input, only as an output field, recomputed as a normalized value, so it appears to be safe to initialize it with a 1 (which we might wonder about since .tm_wday is supposed to be 0-6 Sunday based). - Use the already-existing Datetime::parse_weekday() to parse the 'ww' in "yyyy-Www" input dates (the function was not used by any caller previously; guessing it may have been intended for eventual use in order to respect weekstart(?)) - Teach parse_weekday() about weekstart. Previously this assumed 1-7, which is the reason I'm guessing this was intended to be used for ISO weeks eventually. Now it can also use 0-6 if weekstart 0. - Teach Datetime::validate to respect Datetime::weekstart also. Previously only 0-6 Sunday-Monday was considered valid. - Default the weekday to Datetime::weekstart if not supplied, ie for "yyyy-Www-X" if the "-X" is not supplied, as recognized when Datetime::standaloneDateEnabled = true, which is the case for (1) tests, (2) timewarrior, but NOT for taskwarrior at this time (both the standalone 'calc' and 'task calc' (ie Context.cpp) set this to false). - Implement the complete ISO week calculation algorithm in Datetime::resolve() if weekstart is '1', and keeps the existing algorithm if weekstart is '0'. This will allow Taskwarrior and Timewarrior to offer the option of the old behavior for those who want to use Sunday-based weeks and ignore ISO week calculations (such as 2023-01-01 being in ISO week 2022-W52). Signed-off-by: Scott Mcdermott <scott@smemsh.net>
Brings in new dayOfWeek() code from GothenburgBitFactory/libshared#81 Signed-off-by: Scott Mcdermott <scott@smemsh.net>
This PR implements the ISO8601 algorithm for
Datetime::weekstart == 1
while retaining existing behavior forDatetime::weekstart == 0
.Existing Behavior
There are some issues with the existing week number parsing:
2024-W31
.sow
,sonw
, which are Monday-based.:week
,:lastweek
.5
in2013-W49-5
) use 0-6 Sunday-Monday syntax rather than 1-7 Monday-Sunday as specified in ISO8601Datetime::weekstart
set in libshared does not change any of this, whether set to 0 or 1.For example
2013-W49
is a frequently tested case intest/datetime.t.cpp
:the returned date is a Sunday:
the calculation is using Sunday-based weeks:
Timewarrior was parsing weeks as Sunday-based also:
this despite its
:week
hint starting on Monday:Finally, ISO weeks are numbered 1-7 (Monday - Sunday), whereas this library had used 0-6 (Sunday - Monday). The existing usage is simpler because 0-6 corresponds to the
.tm_wday
field values, however this field is never used as input:mktime()
uses it only as an output field, recomputed from day/month/year, and libshared is typically usingmktime()
to get a number of seconds, to change into a Julian (ordinal offset from Jan 1) and this is how it will come up with a date (if not directly from.tm_year
,.tm_mon
,tm_mday
as it does in some places).New Behavior
ISO8601 defines weeks as starting on Monday, and the first week of the year is the one containing the first Thursday. Furthermore, all years are to contain an exact integer multiple of weeks. This means that, for example Jan 1 2023 is actually ISO week 52 in year 2022:
The new code is uses the ISO calculations (only when weekstart is 1):
The new ISO8601 behavior for
Datetime::weekstart == 1
will be a breaking change for users relying on existing week date parsing. This is unavoidable forweekstart == 1
. However if the user wantsweekstart == 0
, the old behavior is still possible if Taskwarrior/Timewarrior were modified to setDatetime::weekstart
before libshared calls (which they currently do not). Taskwarrior could propagate itsrc.weekstart
value (used only fortask calendar
so far) and propagate this value to libshared, for example.Week numbers will be "off by one" compared to Sunday-based weeks, on the first day of the week, and there are some differences around the year boundary weeks.
Rationale
Monday-based weeks, and in particular ISO8601 are used broadly in business and finance, commonly in most of the world, and increasingly in the USA as well, although Sunday weeks still remain in colloquial use there.
Since all the existing week shortcut names available to users assume Monday-based weeks (like taskw
sonw
and timew:lastweek
), this PR is adding consistency, while adhering to the widely accepted standard ISO8601, and leaving a path open for keeping Sunday-based behavior by library users if they setDatetime::weekstart = 0
.Tests
I have updated tests to run all the
testParse()
with both weekstarts (0 or 1), and made sure the code works with both values.Also added tests for a bunch of dates around year boundaries which differ between ISO and non-ISO interpretations. These are only tested for when weekstart 1, as the weekstart 0 case is already covered by existing tests and should contain no "surprise" results such as happens with ISO year numbers differing from calendar years in cases when first week does not contain Thursday.
See commit messages for additional information.
Fixes #80