-
-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support milliseconds in cacheDuration parsing #586
Conversation
Somewhat cheekily I've pushed another commit that reformats the regexp for readability. I was restricted in what I could do since this gem supports ruby 1.8.7, which doesn't have named capture groups. I think the split lines and comments make it a lot easier to understand and work with, but I'm more than happy to get rid of that commit if it's not preferred. I've ensured the build passes with ruby 2.6.6, but I can't install ruby 1.8.7 on my mac any more. Is there any CI in place for this gem that tests a range of versions? |
00638e2
to
725dab8
Compare
| | ||
(\d+)W # 8: Weeks | ||
) | ||
$)x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this x?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the "ignore whitespace" regexp flag. It's what allows me to split the regexp over multiple lines, and also insert comments. You can see it here documented for 2.6, but it's supported back to 1.8.x:
https://ruby-doc.org/core-2.6.4/Regexp.html#class-Regexp-label-Options
We use travis. but it seems the travis.org was frozen, so we need to move to travis.com |
725dab8
to
cb98e39
Compare
Edit: this can broadly be ignored, see next comment. Thanks for the notes, @johnnyshields! All of your points are good ones, but I'm a little wary of expanding the scope of this PR to attempt full coverage of the ISO8601 duration spec, given the indirect test coverage we've got to work with. Since two independent PRs have been raised for troubles with milliseconds specifically, I'd probably favour merging this to begin with and looking at full coverage as a follow-up. I did have a quick look for libraries that could do this whole task for us, but unfortunately the most commonly-used one (outside of ActiveSupport), ruby-duration, requires ruby 1.9, while ruby-saml still supports 1.8.7. They do point to the last version that supported earlier rubies, though, so we could potentially do another conditional dependency in the gemspec. @pitbulk what do you think? Are you open to pulling in an extra dependency for this task, or would you prefer to continue updating this gem's implementation? |
3ff0b69
to
f65c207
Compare
In fact, after a bit of spec poking I think we actually weren't so far from having full spec coverage, so I've added a couple of commits to do that; we shouldn't need another dependency. I've also rebased to pick up the new github actions build. |
f65c207
to
35fd277
Compare
Can we add more unitest to cover the different scenarios? maybe you can carry them from the gem dependency you found |
412d664
to
ac85010
Compare
This adds some more comprehensive unit testing for the .parse_duration helper, since we'd like to modify it for better coverage of the ISO8601 duration spec. Since the helper both parses the duration and adds it to the current time (by default) before returning it, I've used the unix epoch as a reference point instead and formatted the result as an ISO timestamp before asserting, to hopefully make any errors clearer. I've not gone for full coverage, but have tried to test the main edge cases. In subsequent commits I'll refactor the parsing and introduce support for fractional values.
The ISO 8601 parsing regexp is moderately complex, and uses numbered backreferences to extract data, which is prone to breakage if new capture groups are introduced (either inadvertently or deliberately). This also means every regexp group we don't want to capture must be anonymised using (?: ...). In modern rubies we could use named capture groups to clarify things; however since these are only supported since ruby 1.9 and ruby-saml supports 1.8.7, that's not an option. However, we can still use /x to introduce whitespace and source code comments, which this commit does, splitting each meaningful capture group onto its own annotated line. Doing this, it also became clear that some of the groups weren't necessary, e.g. the two groups surrounding each part of the ( YMDHMS format | week format) disjunction, so I've reduced the nesting somewhat, too.
ac85010
to
5b065f4
Compare
@pitbulk Sure, yep, I've:
The tests pass at each stage. I didn't borrow the examples from the other gem in the end, because our method applies the duration to a timestamp as well as computing it whereas theirs produces a duration object, but I think I've gotten reasonable coverage of the major cases we want to support. See what you think! |
"P1Y1M1DT1H1M1S" => "1971-02-02T01:01:01.000Z", | ||
|
||
# Negative duration | ||
"-P1Y1M1DT1H1M1S" => "1968-11-29T22:58:59.000Z", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are negative values supported for each unit, e.g. P1Y-1M
== 11 months?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain that negative durations are part of the spec, as it mentions them only once, to say:
2.1.6
duration
non-negative quantity attributed to a time interval
From conversations in the moment.js repo (1, 2) it appears that this is an area where implementations vary, some supporting prefix signs, some supporting infix.
For the purposes of this PR, I'm just trying to cover the existing code, which supports a prefix sign alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some info about the xsd:duration type: http://www.datypic.com/sc/xsd/t-xsd_duration.html
It seems infix should not be valid.
The numbers may be any unsigned integer, with the exception of the number of seconds, which may be an unsigned decimal number. So P15.5Y seems invalid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, cool, thanks for the extra info!
As a result of the earlier conversations I've permitted decimal hours and minutes; would you like me to back that change out, since the xsd:duration spec is tighter than the ISO one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please, and sorry for providing late this info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, glad to have it resolved! I've updated the branch and referred to the xsd:duration docs in the commit message.
Since the introduction of cacheDuration parsing in 25cbddd we've been seeing parsing failures for one of our IdPs, whose cacheDuration value is set to cacheDuration="PT6H0M0.000S" This is a valid ISO8601 duration, however the regexp being used for parsing doesn't provide for the possibility of milliseconds. The ISO8601 spec[1] supports decimal values for at least some of a duration's components (S4.4.3.2): > If necessary for a particular application, the lowest order > components may have a decimal fraction. It's not entirely plain which components this might refer to, however the xsd:duration spec[2] is plainer: > The numbers may be any unsigned integer, with the exception of the > number of seconds, which may be an unsigned decimal number. This commit therefore adds fractional seconds support to the duration parsing. To ensure that we actually respect these values, I've updated our parsed value coercion to produce floats for all components. This is arguably unnecessary for the non-float components (since the regexp will only accept integers), but it allows us to reduce the code with no ill effects. The ISO spec permits arbitrary decimal precision, and supports both commas and dots as the decimal separator, so since the xsd:duration schema doesn't mention either concern, I've followed the ISO spec here. [1] https://www.loc.gov/standards/datetime/iso-tc154-wg5_n0038_iso_wd_8601-1_2016-02-16.pdf [2] http://www.datypic.com/sc/xsd/t-xsd_duration.html
We can condense our code a bit by handling the sign at the point where we coerce the duration parts from strings to numbers.
5b065f4
to
f6366d4
Compare
Let's merge? |
Since the introduction of
cacheDuration
parsing in 25cbddd we've been seeing metadata parsing failures for one of our IdPs, whosecacheDuration
value is set to"PT6H0M0.000S"
This seems like a valid ISO8601 duration, however the regexp being used for parsing doesn't allow for the possibility of milliseconds.
This PR adds an optional non-captured group to the regexp to permit them. Since the captured string value is already being converted using
.to_f
, this should work without further changes.To achieve minimal test coverage I've added a seconds string to one of the existing metadata examples with a zero value; this reproduces the failure and confirms the fix without needing to add a specific test case.
Feedback
I'm new to this project so absolutely any feedback would be welcome. In particular, is the approach taken to test coverage sufficient, or would you prefer to add an explicit test case? This seemed simpler since the duration parsing isn't directly tested, but I'm happy to introduce unit tests there if that's preferred.