-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for TruffleRuby in the gem #35
Conversation
* This is necessary since lib/strscan.rb was added (ruby#25), as the TruffleRuby stdlib strscan.rb is no longer found first by `require "strscan"`. * Write conditions in a way that any other Ruby implementation would simply use its stdlib until it is added explicit support in this gem.
FYI, this is the strscan implementation in TruffleRuby: https://github.com/oracle/truffleruby/blob/master/lib/truffle/strscan.rb |
Why is the special case for TruffleRuby not guarded with The default case for this gem is the "ruby" engine (CRuby), which should build and load the C extension. JRuby is an exception because we load a jar file and then boot the extension class. TruffleRuby is an exception because it is not directly supported by this gem and this gem's code will not be loaded. Only the exceptions should be guarded, falling back on the default (CRuby, engine == "ruby"). |
I should also mention, strscan.rb was explicitly excluded from the gem because its exception only applied to JRuby (platform = "java"). This will mean CRuby starts shipping both strscan.rb and strscan.so whenever this is released. |
It is also worth pointing out that this change makes the strscan gem basically a no-op on TruffleRuby without anyone knowing. Users installing new versions of strscan to get new features or bug fixes (or security fixes, should that be necessary) will not get those changes and will not be informed why that is the case. I understand you do not want to ship your strscan.rb outside of TruffleRuby, but perhaps there's a better alternative than making this gem silently do nothing at all when running on TruffleRuby. |
It's written in the PR description, and explained in more details here: https://bugs.ruby-lang.org/issues/18567#note-30. I don't think any argument could counter the advantage of it works instead of breaking for non-CRuby implementations for new default extension gems. Hence the "else" case must be "any other Ruby implementation". Regarding keeping
I think users care little about that. They care that it works and that it's secure, and in fact TruffleRuby strscan.rb is more secure than a C extension. |
Except that this has been the standard for all gems with C extensions due to the default target being the "ruby" engine using C extensions. This changes the fallback case from the C extension to "just try to load it from stdlib and pray". I cannot speak for the other gem maintainers as to whether this is acceptable.]
I said I understand that you don't want to ship it. That's your prerogative, but it require adding another exception to the current default of using a C extension.
I can speak from experience that when users install updated gems, expecting updated functionality, and then those gems don't actually update anything, they get confused and file issues. Those issues may end up in this repository, and we will have to explain that the strscan gem doesn't really ship support for TruffleRuby, and upgrading the gem will not upgrade strscan functionality in TruffleRuby. Perhaps we can at least add an install-time message pointing out that the strscan gem does not actually ship any code for "any other Ruby implementation" and just passes through to stdlib? |
Another point to discuss: you pend some tests that are not supported by TruffleRuby's strscan, but you are masking them in this repository. Any time a new feature is added or a bug is fixed here, the related tests will have to be masked until TruffleRuby ships its updated support. JRuby currently passes all tests except for one on Windows that I will fix on the JRuby side before release. Any new bug fixes or features will be added in parallel to the JRuby extension. |
I also want to make clear I'm not opposed to this pull request, but it is adding several exceptional changes specifically to allow truffle ruby to keep its implementation out of the gem. That's the decision that you have made, and I sympathize with the challenge of getting these standard gems to work properly. My concerns are primarily that the gem will now silently do nothing on implementations other than JRuby and CRuby, and we may have to mask out tests for future changes just for TruffleRuby. |
You make it sound like this is unreliable but absolutely not, it's fully supported. That is exactly what happens if there is no other version of the gem installed, and the shipped-with-myruby in-stdlib default gem is used, it behaves exactly like a stdlib, and that will very often be the case for
It sounds rather unlikely to me for strscan, but it is indeed a possibility. In general TruffleRuby stays pretty close feature-wise to such gems which delegate to stdlib, and I already have experience with FFI for this.
Seems of little concern given the few pended tests, and it's useful to actually note the differences. |
Instead of not working at all, or by luck already using the stdlib version (if Gemfile.lock version = default gem/stdlib version, or if strscan.rb takes precedence over strscan.so when there was no strscan.rb in this gem), so from that point of view it can't be any worse. |
strscan.gemspec
Outdated
@@ -24,7 +24,7 @@ Gem::Specification.new do |s| | |||
s.files = %w{lib/strscan.jar lib/strscan.rb} | |||
s.platform = "java" | |||
else | |||
s.files = %w{ext/strscan/extconf.rb ext/strscan/strscan.c} | |||
s.files = %w{ext/strscan/extconf.rb ext/strscan/strscan.c lib/strscan.rb} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why we need this.
If we don't install lib/strscan.rb
, strscan.rb
in TruffleRuby will be loaded. Because there are no files that match require "strscan"
in the installed gem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One practical reason is otherwise bundle exec rake test
can't work without this (because the file exists in the repo, regardless of the current gem platform).
If the tests were actually done only on an installed gem (would be good, but maybe as an extra CI step) that wouldn't be an issue, but that's uncommon and arguably impractical in development.
I think it is good to have thisgem/lib/strscan.rb
in $LOADED_FEATURES, as a way to show the gem was loaded (and for TruffleRuby then delegated to stdlib). In fact just looking at $LOADED_FEATURES shows what happened, which I find a very good thing. Example:
$ ruby -e 'require "strscan"; puts $"
...
/home/eregon/.rubies/truffleruby-dev/lib/mri/rubygems/path_support.rb
/home/eregon/.rubies/truffleruby-dev/lib/truffle/strscan.rb
/home/eregon/.rubies/truffleruby-dev/lib/gems/gems/strscan-3.0.3/lib/strscan.rb
Another reason is I feel is it's extremely confusing to only include some lib/*.rb
on some gem platforms and not on others, I think nobody expects that and I'd think it also wouldn't work e.g. with gem 'strscan', github: 'ruby/strscan'
or gem 'strscan', path: '...'
in a Gemfile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a CI step to run tests on the installed gem, that seems a really good addition because it will be as close as we can to an actual usage of the gem.
That works fine except for 2.4 which doesn't have strscan as a default gem, so just skipped on 2.4.
(we can't use bundle exec ruby run-test.rb
for 2.4 as that would add $PWD/lib to $LOAD_PATH and not load the gem, due to I think the gemspec
in Gemfile
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One practical reason is otherwise
bundle exec rake test
can't work without this (because the file exists in the repo, regardless of the current gem platform).
One precision, for bundle exec rake test
we need changes to lib/strscan.rb
(as in this PR).
That itself doesn't force us to add to lib/strscan.rb
so s.files
.
But I find it very dangerous and brittle to have such a large difference between testing the gem locally with bundle exec rake test
and an actual real world usage of the gem. lib/strscan.rb
should exist in both or neither, otherwise we're testing completely different things, and the other points I mentioned above apply as well.
Also this would be required for a Ruby implementation which wants to delegate to the strscan.so
in stdlib (i.e., doesn't implement in a strscan.rb
file but some strscan.someotherext
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tests were actually done only on an installed gem (would be good, but maybe as an extra CI step) that wouldn't be an issue, but that's uncommon and arguably impractical in development.
csv gem does this https://github.com/ruby/csv/blob/master/.github/workflows/test.yml#L74 because it can detect packaging problem. And it's useful to detect packaging problem before we release a gem.
I think it is good to have
thisgem/lib/strscan.rb
in $LOADED_FEATURES, as a way to show the gem was loaded (and for TruffleRuby then delegated to stdlib).
Why is the information needed? It will just confuse users: "I want a new feature introduced by X.Y.Z but can't use the feature even when strscan X.Y.Z is loaded." Because TruffleRuby's bulit-in strscan may not be implemented the new feature yet. If thisgem/lib/strscan.rb
isn't in $LOADED_FEATURES
, users can think that "I can't use the new feature because strscan X.Y.Z isn't used".
Another reason is I feel is it's extremely confusing to only include some
lib/*.rb
on some gem platforms and not on others, I think nobody expects that and I'd think it also wouldn't work e.g. withgem 'strscan', github: 'ruby/strscan'
orgem 'strscan', path: '...'
in a Gemfile.
I don't think so. (BTW, I feel that it's extremely confusing gem "strscan", "= X.Y.Z"
is just ignored on TruffleRuby.)
I think that nobody doesn't care about it.
gem 'strscan', path: '...'
doesn't work only on TruffleRuby. But I don't think it's a problem because TruffleRuby users don't need to use it. TruffleRuby always uses built-in strscan.
How about moving lib/strscan.rb
to lib/jruby/strscan.rb
and using strscan.rb
only with JRuby? a2b7f6f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
csv gem does this https://github.com/ruby/csv/blob/master/.github/workflows/test.yml#L74 because it can detect packaging problem. And it's useful to detect packaging problem before we release a gem.
That looks good, it's very similar to what I did but has a stronger guarantee the lib/
isn't accidentally on $LOAD_PATH
for that step.
Why is the information needed? [...] users can think that "I can't use the new feature because strscan X.Y.Z isn't used".
Because users/developers might want to ensure the strscan gem is loaded/activated (e.g. I and other TruffleRuby devs certainly want to see that, it's very helpful for debugging what gets loaded, and e.g., the puts $".grep(/\/strscan\./)
output would be unclear without this), otherwise they might think it's a bug of Bundler or of their usage (but it's not). The fact strscan-1.x.y/lib/strscan.rb
is loaded, as well as truffleruby/lib/truffle/strscan.rb
and not strscan.so
should be pretty clear that the used code is from the truffleruby stdlib. I think this is by far the clearest result for everyone.
Not having a single file from the gem in $LOADED_FEATURES sounds like the gem was not used/activated at all, even though that's not true.
We can't see the gem in $LOAD_PATH (except by debugging in lib/strscan.rb
), even though it was added to $LOAD_PATH, but we need to remove from $LOAD_PATH to actually load the stdlib. So at least we should preserve the trace that the gem was used/activated in $LOADED_FEATURES.
BTW, I feel that it's extremely confusing gem "strscan", "= X.Y.Z" is just ignored on TruffleRuby.
It's not ignored with the lib/strscan.rb, and we'll see strscan-X.Y.Z is in $", that's part of the point :)
The behavior might not be completely the same, that's anyway true for any version due to a different implementation of StringScanner (and it's also true for the Java extension in this gem, unless you also modify it accordingly for every change).
I think users can quickly figure out if e.g. they use a new method and it works on CRuby but not on TruffleRuby it must TruffleRuby's fault. And even in case they file an issue here I'll handle it.
How about moving
lib/strscan.rb
tolib/jruby/strscan.rb
and usingstrscan.rb
only with JRuby? a2b7f6f
I dislike that, it's more magic, more difference between local development and the packaged gem (this feels hacky to me), and it's brittle. For instance that would break if at some point TruffleRuby (or some other Ruby implementation) decides to implement some of StringScanner in a C extension in stdlib, because then the "wrong" strscan.so could get loaded by require "strscan.so"
from truffleruby's strscan.rb (require_relative
could help, but necessarily easy as .so
are not always siblings of .rb
).
Or if TruffleRuby (or some other Ruby implementation) implements all of StringScanner in some strscan.so
or strscan.someextensionext
then it would also break.
Or if there is at some point some Ruby file like lib/strscan/foo.rb
in this gem, then it could be loaded from the gem instead of truffleruby stdlib, which is bad unless considered carefully whether that's safe.
And there is a performance advantage of having a lib/strscan.rb on CRuby too, it will avoid scanning the $LOAD_PATH for every require "strscan"
after the first one: https://bugs.ruby-lang.org/issues/10902. That seems a significant advantage.
Ultimately if this is the approach you decide to take, then go ahead and I'll adapt the PR. However, if that approach is used I'm pretty sure I (and likely others too) will waste time to debug the magic of why there is no trace of the gem loaded in $", why local dev is so different than the real gem, etc. I think always having lib/strscan.rb
is respecting the simplicity principle of programming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having a single file from the gem in $LOADED_FEATURES sounds like the gem was not used/activated at all, even though that's not true.
We should use Gem.loaded_specs
or something, or see Gemfile.lock
instead of using $LOADED_FEATURES
to confirm whether strscan gem is activated or not. Because they are the master information.
We can't see the gem in $LOAD_PATH (except by debugging in
lib/strscan.rb
), even though it was added to $LOAD_PATH, but we need to remove from $LOAD_PATH to actually load the stdlib.
Generally, we should not modify $LOAD_PATH
in a library to keep simple and easy to debug. (It's more magic. :-)
The behavior might not be completely the same, that's anyway true for any version due to a different implementation of StringScanner (and it's also true for the Java extension in this gem, unless you also modify it accordingly for every change).
We'll keep synchronizing the C implementation and Java implementation for new behavior. I think that it's one of reasons that Java implementation joins to this repository.
I dislike that, it's more magic, more difference between local development and the packaged gem (this feels hacky to me), and it's brittle.
Sorry. It's my fault. I should have write it as 390cebb#diff-188ab539bda7d5de390e7146eec51a5644a193d91e445103f86f4735eaaf0b9eR6 to align the configuration in .gemspec
.
For instance that would break if at some point TruffleRuby (or some other Ruby implementation) decides to implement some of StringScanner in a C extension in stdlib, because then the "wrong" strscan.so could get loaded by
require "strscan.so"
from truffleruby's strscan.rb (require_relative
could help, but necessarily easy as.so
are not always siblings of.rb
).
I can't understand this situation. Who does generate the "wrong" strscan.so
? I think that this gem doesn't generate strscan.so
with TruffleRuby with your extconf.rb
change.
Or if TruffleRuby (or some other Ruby implementation) implements all of StringScanner in some
strscan.so
orstrscan.someextensionext
then it would also break.
ditto.
Or if there is at some point some Ruby file like
lib/strscan/foo.rb
in this gem, then it could be loaded from the gem instead of truffleruby stdlib, which is bad unless considered carefully whether that's safe.
CI will (should) detect it as a failure.
And there is a performance advantage of having a lib/strscan.rb on CRuby too, it will avoid scanning the $LOAD_PATH for every
require "strscan"
after the first one: https://bugs.ruby-lang.org/issues/10902. That seems a significant advantage.
It seems that it's fixed in all maintained CRuby. So it doesn't affect to this case.
Ultimately if this is the approach you decide to take, then go ahead and I'll adapt the PR. However, if that approach is used I'm pretty sure I (and likely others too) will waste time to debug the magic of why there is no trace of the gem loaded in $", why local dev is so different than the real gem, etc. I think always having
lib/strscan.rb
is respecting the simplicity principle of programming.
Using $"
is a wrong approach to confirm loaded gems as I said above. And local dev isn't difference with the real gem. Both of them doesn't have lib/strscan.rb
with lib/jruby/strscan.rb
approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't understand this situation. Who does generate the "wrong"
strscan.so
? I think that this gem doesn't generatestrscan.so
with TruffleRuby with yourextconf.rb
change.
You're right, since this gem won't generate strscan.so on truffleruby (with this PR) that should no longer be an issue.
CI will (should) detect it as a failure.
That's a bit hopeful, because it might appear to work but some part might not, but it's probably unlikely to have a lib/strscan/foo.rb
.
Both of them doesn't have
lib/strscan.rb
withlib/jruby/strscan.rb
approach.
Alright, at least that's consistent between the gem and local dev. I'll cherry-pick your commit here and adapt the PR for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou All done now, could you re-review?
* It is less noisy in the test output, and such skipped tests are not issues of this gem.
Thanks! |
Sorry folks, I should have caught this but the move of lib/strscan.rb to lib/jruby/strscan.rb breaks this gem on JRuby 9.4. It was hidden (appeared to pass) because we accidentally had included strscan.rb in the lib dir, so it still loads ok. I will open an issue and we will need to get strscan.rb back into lib, because that is how we load the extension. |
You know what, I made a mistake. See #38. |
The strscan.rb file is only needed by JRuby, which does not have an extension file format that loads itself. As noted in #35 having this file present on TruffleRuby prevents their built-in strscan.rb from loading, so the file was moved to lib/jruby/strscan.rb with lib/jruby added as a require path. However this results in both lib/strscan.rb and lib/jruby/strscan.rb getting copied into JRuby's stdlib when the default gem is installed, leading to the confusing situation in #38. This PR moves the jruby-specific strscan.rb under the jruby extension, renaming that directory to be explicitly for JRuby, and adding that new ext/jruby/lib path to the require paths so there's no nesting of require paths to copy files multiple times. * ext/java => ext/jruby * lib/jruby/strscan.rb => ext/jruby/lib/strscan.rb * Changes to Rakefile and gemspec for new locations Fixes #38
TruffleRuby stdlib strscan.rb is no longer found first by
require "strscan"
.simply use its stdlib until it is added explicit support in this gem.
Fixes oracle/truffleruby#2420