-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The console prevents Zeitwerk from autoloading certain constants #408
Comments
Yeah I think it's the same problem: when using stepping commands (e.g. In this case, we can see that Zeitwerk's require 'tmpdir'
Dir.mktmpdir do |dir|
Dir.chdir(dir)
FileUtils.mkdir_p('lib/git/ref/collection')
File.write('lib/git/ref.rb', 'class Git::Ref; end')
File.write('lib/git/ref/collection.rb', 'class Git::Ref::Collection; end')
File.write('lib/git/ref/collection/pagination.rb', 'module Git::Ref::Collection::Pagination; end')
File.write('test.rb', <<~EOS)
require 'zeitwerk'
loader = Zeitwerk::Loader.new
loader.push_dir('lib')
loader.setup
p Git::Ref::Collection::Pagination
EOS
system 'rdbg -e "trace call /on_namespace_loaded/ ;; c" test.rb'
puts "=" * 100
system 'rdbg -e "trace call /on_namespace_loaded/ ;; next 4 ;; p Git::Ref::Collection ;; c" test.rb'
end And here is the output: You can see that without I also noticed that if we don't eval the constant from debugger, the script can still find require 'tmpdir'
Dir.mktmpdir do |dir|
Dir.chdir(dir)
FileUtils.mkdir_p('lib/git/ref/collection')
File.write('lib/git/ref.rb', 'class Git::Ref; end')
File.write('lib/git/ref/collection.rb', 'class Git::Ref::Collection; end')
File.write('lib/git/ref/collection/pagination.rb', 'module Git::Ref::Collection::Pagination; end')
File.write('test.rb', <<~EOS)
require 'zeitwerk'
loader = Zeitwerk::Loader.new
loader.push_dir('lib')
loader.setup
p Git::Ref::Collection::Pagination
EOS
system "rdbg -e 'next 4 ;; c' test.rb"
p "=" * 100
system "rdbg -e 'next 4 ;; p Git::Ref::Collection ;; c' test.rb"
end I don't have an explanation for this behavior though, as the exception is internally rescued and shouldn't affect any callback to load the constants. |
@st0012 that squares with the points (1)–(4) above, right? The top-level require 'tmpdir'
Dir.mktmpdir do |dir|
Dir.chdir(dir)
FileUtils.mkdir_p('lib/foo')
File.write('lib/foo.rb', 'module Foo; end')
File.write('lib/foo/bar.rb', 'Foo::Bar = 1')
File.write('test.rb', <<~EOS)
require 'zeitwerk'
loader = Zeitwerk::Loader.new
loader.push_dir('lib')
loader.setup
p Foo::Bar
EOS
system 'ruby test.rb'
system 'rdbg test.rb'
end |
Yes. And thanks for the simplified script. Here are all the scenarios:
So to conclude: using stepping commands like |
If this seems like a fundamental incompatibility whose root cause is that limitation in Ruby, I would be happy to consider it so, document this gotcha in Zeitwerk docs, and close this issue. What do you think? |
@fxn I think there are 2 problems here: 1) the constant isn't accessible from the console and 2) the constant isn't accessible after 1) happened. For 1), I think it's a Ruby limitation and there's much we can do, at least for current Ruby versions. @ko1 will this be addressed in the future? Regarding 2), I'm not familiar with Ruby's constant loading but I guess it could be caused by constant missing result being cached? If that's the case, I guess we can still try invalidating that cache and let the program load the constant? |
@st0012 to understand 2) you need to understand how Zeritwerk works for that kind of namespaces. Let me explain. Zeitwerk first scans the top-level and sees there's a file
Now, if the user program hits Now, since that callback is not invoked because See? |
Thanks for the explanation.
But that doesn't explain why this scenario would work:
I used both tracers and breakpoints to track the calls, and
It looks like
Do you know how that could happen? |
@st0012 Guess I need to know more about the debugger to understand why is that one puzzling. Why is that different from your example "Execute with debugger but without any stepping"? |
@st0012 Let me detail how that works when things load as expected:
All that happens in the line When we hit
At this point, we have only evaluated
Maybe that was super clear already, in that case please disregard all this :), but since there's something you consider to be odd, I wanted to link autoloading, the tracepoint, and the |
Thanks for the very detailed explanation. I learned a lot from it ❤️ What confuses me is that I can't locate the exact debugger tracepoint that shadows Zeitwek's. I think the current suspect would be the tracepoint that's used by the debug/lib/debug/thread_client.rb Lines 303 to 315 in 8619be2
But from my testing, it should already be disabled before
Script change: next if skip_location?(loc)
+ pp("==================== skip line #{loc} =====================")
next if iter && (iter -= 1) > 0
tp.disable
+ pp("==================== tp disabled =====================") To confirm that this tracepoint isn't the cause of the issue, we can see another example (the one I posted in the prev question):
But if that tracepoint isn't the one that shadows Zeitwerk's, I don't think there's any other tracepoint would do that. What seems to make a difference here is the |
@fxn Ah! Sorry that my previous comment was incorrect. The tracepoint is disabled but the session haven't left its block, which stays at the https://github.com/ruby/debug/blob/master/lib/debug/thread_client.rb#L314 Yeah so I can confirm that it's purely an tracepoint shadowing issue. Sorry for taking your extra time on explaining things repeatedly 🙇 |
Fantastic! Super that it was understood, please count on me for anything ❤️. |
@fxn do you think we should document this gotcha in the debugger readme as well? or this issue will be enough as the future reference for other TracePoint-powered libraries? |
@st0012 it's an edge case, but if users should know about it in case it affects them, I personally believe the README should proactively warn about it. Maybe just a brief discrete section down the bottom. Zeitwerk has a section about debuggers too. |
👋 from the original reporter of this issue! Perhaps I've misread the thread but it seems to me that this is a fairly fundamental conflict between the two gems that might be difficult to resolve without changes to the semantics of tracepoints. Given that this issue significantly reduces the utility of the debug gem with rails dev environments1, perhaps it would be worth reconsidering the use of the proposed @fxn If you are supportive I'd be happy to advocate for this change in ruby and to volunteer to implement the necessary change in Zeitwerk to use the Footnotes
|
@brasic Hey, I just had surgery and I am not in a good condition right now. Let me share quick thoughts, and may continue days later. In principle I am not convinced about pursuing To me, the natural way forward is to remove that limitation in TPs, if that is technically possible. If you define something to be called when an event happens, it should be called when that event happens. |
@brasic re GitHub monolith. Since there's going to be no solution in the short term either way, let me share that the current workaround is to eager load. Of course, far from ideal, but at least there's something. |
Hey Xavier, sorry to hear about the surgery, hope you're doing well and don't feel like you have to respond.
It has been explored but it didn't lead anywhere: https://bugs.ruby-lang.org/issues/15912. I'm really uncertain it's possible.
I don't know about the GitHub monolith, but for Shopify's monolith it means waiting more than 2 minutes. |
I'll add it to the next developer meeting anyway as an alternative to |
Thanks @byroot. Let's see what's the feedback and I'll win some recovery days meanwhile too. If, finally, nested TPs were totally out of the table, I am open to switch to that callback. |
Also if I may: cc @ko1, it would be great to have your opinion on the tracepoint re-entrency capability. |
Another possibility would be to use Fibers like
|
Agreed @eregon. Intuitively, I am suspicious about whether const_added would end up being a good solution once you get into the details. The contract of TP is broken. To me, this is the most important point. If you don't know if your callback is going to work because it depends on other TPs out of your control that may appear at run time, who can use TPs at all? They are not reliable! A debugger should not have a priority on their usage. Indeed, a debugger should ideally transparently work with any program. As I said, even if Zeitwerk is able to not use TP, it's just a side kick, the debugger is still not comparible with any program using TP. |
As said on the ruby tracker I don't understand what makes you think that, as I really don't see how it could possibly be more overhead than I honestly don't have any preference about either solution, but unless the Fiber solution you mentioned is viable, it means we'd need a change to MRI to solve this rather big problem. Since Ruby 3.1 is due in 20 days, I'd like to encourage people to be pragmatic so that we'd hopefully have a working solution in 2022. What worries me with TP re-entrency is that I'm really unsure about the consequences, I might be wrong but I assume it could lead to infinite loops somehow? Maybe I'm just worrying too much? As for the Fiber solution according to the author it's not quite perfect either:
So again, I don't know what the best solution would be, but I'd like to stress that having such a limitation on debuggers inside Rails apps is really a big deal. |
Yes, I meant if it triggers for every constant defined, it's a significant overhead during loading. |
Given the schedule, the Fiber solution in this gem seems by far the most realistic one to me. |
Without having written anything, some things that come to mind off the top of my head:
However, I'd focus the discussion on TP. In my view, TP is broken and of little use due to its uncertainty. I believe the priority should be to address this. Otherwise, what is the point of having TP in the language at all? |
I don't want to enter endless discussion, but that's kinda the point of tracing tools, they are designed to allow observing a running program, not to implement logic, and the one thing they don't want to do is for their callbacks to trigger events. That's why I've always been a bit inconfortable with Zeitwerk's TP usage. Another thing we suggested when we discussed this in Zeitwerk last year was for autoload to accept "sub constants", e.g. I need to look at autoload's implementation to see how hard it would be, but that could be even simpler. |
This could be something interesting to explore. I believe it should unroll autoloads, not eager load. Let me explain. Nowadays, these two examples work: # hotel.rb
class Hotel
include Pricing
end
# hotel/pricing.rb
module Hotel::Pricing
end # hotel.rb
class Hotel
def self.foo; end
end
# hotel/pricing.rb
module Hotel::Pricing
Hotel.foo
end If you load I believe that enhanced autoload should basically embed what we are after: Executing code when |
Hey, just to be clear, that does not mean it would work for Zeitwerk, I don't know that a priori. Zeitwerk is based on a controlled descend in which it is able to monitor and keep track all it does in each level of depth. Even if it was possible to have an equivalent library, I'd have to see how to ship both implementations. The metadata used to operate would be affected, sounds more fundamental than just swapping an adapter. So, worth exploring, and that includes the possibility that the exploration concludes it is not a practical solution. |
Yes that's the idea. Anyway, I'll explore that solution if I have time this week, in the meantime I'll wait for @ko1 & others opinion on either TP re-entrency or the Fiber workaround. |
@ko1 since the thread is long, let me highlight that full TP re-entracy is not strictly needed as far as this issue is concerned. It would suffice to honor TPs listening to the |
So I threw a quick PR to use |
#558 may solve it. |
This avoids the TracePoint conflict with Zeitwerk, which was reported in ruby#408
This avoids the TracePoint conflict with Zeitwerk, which was reported in ruby#408
* Fix assert_locals_result helper * Allow TracePoint reentry during DAP's evaluation This avoids the TracePoint conflict with Zeitwerk, which was reported in ruby#408
This avoids the TracePoint conflict with Zeitwerk, which was reported in ruby#408
This avoids the TracePoint conflict with Zeitwerk, which was reported in ruby#408
This avoids the TracePoint conflict with Zeitwerk, which was reported in #408
(Originally reported in rails/rails#43691).
The
debug
console prevents Zeitwerk from autoloading constants that are children of namespaces defined in a file. This script reproduces the issue:ruby
is able to autoloadGit::Ref::Collection
just fine.continue
in thedebug
console,test.rb
is also executed correctly.next 4
and typeeval Git::Ref::Collection
you'll get aNameError
.irb
debugger command and try to evaluateGit::Ref::Collection
in that session.In case it matters, when Zeitwerk finds a file like
ref.rb
that defines a namespace, it sets a trace point on the:class
event, because if the file had this content:it should be able to autoload
Git::Ref::Collection
on line 3. In order to provide that feature, Zeitwerk listens to:class
events, and as soon as the one forGit::Ref
is triggered, Zeitwerk lists the contents of allgit/ref
directories in the root paths, and sets autoloads for what it finds. In this case, it would set an autoload forCollection
on the module object stored inGit::Ref
.Problem with
byebug
was that Ruby does not support nested trace points. Are we in the same situation perhaps?The text was updated successfully, but these errors were encountered: