YJIT: Allow upgrading YJIT in released Ruby versions #523
Comments
I'd like to add one pro to 1 (and potentially 2): it would allow to ship a secondary |
The most important thing we want to switch with a Ruby build flag is This is an off topic here, but I gave a thought to the stats build problem again. I came up with an idea of duplicating |
I don't want to sound overly critical, maybe gemifying YJIT would be a good solution, but since this would be a big change, I would like to voice some concerns, and also think about potential alternatives to make sure we look at all possible options.
I know this may be impractical, but I do have to wonder if there wouldn't be some way to have a centralized cache of precompiled gems for prerelease rubies. In theory, there could be a machine associated with rubygems.org which periodically rebuilds the most common gems with Ruby head. It might not be necessary to rebuild said gems for every commit. I know, more infrastructure, sounds hard, but it could also make using unreleased rubies in production much easier?
Does that mean not enabling YJIT by default? We have to be careful because every extra step that we require end users to do to enable YJIT means fewer people are going to use YJIT. If we want the broadest possible YJIT adoption, we have to make enabling it as simple as possible, and make YJIT available by default on every system where it's supported.
Backporting
Regarding having YJIT as a default gem: it's not clear to me what the backporting process would look like to build a YJIT gem for older Ruby releases. The difficulty seems to be that, a priori, every released Ruby version would have its own API that it exposes to the YJIT gem, with minor API differences for each version. That would mean we'd need to build multiple YJIT gem versions that work with different Ruby releases. I guess that's a bit simpler than what we currently do in terms of backporting, because it wouldn't be the whole YJIT codebase that changes, but it still means we have to build multiple versions of YJIT. Possibly we would want to use conditional compilation or such? It's non-trivial effort.
There are also YJIT features that require changes to CRuby. Things like object shapes, and Alan's work on a faster calling convention. Things like that can't easily be backported, which could make things very iffy in the YJIT codebase. It would be super awkward to have to use conditional compilation to deal with multiple different possible object layouts in the YJIT codebase, for example.
I think in theory, having an API for a pluggable JIT sounds nice, but in practice it could turn out to be fairly complex and not that practical. That's the big risk. I would want to know how the CRuby build process changes, whether we could still count on having YJIT be enabled by default (and eventually turned on by default), what the API would look like from the YJIT end, and also to have a clearer picture of what backporting would look like.
Alternatives
I think it would also be good to keep thinking about other solutions for making the use of unreleased/prereleased Rubies in production easier. Be that some kind of networked build cache... Or just being willing to run 3.3+ on the CI and in production even though that's not what developers are running locally? I do want to point out that we've been running prereleased Ruby commits on SFR for over a year and it's been very reliable, it works. The developers are presumably running 3.2 on their machines and it hasn't been a problem. In theory we could deploy the same newer Ruby commit on all of SFR and it seems like it should work fine, based on past experience. Core is bigger, but could the same thing not be true there?
Given that nobody else runs Ruby head in production (even GitHub doesn't), they would not build it just for us. We need to build and operate the system ourselves. One practical compromise I thought of now is having a scheduled workflow that polls a set of known slow-build extension gems and pushes binaries for Ruby shas we're currently using to our gem server. Another thing we should consider is binary gem distributions for preview Ruby releases, which seems more practical when we need to persuade the community. It would take time for the community to adopt it on rubygems.org, but we could start doing that in our gem server. I'm open to taking those paths if Approaches 1 and 2 turn out to be impractical, but the gem capability would allow a quicker feedback loop and make us more productive than not having it.
No, this doesn't change. As the name suggests, default gems are built/installed by default. As long as
Approaches 1 and 2 do not really change how we backport patches from With Approach 2, the release process will not change because we don't ship any gem in the first place. With Approach 1, the release managers may bump a gem version if something in yjit.gem has been changed and not released yet. What's different is that we can optionally have a non-stable branch that we don't include in Ruby releases but we use for applications like Core. Besides stable branches like
I did not propose to use a single branch for multiple Ruby versions. YJIT master should only support Ruby master, and each released YJIT version can require a specific Ruby teeny version. You should only backport what makes sense to backport. Memory optimization or code changes that are closed in codegen.rs should be possible. If we're not interested in testing a next version in more production workloads, we can just leave the branch. Right now it's just impossible even if we want to.
which is why I'm suggesting "experimenting with building a yjit.gem to get more ideas" in this issue.
Whether we take Approach 1 or 2, we should be able to do that.
I agree. In particular, as I mentioned early in this comment, we should start thinking about how we could have binary gem distributions for Ruby releases. This would be useful whether YJIT is gemified or not. Once we start using a preview Ruby, we can use it like a new Ruby release. Whenever a preview release is made, we can abandon a non-stable branch and re-fork a preview release, which would make backports easier. When we need to test something that cannot be backported, we can ask the release managers to cut a new preview release. This would be much better than today's situation, so we should definitely do that. At the same time, you'd still need to wait for preview releases when you have something to experiment with in production. Allowing yjit.gem seems like the only practical solution for Shopify to fully control when to unblock a YJIT change in production.
We're on the same page about this. If it's not safe, we shouldn't test newer YJIT changes in production in those environments anyway. There's no hard blocker, but I don't want to make developers unhappy (with slow |
TL;DR for the previous comment: yjit.gem support still seems like the only solution that solves everything we want. We should experiment with the idea to verify its practicality. At the same time, preparing binary gem distributions for preview Ruby releases might be more practical, so it's worth exploring too. |
I initially proposed a build farm for security reasons in early 2021, and I brought it up again just this week. Luis Lavena set up a proof of concept in 2020 for this, but abandoned the work because it's hard to make a general solution for all platforms. I think if we limit our initial goals to "just Shopify arm64-darwin and x86_64-linux systems" it's probably doable, and would address both security and ruby-head use cases. I don't want to get too far into the weeds, but a likely implementation would involve rake-compiler-dock, which I'm co-maintainer of. My experience setting up a build container for the last few releases of Ruby has been challenging, and I'm not at all sure that it would be easy to keep that environment continuously up-to-date with ruby head as changes are being made. But it might be worth exploring.
The thing is, we're already deploying unreleased Ruby commits to SFR. It's a process that's relatively simple (from our end) and works well. Core is a bigger app, but in theory it should be possible to have a similar canary setup there too. That IMO, is the simplest solution from a YJIT development standpoint, in terms of developer ergonomics, because there is no backporting, we don't have to maintain multiple branches or make sure that our newer changes work with older Ruby binaries.
What really makes me hesitate with this is that it adds complexity, more moving parts, to the YJIT development process. It means we have to keep track to some degree of multiple CRuby JIT gem API versions, we have to have multiple YJIT branches for different CRuby releases. There's a non-trivial backporting process. Some changes are backportable and some are not. Once changes are backported, we need a CI process to test our backported YJIT gem with older CRuby binary versions before deployment. Would we also need to publish the YJIT gems that we compile to rubygems? If so, we'd also need to document which YJIT gem versions go with which CRuby versions. I guess the YJIT gems could check that the CRuby version is compatible and give you an error message if not.
For me the big pain point is how difficult/annoying the backporting is going to be, along with having to deal with the coupling of multiple CRuby releases and multiple YJIT gem versions. I also want to point out that I think we both have our own biases. My own bias is that I think the status quo that we have is not that bad and could be improved in other ways that would possibly be more seamless from a YJIT-development standpoint. You and Aaron are working on RJIT/TenderJIT. You love the idea of a JIT gem because it puts your JIT projects on a more even footing, and it makes it easier to build a Ruby JIT in Ruby. You view the idea of a pluggable JIT gem as a desirable feature in and of itself, whereas I tend to think of it as added complexity that we should properly justify before we make such a big change. You seem very motivated to work on this and I think it will most likely happen, but my words of caution are: I think what you are suggesting is definitely feasible, but it may not solve our problems as well as you think it will in practice. Don't just think of the upsides, think of the potential downsides as well. Particularly the human factor and developer experience.
Would you be ok with writing a document in which you try to sketch what the JIT gem API would look like? What are all the things the gem exposes to CRuby? How does CRuby expose functions, constants and struct layouts to the JIT gem? How does one go about running CRuby with an alternative JIT gem version? How do we handle checking for JIT API compatibility? It doesn't need to be perfect, but it would give everyone a more clear picture.
Implement a canary deployment system for Shopify Core that works like our SFR canary/experimental system, which we can point at recent Ruby commits. This solution should work as long as the CI machine(s) have build caches for the gems? Presumably developers don't necessarily have to run the same CRuby sha, because that is not the case with SFR and we've been managing very well. If you couple that with more frequent preview releases that have prebuilt gems, it means we could cut the time until our YJIT changes make it into production down to 4-6 months, without worrying about backporting. Not saying this solution is perfect, but presumably, it is feasible. It could be a viable alternative to having a JIT gem. The benefits are that the process for testing YJIT changes on Core becomes more like SFR. We don't have to worry about JIT gem compatibility issues. We can also deploy/test CRuby changes regularly. We also test Core on Ruby head regularly, which makes sure that Core remains working on Ruby head, even though this is not what the Core developers are using.
Did you read the "Details" of the Motivation section? This issue is trying to solve a problem that is not addressed yet. We could possibly increase the canary % or have another cluster to improve it a bit, but it comes with a different trade-off.
This is true and I take it into account carefully, but:
I'd like to point out that the backport process for what we publicly release will not necessarily change and only branches for experimental deployments/versions will have a non-trivial backporting process, which is optional and can be skipped when you don't need it. The way it's solved today is that we simply can't solve it even if we want to.
I clarified in past comments that this doesn't happen with teeny-level version locking.
I actually didn't realize that it'd be easier to build TenderJIT after this since RJIT doesn't really benefit from it. RJIT is already as easy to build as YJIT, so nothing is improved by this project. While Aaron could be biased by that, I am not.
I do too.
because of the "Motivation" I wrote. I want to improve production performance, not to make it easier to build RJIT or TenderJIT. You seem to care more about how Ruby JITs are built, but my first priority is how well YJIT performs in production. I think the difference in our bias comes from the fact that Jean and I spend more time tuning and monitoring Ruby in production than other folks. I take making
Yeah. We usually have 3 preview releases per version, so some intervals may be even shorter. Plus, we'll also have rc releases with relatively short intervals when nearing the release, and that'd be convenient for making sure the next release is in good shape. This would be the only option for testing the Ruby 3.3 release anyway (since we didn't gemify it in 3.2), so we should explore that this year.
Fair.
I don't think that's necessarily fair. I just view engineering work as trying to find a good compromise/tradeoff between multiple different concerns, that are sometimes pulling in slightly different directions. The YJIT team is a small team, so for me, the amount of complexity that we have to manage vs the time/energy we have is a high priority concern. My main worry is that we make YJIT so complex that we struggle to maintain and develop it. That's not something that happens instantly. It happens over dozens of technical decisions that we make over months and years.
Just to clarify because it can be hard to read tone in text. Me saying that you seem very motivated to work on this is not any kind of attack or criticism. It just means: you seem to really like this idea and you seem excited to just go ahead and do it. I haven't said no. I'm not trying to stop you from doing it. I just want to make sure that we look at all the options and we're all satisfied with the solution(s) we decide to go for, and how we go about it. What I actually think is that we'll likely move forward with the JIT gem solution.
Why not? I don't really understand all the details of how all of this would work. I'm assuming you would end up with one or multiple YJIT gems. Like, say, there could be a yjit-ruby33-v3 gem, or something, and that gem only works with Ruby 3.3. Then you would need a separate gem with its own versioning scheme for Ruby 3.4. You still need to make sure that you pair the correct YJIT gems with the correct Ruby releases. It does introduce (some) potential for user confusion if you allow end users to swap the YJIT gem by changing one line in a gem file, no? Let's try to imagine how this would look from the perspective of end users. What would it look like to deploy a newer YJIT gem for a released version of Ruby 3.3? One thing that I think could potentially be very cool, if it were possible, is if we could seamlessly deploy a custom Ruby 3.3 binary that comes with an upgraded YJIT gem. Maybe even with YJIT enabled by default, but also works with all the precompiled Ruby 3.3 gems. Could we get to a point where someone can use ruby-build or ruby-install to install/deploy a version of CRuby with an upgraded YJIT gem (and maybe YJIT enabled by default)? If that's not viable, how would it work then? Could you specify which YJIT gem to use through an environment variable? Would you have to edit a gem file? |
Very true 👍
Yeah, I appreciate all your input. You're making my effort of filing this issue really meaningful.
I admit that I'm discussing something not straightforward, so let me rephrase it. Basically, I imagine having two different kinds of YJIT releases for past Ruby versions:
Does this make sense? |
I think I was imagining something different, because at some point you (or someone else) mentioned loading an alternative JIT gem by changing a line in a gem file. So, I was imagining that there would be publicly available YJIT gem releases that match up with specific Ruby release versions.
I got the impression that this is really the main goal. For us to deploy the latest YJIT advances to Core/SFR faster. There is still some backporting effort required, but let's assume that the effort is less because the JIT API exposed by CRuby would presumably be fairly stable over time. I do still want to ask though (just humor me), why is it that much better to have a YJIT gem and backport than to just backport the latest I'm also wondering if there is some way that we can somehow make the broader community (e.g. GitHub) benefit as well, by somehow having our own CRuby 3.3.1 + yjit_gem_latest release. In terms of actionable steps, let's assume that we move forward with a YJIT gem. I still think it would be good to sketch the JIT API (what is exposed by YJIT, what is exposed by CRuby) in a document, and also to answer questions about what the backporting and deployment process would look like. If you go through this exercise, I think it will give us all a clearer picture, and assuming we do go forward, you will write better code once you've taken the time to sketch and map out the idea first. |
It is an optional benefit Approach 1 can achieve with relative ease. It can still have
Hmm, I assumed we don't have subtree git history like https://github.com/ruby/ruby/commits/master/yjit, but we do. It might not be as inconvenient as I initially imagined. Being able to use the standard procedure on ruby/yjit just like other ruby/xxx repositories (e.g. ruby/yarp) would be still more intuitive/convenient for some folks, but maybe only slightly. Another thing is, it currently feels uncomfortable to backport experimental stuff to We could maybe have an experimental |
I discussed the idea of deploying a preview release (maybe preview2 when it's released since preview1 is already too old) to Core and SFR globally before BFCM with Mike and Jean today. Given that CI and So for the problem described in the issue description, I'm going to use the combination of Solution 3 (#525), which I've been doing in the last three weeks, and the preview release idea for the foreseeable future. @tenderworks Since I lost the use case for it, I'll leave it for you to file an issue about the design of YJIT gemification if you're still interested. I have ideas, but I bet you'd think of something similar. |
Problem
Core and 99% of SFR use Ruby 3.2 for YJIT deployment. As a result, you need to wait until Ruby 3.3 is released to test the latest YJIT on Core or with more workloads on SFR. A shorter feedback loop would be desirable to improve the quality of YJIT releases.
Motivation
I wish we had this feature in Ruby 3.2 when I looked at SFR production metrics.
On average, Ruby 3.2 generates smaller code on SFR than Ruby 3.3 because Ruby 3.3 has better `ratio_in_yjit`, which is a good thing.
However, if you look at the maximum, Ruby 3.2 generates a lot more code than Ruby 3.3. I believe this is because Ruby 3.3 is exposed to only 1% of the traffic and serves fewer varieties of requests. The more traffic, the better testing.
Because of that, code GC runs regularly on Ruby 3.2 while it never runs on Ruby 3.3.
Since Ruby 3.3 is not tested under SFR's worst case scenario yet, I don't know what will happen to that workload when it's actually released.
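A minimal sketch of how the numbers above (generated code size, code GC runs, `ratio_in_yjit`) could be pulled out of a stats hash for comparison between two Ruby versions. The helper name `summarize_yjit_stats` is hypothetical, the key names are assumptions based on `RubyVM::YJIT.runtime_stats`, and the sample values are made up for illustration; they are not SFR measurements.

```ruby
# Hypothetical helper: condenses a YJIT stats hash into the three numbers
# discussed above. Key names are assumptions; some keys (for example
# :ratio_in_yjit) may only be present when stats collection is enabled.
def summarize_yjit_stats(stats)
  code_bytes = stats.fetch(:inline_code_size, 0) + stats.fetch(:outlined_code_size, 0)
  {
    code_size_mib: (code_bytes / (1024.0 * 1024.0)).round(2),
    code_gc_count: stats.fetch(:code_gc_count, 0),
    ratio_in_yjit: stats[:ratio_in_yjit], # nil when the key is unavailable
  }
end

# Made-up sample values, for illustration only.
sample = {
  inline_code_size: 48 * 1024 * 1024,
  outlined_code_size: 16 * 1024 * 1024,
  code_gc_count: 3,
}
summary = summarize_yjit_stats(sample)

# On a process running with YJIT enabled, the live counters could be used instead.
if defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
  live_summary = summarize_yjit_stats(RubyVM::YJIT.runtime_stats || {})
end
```

Comparing the maximum rather than the average of `code_size_mib` across hosts is what would surface the worst case described above.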
For Core, while we tuned Ruby 3.2 YJIT after the release and adjusted the default `--yjit-exec-mem-size` much later, it would have been a better release if we had been able to experiment with that before the release.
Background
One might think you should just upgrade the main Ruby version in those repositories to Ruby 3.3. It's technically possible, but it would leave the problem of significantly slowing down `bundle install` in local development, which would be bad for the productivity of developers.
Some C extensions, especially the ones that are slow to build, have binary releases for released Rubies. You need to build them from scratch for unreleased Rubies. When a new release of a third-party gem comes out and a developer tries to upgrade the version locally, there's no way the gem publisher has prepared a binary release for a random Ruby sha we're using.
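A minimal sketch of why a precompiled gem tends not to apply to an unreleased Ruby, assuming the binary release declares an upper bound on the Ruby version. The bounds below are illustrative assumptions, not taken from any particular gem, and `precompiled_usable?` is a hypothetical helper.

```ruby
require "rubygems"

# Assumed version bounds of a precompiled (platform-specific) gem release.
# Real gems choose their own bounds; these are for illustration only.
precompiled_requirement = Gem::Requirement.new(">= 3.0", "< 3.3.dev")

# Hypothetical helper: true when the binary release accepts this Ruby version.
def precompiled_usable?(requirement, ruby_version)
  requirement.satisfied_by?(Gem::Version.new(ruby_version))
end

on_release = precompiled_usable?(precompiled_requirement, "3.2.2")     # released Ruby
on_head    = precompiled_usable?(precompiled_requirement, "3.3.0.dev") # unreleased Ruby
```

When the binary release is rejected like this, the source gem has to be compiled locally, which is where the slow `bundle install` comes from.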
Solutions
I'm not married to any specific approach yet, but right now I think it's worth experimenting with building a yjit.gem to get more ideas about the actual blockers for Approaches 1 and 2. Then, if we learn it's practical enough from the experiment, we could take Approach 2 as a small start.
1. Implement YJIT as a default gem
A default gem is a gem that is usable without extra steps and hard-copied to the ruby/ruby repository. Even if we make YJIT a default gem, it should not change how YJIT is built when installing Ruby. You need `rustc` when building Ruby, and YJIT should be available if you pass `--yjit`. Default gems are upgraded when newer Rubies are released. You don't need to treat them as a gem unless you need to override the gem version that is released by core developers.
If we go with this path, `required_ruby_version` of yjit.gem should be very constrained. YJIT for Ruby 3.3 should be allowed only on Ruby 3.3.x. @paracycle even suggested locking it at a teeny level (e.g. 3.3.0), and I think it's a good option too. This means that we'll need to maintain different YJIT branches for every Ruby minor version, but note that we already do that in the `ruby_3_1` and `ruby_3_2` branches of ruby/ruby. This doesn't necessarily introduce extra maintenance targets unless we wish to do so.
pros
`rustc` installation optional (@paracycle's idea)
`rustc` available, users may choose to retrofit YJIT by writing `gem "yjit"` in Gemfile.
cons
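The version locking described for Approach 1 can be sketched with a gemspec. This is only an illustration under stated assumptions: no `yjit` gem is published, the version scheme and metadata are made up, and `loadable_on?` is a hypothetical helper.

```ruby
require "rubygems"

# Hypothetical gemspec for the yjit.gem discussed above.
spec = Gem::Specification.new do |s|
  s.name    = "yjit"
  s.version = "3.3.0.1" # assumed scheme: target Ruby version plus a YJIT counter
  s.summary = "Hypothetical YJIT build for Ruby 3.3.0"
  s.authors = ["example"]
  # Teeny-level lock: only Ruby 3.3.0 satisfies this requirement.
  s.required_ruby_version = "= 3.3.0"
end

# Hypothetical helper: would RubyGems accept this gem on the given Ruby?
def loadable_on?(spec, ruby_version)
  spec.required_ruby_version.satisfied_by?(Gem::Version.new(ruby_version))
end

teeny_same  = loadable_on?(spec, "3.3.0")
teeny_other = loadable_on?(spec, "3.3.1")

# Looser alternative: one gem line per minor version (3.3.x only).
minor_lock  = Gem::Requirement.new("~> 3.3.0")
minor_same  = minor_lock.satisfied_by?(Gem::Version.new("3.3.1"))
minor_other = minor_lock.satisfied_by?(Gem::Version.new("3.4.0"))
```

With the teeny-level lock, each Ruby teeny release would need its own yjit.gem release, which trades more releases for never pairing a YJIT build with an interpreter it was not built against.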
2. Implement YJIT as an extension library
Even if we don't gemify YJIT, we could tweak the interpreter to lazily load the YJIT implementation through `require "yjit.so"`. This is how YARP currently works in ruby/ruby. If the interpreter doesn't `require "yjit.so"` on `--yjit-pause` and lazily calls it on `RubyVM::YJIT.resume`, we could mutate `$LOAD_PATH` to replace the `yjit.so` implementation using an arbitrary gem. Then we could develop and use yjit.gem like Approach 1 while never gemifying YJIT in the interpreter implementation.
The main difference between Approaches 1 and 2 is that `yjit.so` cannot be replaced by just writing `gem "yjit"` in Gemfile. You need to do some extra setup on `$LOAD_PATH` to replace `yjit.so`, which is fine for our team but not convenient for most other people. Anyway, if you want to avoid gemifying YJIT in Ruby releases while still allowing us to replace it with a yjit.gem, this is how.
pros
cons
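The `$LOAD_PATH` replacement in Approach 2 relies on `require` picking the first matching file on the load path. A minimal sketch of that mechanism follows; the feature name `hypothetical_yjit_impl` stands in for `yjit.so`, and plain `.rb` files are used so the sketch runs without a compiled extension. None of this is the interpreter's actual loading code.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in for yjit.so; the name is hypothetical.
feature = "hypothetical_yjit_impl"

# Writes a tiny implementation file that records where it came from.
def write_impl(dir, feature, label)
  File.write(File.join(dir, "#{feature}.rb"), "HYPOTHETICAL_YJIT_SOURCE = #{label.inspect}\n")
end

bundled_dir  = Dir.mktmpdir("bundled")  # plays the copy shipped with Ruby
override_dir = Dir.mktmpdir("override") # plays the copy shipped in a gem
write_impl(bundled_dir, feature, "bundled with the interpreter")
write_impl(override_dir, feature, "replacement from a gem")

$LOAD_PATH.push(bundled_dir)     # lower priority
$LOAD_PATH.unshift(override_dir) # the extra setup: put the replacement first

# The lazy require (deferred until YJIT is resumed in the proposal) now
# resolves to the replacement because it comes first on the load path.
require feature
loaded_from = HYPOTHETICAL_YJIT_SOURCE

# Clean up the temporary directories and load path entries.
[bundled_dir, override_dir].each do |dir|
  $LOAD_PATH.delete(dir)
  FileUtils.remove_entry(dir)
end
```

The replacement only works if the load path is changed before the first `require`, which is why the proposal depends on the interpreter deferring the load.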
3. Use pshopify to backport more YJIT features
We kind of address the problem already with "pshopify" releases in Shopify/ruby-definitions. In fact, our Ruby 3.2 build has an optimization that was backported from Ruby 3.3 and then stayed, even though it was supposed to be a temporary experiment.
This could work as a last resort. We might actually want to try this for Ruby 3.2. But for future Rubies, taking other approaches would make the backport work simpler and easier, which would lead us to test YJIT changes in those environments more often.
pros
cons