Ability to create Bump from pre-existing memory allocation? #237
Comments
This could make sense, but I'm not sure your use case actually requires this functionality?
You should be able to call The other concern is re-entry: JS calls Wasm calls JS calls Wasm again. In this scenario you explicitly don't want to be reusing the same memory because you could trample on an earlier call frame's data if you did.
Sorry I've been so slow to come back to you. Your response on #234 has kicked off a lot of thoughts (and I've had to read up a lot to get to a point of understanding).
To give some context, this is what I'm trying to do: oxc-project/oxc#2409
That's very long, so to boil it down a bit: Oxc parser is a JavaScript parser written in Rust. We're working on NodeJS bindings for it, using napi-rs. JS ASTs are huge, heavily nested structs, and the serialization/deserialization required to transfer them from Rust to JS (currently via serde JSON) is a massive performance hit - roughly 10x the cost of the parsing itself.
Oxc stores the entire AST in a bumpalo It works, and performance is good, but there are a couple of sticking points:
The solution I have in mind is:
NB: Not actually using WASM at this point, but The only missing element for this to work is step (3), which would need a It may well be that in the long term Oxc should look at replacing bumpalo with a custom arena allocator tuned for our particular use case. But for purposes of developing a prototype, it'd be preferable to avoid doing that work upfront. I'm aware writing an allocator is no small task! If the above makes sense, and you'd be willing to receive a PR to add If you see problems with what I'm suggesting, or a better way to tackle the problem, I would also very much appreciate your thoughts.
Orthogonal to this discussion but:
FWIW, unless this is a Node-specific property, there is no guarantee that Wasm memory is 4GiB-aligned. The maximum size of a 32-bit Wasm memory is 4GiB, but that doesn't have anything to do with its alignment.
(Writing a proper, on topic reply now)
Yes, point taken. But it appears to be reliably the case in NodeJS. I assume this is probably motivated by being able to efficiently convert 32-bit pointers in WASM into "real" 64-bit pointers. This property may not always hold in future, and we should probably replace it later on, or provide a less efficient fallback deserializer which doesn't rely on it. But for a prototype it seems like a good starting point.
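To make the alignment point concrete, here is a minimal sketch of the pointer arithmetic being discussed. It is not taken from Oxc or bumpalo; every function name below is hypothetical. It assumes a 64-bit host address space and shows why a 4 GiB-aligned base lets a 64-bit address be reduced to its low 32 bits, alongside a slower-but-general fallback that subtracts the base instead.

```rust
/// Size of a 32-bit Wasm linear memory: 4 GiB.
const FOUR_GIB: u64 = 1 << 32;

/// True if `base` sits on a 4 GiB boundary (its low 32 bits are all zero).
fn is_4gib_aligned(base: u64) -> bool {
    base % FOUR_GIB == 0
}

/// Reduce a 64-bit address inside the block to a 32-bit offset by keeping
/// only the low 32 bits. Only correct when the block's base is 4 GiB-aligned.
fn compress(addr: u64) -> u32 {
    addr as u32 // truncating cast keeps the low 32 bits
}

/// Rebuild the 64-bit address from an aligned base and a 32-bit offset.
/// With an aligned base the low 32 bits of `base` are zero, so OR is enough.
fn expand(base: u64, offset: u32) -> u64 {
    base | u64::from(offset)
}

/// Fallback that does not rely on alignment: subtract the base explicitly.
/// Assumes `addr` lies within 4 GiB above `base`.
fn compress_fallback(base: u64, addr: u64) -> u32 {
    (addr - base) as u32
}

/// Fallback counterpart of `expand`: add the offset to the base.
fn expand_fallback(base: u64, offset: u32) -> u64 {
    base + u64::from(offset)
}
```

The fast path needs no knowledge of the base when compressing, which is the efficiency argument made above. If the base is not 4 GiB-aligned, `compress` silently gives the wrong offset, so a real implementation would want to check `is_4gib_aligned` once up front and switch to the fallback pair otherwise.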
Do you need to allocate the memory that way? You can't let Because while I could imagine a
I guess, even given those constraints, if you were careful never to actually do anything that deallocates the bump's chunk, you could make something that worked with your scheme:
This could all work, but does seem pretty fragile to me. Especially the Long term, I don't want to always use the global allocator for But now that I say that, we could prototype this scheme with the
then I could consider merging it. But I really want to emphasize that this would have to be structured to be as uninvasive as possible and not affect all of Is this something you are interested in exploring?
Thanks for giving this consideration and replying at length. Really appreciate it.
No I don't need to, but... Main problem is that Also: In my scheme, JS is the ultimate "driver" of the whole process, so in my mind it makes sense for JS to own the memory, and rely on JS's garbage collector to free it.
There's also a technical limitation: Node-API (the NodeJS-native interop layer) also has a very annoying problem with objects which are created in native code. If you allocate an object/block of memory in Rust, and pass a reference to it to JS, then once all references to it get garbage collected, JS calls the finalizer in Rust to say "I'm done with it, you can free it now." BUT... actually it doesn't happen like that. The finalizer in fact does not get called until the next turn of JS's event loop, which can be much later if the JS code is all synchronous. Therefore user-facing JS APIs which make large allocations in Rust are a footgun: they make it very easy to exhaust the system's memory by calling them in a synchronous loop.
Secondly, some of the JS-side deserializer may later be converted to WASM for speed, so using a
I was imagining Caller would be responsible for ensuring the pointer is aligned correctly, and that the
On the requirements:
See above. I'd hope
Yes, this is a pain, but necessary.
I was imagining creating a 4 GiB memory block to start with, and setting a maximum size of 4 GiB on the Bump with We can accept a hard OOM failure if we exhaust the 4 GiB allocation. In practice, no sane JavaScript file would produce an AST anywhere near this size. And Oxc also runs in WASM where 4 GiB is the limit anyway.
I totally agree that trying to manually construct a
Honest answer: maybe. At this point, I'm at the proof-of-concept stage with the AST transfer mechanism, so my priority is to knock something together fairly quickly, and just see how it works. If it turns out the mechanism I've outlined is the most performant way, then yes I might well be interested to then "do it properly", and tackle the But for now I think I'd be better off either temporarily forking bumpalo or doing the horrid DIY Or, if you'd be willing to accept a PR for
Ah yes, it would. The allocations limit stuff slipped my mind.
Yeah, I'd rather go the custom allocator route.
OK, fair enough. I'll go ahead with one of the hacky methods for now and see if it works. May return to this and try to implement custom allocators (within the requirements you've given) further down the road. Thanks again for walking through this with me and exploring possibilities.
Related to #100.
Would you consider adding an unsafe method (called e.g. from_raw_parts) to create a Bump from an existing memory allocation?
In comparison to the request in #100, the caller would have to promise that the allocation was made from the global allocator, with the alignment that Bumpalo requires, and that this allocation now becomes owned by the Bump. So no changes would be required to Bumpalo's internals.
One circumstance where this would be useful is Rust-JavaScript interop, where it'd enable re-using the same Bump for each call into Rust, rather than calling into the global allocator to create a fresh one each time.
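As a rough illustration of the idea, the sketch below builds a toy bump arena on top of memory the caller already owns. This is not bumpalo's Bump and does not use bumpalo's internals: ToyArena and all of its methods are hypothetical names invented for this example. Unlike the proposal above, the toy arena never takes ownership of the block and never frees or grows it; it simply fails with None once the block is exhausted. It also bumps upward for readability.

```rust
use std::ptr::NonNull;

/// Toy bump arena over caller-provided memory (hypothetical, not bumpalo).
struct ToyArena {
    start: NonNull<u8>, // first byte of the caller's block
    len: usize,         // size of the block in bytes
    used: usize,        // bytes handed out so far, including padding
}

impl ToyArena {
    /// # Safety
    /// `start..start + len` must be valid for reads and writes and must not
    /// be accessed by anything else while the arena is in use.
    unsafe fn from_raw_parts(start: NonNull<u8>, len: usize) -> Self {
        ToyArena { start, len, used: 0 }
    }

    /// Bump-allocate `size` bytes aligned to `align` (a power of two).
    /// Returns `None` when the block is exhausted; the arena never grows.
    fn alloc(&mut self, size: usize, align: usize) -> Option<NonNull<u8>> {
        debug_assert!(align.is_power_of_two());
        let base = self.start.as_ptr() as usize;
        let addr = base.checked_add(self.used)?;
        // Round up to the next multiple of `align`.
        let aligned = addr.checked_add(align - 1)? & !(align - 1);
        let end = aligned.checked_add(size)?;
        if end > base + self.len {
            return None; // hard failure instead of requesting a new chunk
        }
        self.used = end - base;
        // SAFETY: `aligned` lies inside the block promised by the caller.
        NonNull::new(unsafe { self.start.as_ptr().add(aligned - base) })
    }

    /// Forget all allocations so the same block can serve the next call.
    fn reset(&mut self) {
        self.used = 0;
    }

    /// Bytes consumed so far, including alignment padding.
    fn used(&self) -> usize {
        self.used
    }
}
```

Calling reset between calls into Rust is what gives the "re-use the same memory each time" behaviour described above, at the cost of invalidating every pointer handed out earlier, which is why re-entrant calls must not share one arena.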