Add user-definable hook to prevent objects from being GC'ed #234
This makes it possible to hook object destruction in the GC, so library users can run code that determines whether the object is actually ready for cleanup. If not, the garbage collector will not collect the object.
For my understanding:
Where does the cycle come from? Does the Nim object hold references to one or more of those JS objects? If so, would ephemerons or weak references work for your use case?
Or, for a different perspective, would lifecycle management work for your program if the JS objects were stored as keys with the Nim object as their value in a WeakMap?
e.g. this evaluates to true: document.querySelector("html") === document.querySelector("html") because querySelector returns the same JS object for both calls.
The cycle comes from the fact that Nim holds a reference to the JS object created for it, if said JS object exists. But also, that JS object must hold a reference to the Nim object for the case when JS outlives Nim. It should be (conceptually) a pair of regular strong references that still participates in cycle collection.
Putting a weak reference on either side has the problem that whichever side holds the strong reference can go out of scope and be destroyed while still being referenced by the other side.
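The pair of strong references described above can be modelled in plain JavaScript. This is only a minimal sketch: `NativeElement` and `toJS` are hypothetical stand-ins for a Nim object and the Nim-to-JS conversion, not names taken from Chawan or QuickJS.

```javascript
// Hypothetical stand-in for a Nim-side object; not Chawan's real type.
class NativeElement {
  constructor(id) {
    this.id = id;
    this.wrapper = null; // strong reference: native -> JS wrapper
  }
}

// Hypothetical conversion: create the JS wrapper on first use, then hand
// out the same wrapper for every later conversion.
function toJS(native) {
  if (native.wrapper === null) {
    native.wrapper = { native }; // strong reference: JS wrapper -> native
  }
  return native.wrapper;
}

const nativeTest = new NativeElement("test");
let test = toJS(nativeTest);
const sameIdentity = toJS(nativeTest) === test; // same wrapper both times
test.jscanary = "chirp";                        // script-defined property
test = null;                                    // the script drops its reference
const canary = toJS(nativeTest).jscanary;       // still there via the native side
console.log(sameIdentity, canary);
```

In this model both objects live in one garbage-collected heap, so the cycle is harmless. The difficulty discussed in this thread only appears when the two halves of the pair are owned by two separate collectors.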
No, with a WeakMap the JS object could still be collected while the Nim object is alive. Whenever a JS object pair is created for a Nim object, it must stay alive while the Nim object is referenced by Nim code, but also, the Nim object must stay alive while the JS object is referenced by JS code.
To give a more concrete example of what the can_destroy hook enables: A JS object is created whenever a Nim object is referenced from JS code and there is no JS counterpart of said Nim object yet. This does not necessarily have to happen in a constructor. E.g. say we have an HTML document like this:
<div id=test>hello, world</div>
Also assume that there is a querySelector Nim function exposed to JS. Now, if we do this:
/* (1. it is implied that this is in a <script> tag after the previously
* described HTML.) */
let test = document.querySelector("#test"); /* 2. create new JS object */
console.log(document.querySelector("#test") === test); /* 3. true */
test.remove(); /* 4. detach #test from document; Nim no longer holds a reference,
* BUT we can still do this: */
console.log(test); /* [object Element] */
test.jscanary = "chirp"; /* add a member to the JS object */
/* 5. and we can re-attach it to the document */
document.querySelector("body").appendChild(test);
/* 6. and delete it in JS */
test = null;
/* 7. and then, magic happens */
console.log(document.querySelector("#test").jscanary); /* "chirp" */
/* 8. clean up */
document.querySelector("#test").remove();
In this example, the objects are created as follows:
That example clears it up, thanks. Do I summarize correctly that the ultimate goal is to ensure that mutations to the JS object persist across incarnations?
If you take a different tack you can probably make that work without quickjs changes, by defining JSClassExoticMethods for your objects. That turns them into transparent proxies and lets you store properties on the Nim side. That's how the DOM in Chrome or the vm module in Node.js work, basically.
I'm somewhat reluctant to add a special case to the GC but, if exotic methods don't work for you, there's probably a way to make finalizers general enough to cover your use case.
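The "transparent proxy" idea can be approximated in plain JavaScript with a Proxy, as a rough analogy for what exotic methods would do on the C side. In this sketch `NativeElement`, `expando` and `toJS` are hypothetical names, and the proxy is deliberately recreated on every conversion.

```javascript
// Hypothetical stand-in for a Nim-side object that owns script-visible state.
class NativeElement {
  constructor(id) {
    this.id = id;
    this.expando = new Map(); // script-defined properties live on the native side
  }
}

// Hypothetical conversion: every call builds a fresh proxy.
function toJS(native) {
  return new Proxy({}, {
    get(_target, key) {
      return native.expando.get(key);
    },
    set(_target, key, value) {
      native.expando.set(key, value);
      return true;
    },
    has(_target, key) {
      return native.expando.has(key);
    },
  });
}

const nativeTest = new NativeElement("test");
toJS(nativeTest).jscanary = "chirp";
const survived = toJS(nativeTest).jscanary;              // state outlives the wrapper
const identical = toJS(nativeTest) === toJS(nativeTest); // wrappers are distinct objects
console.log(survived, identical);
```

Property mutations persist because they are stored next to the native object, but two conversions of the same native object are not `===` to each other, so scripts can observe the difference.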
Yes. Also, I want them to be identical in JS terms. (I can deal with the JS object pointers being different, but only if the scripts do not know of this.)
I originally did not take this approach because it looked like I would have to re-implement GetProperty etc. myself. One idea of how to implement this I've just had is:
...but after further consideration, I think this is broken.
let obj1 = document.querySelector("#obj1");
let obj2 = document.querySelector("#obj2");
obj1.strongref = obj2;
obj2.strongref = obj1;
obj1.remove();
obj2.remove();
obj1 = null;
obj2 = null;
/* Memory leak? */
Please correct me if I'm wrong, but I believe if the cycle collector now tries to free obj1 & obj2, it won't work because the strong refs obj1_Nim -> obj1_JSinternal -> obj2_JSinterface -> obj2_Nim -> obj2_JSinternal -> obj1_JSinterface -> obj1_Nim still exist.
Or if we just cut the internal thing:
It seems very familiar... I don't think this solves the problem :(
Less importantly, I think this would break too:
const x = new WeakMap()
x.set(document.querySelector("html"), "hello, world");
console.log(x.get(document.querySelector("html"))); /* hello, world */
because the interface object is recreated for each conversion.
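To make the WeakMap concern concrete, here is a small self-contained model. It assumes a hypothetical `toInterface` conversion that allocates a new interface object on every call; the names are illustrative only.

```javascript
// Hypothetical conversion that allocates a new interface object per call.
function toInterface(native) {
  return { native };
}

const htmlNative = { tag: "html" }; // stand-in for the Nim-side element

const x = new WeakMap();
x.set(toInterface(htmlNative), "hello, world");

// The second conversion yields a different key object, so the lookup misses.
const lookup = x.get(toInterface(htmlNative));
console.log(lookup); // undefined
```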
I suppose Chrome and Node.js do not use objects from two separate GCs? (I'm not familiar with Chrome/Node.js internals, so please correct me if I'm wrong. In that case I guess I should just go study those codebases :P)
I'd be happy to get this working without that ugly tmp_hook_obj_list handling inside the otherwise rather neat GC :) Unfortunately this is the only solution I could come up with so far that also works. FWIW, before patching QJS I used to do this:
Aside from the obvious inefficiency of unnecessarily copying the Nim object, this almost worked. I say almost, because Nim finalizers don't really support memory allocation inside them, so I was getting random crashes. The obvious idea is to do something similar with QJS finalizers, but we'd have to figure out a way to restore children. IIRC last time I checked my conclusion was "this is impossible because of how the cycle collector works." But maybe I'm wrong.
I believe the internal/external object approach should be memory leak-free if you store the internal object in a WeakMap, keyed on the external object. I'd consider it a bug if that leaks.
They don't. That's a fair point. (Chrome has oilpan a.k.a. cppgc, an integrated C++ & JS garbage collector. Node.js is a mix of weak references and manual reference counting but may switch to oilpan eventually: nodejs/node#40786)
Would it help if finalizers could either:
Hmm. Currently, this is how I understand it: The Nim object must hold a strong ref to the internal object, or the internal object could be destroyed while the Nim object still exists. So when internal indirectly references external, collecting that cycle would involve the Nim object in QJS cycle collection, and that's broken. But I have no WeakMap from external -> internal here, so I'm probably misunderstanding you.
Only if there is a guarantee that the object's properties are still alive; then I could just copy the properties to a new JS object. Though we would still be left with the problem that JS weak refs to the old object would become invalid. (And obviously doing this is sub-optimal concerning performance.)
That would be the best, yes :) The can_destroy hook in this patch serves the same purpose, so if finalizers could also do this then that would be perfect for my use case.
The WeakMap is used to break the cycle, so I think it should Just Work(TM), including your other example with the circular dependency.
Apropos reviving in finalizers: I'll need to think about what guarantees quickjs can make, what invariants need to be upheld, etc.
Hi,
I'm considering switching my project's[1] vendored QJS fork to QJS-ng. For this, I would need the following patch for adding a callback that allows cooperation with the QJS garbage collector.
Background: my project is written in Nim, and has a facility to automatically convert Nim objects (managed by the Nim garbage collector) to JS objects as follows:
The problem lies in resource deallocation. The naive idea of using finalizers does not work because of cycles:
Instead, my solution uses a hook in the QJS GC:
(So basically, ownership after creating a platform object from JS is JS -> Nim. Ownership after JS would have been freed is Nim -> JS.)
The idea remains the same for cycle collection: for all objects that would be deleted, restore them if the callback returns true. Changes to the current cycle collector are:
- [...] tmp_hook_obj_list.
- [...] gc_decref, put every object with a callback hook into tmp_hook_obj_list instead of tmp_obj_list.
- [...] gc_scan, go through every obj of tmp_hook_obj_list:
  - [...] tmp_obj_list; destruction of this object proceeds as normal
  - [...] gc_obj_list (and re-add the refcount to them)
Performance considerations: the cycle collector modification is O(N), where N is the number of objects with a hook. If the hook is not used, the only change is an additional NULL check every time an object is freed, and a NULL check for each object before it is placed in tmp_obj_list.
Note: I guess this feature should be mentioned in the manual; I'll add it if this proposal gets accepted at all :)
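The general shape of a cycle collector with a veto hook can be sketched as a toy model in JavaScript. This is not the patch itself and not QuickJS's real code: the object layout, `collectCycles`, and the `canDestroy` name and polarity (returning false keeps the object alive) are all assumptions made for illustration.

```javascript
// Toy model of a trial-deletion cycle collector with a veto hook.
// All names are hypothetical; this is not the QuickJS implementation.
function collectCycles(heap) {
  // 1. Trial deletion: remove the contribution of every heap-internal edge.
  for (const obj of heap) {
    for (const child of obj.children) child.refCount--;
  }

  // 2. Anything still referenced from outside the heap is a root, and
  //    everything reachable from a root stays alive.
  const live = new Set();
  const markLive = (obj) => {
    if (live.has(obj)) return;
    live.add(obj);
    obj.children.forEach(markLive);
  };
  heap.filter((obj) => obj.refCount > 0).forEach(markLive);

  // 3. Hook pass: a doomed object whose hook refuses destruction is
  //    restored together with everything it references.
  for (const obj of heap) {
    if (!live.has(obj) && obj.canDestroy && !obj.canDestroy()) markLive(obj);
  }

  // 4. Put the internal edges back, then drop the edges owned by garbage.
  for (const obj of heap) {
    for (const child of obj.children) child.refCount++;
  }
  const freed = heap.filter((obj) => !live.has(obj));
  for (const obj of freed) {
    for (const child of obj.children) child.refCount--;
  }
  return { survivors: heap.filter((obj) => live.has(obj)), freed };
}

// A wrapper/payload cycle with no references from outside the heap.
let nimStillUsesIt = true; // stand-in for "Nim code still references the object"
const wrapper = {
  name: "wrapper",
  refCount: 1,
  children: [],
  canDestroy: () => !nimStillUsesIt,
};
const payload = { name: "payload", refCount: 1, children: [wrapper] };
wrapper.children.push(payload);

const first = collectCycles([wrapper, payload]); // hook vetoes: nothing is freed
nimStillUsesIt = false;
const second = collectCycles(first.survivors);   // hook allows: the cycle is freed
console.log(first.freed.length, second.freed.map((o) => o.name));
```

In the toy model, objects without a hook pay only for the `obj.canDestroy` check, which mirrors the NULL check mentioned under performance considerations.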
Footnotes
[1] https://sr.ht/~bptato/chawan/