Serialize dynamically generated anonymous functions closing over any module (not just Main) #22884

Open
amitmurthy opened this issue Jul 20, 2017 · 6 comments
Labels
docs This change adds or pertains to documentation

Comments

@amitmurthy
Contributor

The root cause of brian-j-smith/Mamba.jl#109 appears to be that we do not serialize anonymous functions closing over modules other than Main. The reduced test case is shown below.

julia> addprocs(2);

julia> @everywhere module Foo
           mutable struct Bar
               x
               Bar() = new(()->1) 
           end

           make_bar() = (b=Bar(); b.x=()->1; b)
       end;


julia> bar = Foo.Bar();

julia> typeof(bar.x)
Foo.##1#2

julia> remotecall_fetch(typeof, 2, bar.x); # OK

julia> bar = Foo.make_bar();

julia> typeof(bar.x)
Foo.##3#4

julia> remotecall_fetch(typeof, 2, bar.x); # OK

julia> bar.x = eval(Foo, :(()->1));

julia> typeof(bar.x)
Foo.##5#6

julia> remotecall_fetch(typeof, 2, bar.x); # Not OK
ERROR: On worker 2:
UndefVarError: ##5#6 not defined
deserialize_datatype at ./serialize.jl:986
handle_deserialize at ./serialize.jl:687

I think the check for this is at julia/base/serialize.jl, lines 507 to 510 in 8cd3c3f:

isanonfunction = mod === Main && # only Main
    t.super === Function && # only Functions
    unsafe_load(unsafe_convert(Ptr{UInt8}, tn.name)) == UInt8('#') && # hidden type
    (!isdefined(mod, name) || t != typeof(getfield(mod, name))) # XXX: 95% accurate test for this being an inner function

What is the rationale for this limitation? Can it be removed?

@JeffBezanson
Sponsor Member

The rationale is that we expect packages to be loaded on all workers, so we can avoid serializing every function every time. Since package code is expected to be loaded everywhere, we can just send the name of the function instead of its full code.

@JeffBezanson
Sponsor Member

I should add that calling eval in a "closed" module is generally not considered a good thing.

@amitmurthy
Contributor Author

> The rationale is that we expect packages to be loaded on all workers, so we can avoid serializing every function every time.

Certainly. This applies to functions defined in the module.

But what about dynamic code? Simple example:

julia> addprocs(2);

julia> @everywhere module Foo
           gen_foo() = eval(:(()->1)) 
           gen_main() = eval(Main, :(()->1)) 
       end;

julia> foo = Foo.gen_foo();

julia> main = Foo.gen_main();

julia> remotecall_fetch(main, 2)
1

julia> remotecall_fetch(foo, 2)
ERROR: On worker 2:
UndefVarError: ##1#2 not defined
deserialize_datatype at ./serialize.jl:986
handle_deserialize at ./serialize.jl:687

Is there a way we can identify dynamically generated code? I think this is what Mamba.jl is doing.

> I should add that calling eval in a "closed" module is generally not considered a good thing.

Are you suggesting packages dynamically generating code should explicitly evaluate the expression under Main?
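
For reference, a minimal sketch of what evaluating under Main would look like, written for Julia 1.x (an assumption; the sessions above are from 0.6, where the two-argument form was `eval(Main, ex)`). The module name `FooGen` is hypothetical. It only shows which module ends up owning the generated closure type, which is what the `mod === Main` check quoted earlier looks at; it does not start any workers.

```julia
# Hypothetical module mirroring the Foo example above, using Julia 1.x syntax.
module FooGen
    # Evaluates the expression inside this module; the closure type belongs to FooGen.
    gen_foo() = Core.eval(@__MODULE__, :(() -> 1))
    # Evaluates the expression in Main; the closure type belongs to Main.
    gen_main() = Core.eval(Main, :(() -> 1))
end

foo = FooGen.gen_foo()
main = FooGen.gen_main()

# parentmodule reports the module in which each generated type was defined.
foo_owner = parentmodule(typeof(foo))
main_owner = parentmodule(typeof(main))
println(foo_owner, " ", main_owner)
```

Whether a given release then sends the Main-owned function in full still depends on the check in its serializer, so this is a way to inspect the situation rather than a guarantee.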

@amitmurthy changed the title from "Serialize anonymous functions closing over any module (not just Main)" to "Serialize dynamically generated anonymous functions closing over any module (not just Main)" on Jul 20, 2017
@JeffBezanson
Sponsor Member

Here's a way to implement this. jl_module_t has a counter field that's used to name "anonymous" functions. We could remember the value when the module closes, and anything with a number higher than that needs to be fully serialized.

However I much more strongly encourage packages not to do this at all. It also breaks precompilation, for example.
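
A rough sketch of the counter idea in plain Julia, for illustration only. It assumes generated type names end in `#<counter>`, as in the `##5#6` names shown above, and that the counter value recorded when the module closed is available as `closed_counter`. Both `anon_counter` and `needs_full_serialization` are hypothetical helpers, not functions in Base, and a real implementation would presumably read the counter from `jl_module_t` rather than parse names.

```julia
# Hypothetical helper: pull the trailing counter out of a generated type name
# such as "##5#6". Returns nothing when the name has no trailing "#<digits>".
function anon_counter(name::AbstractString)
    m = match(r"#(\d+)$", name)
    return m === nothing ? nothing : parse(Int, m.captures[1])
end

# Hypothetical predicate: a type whose counter is higher than the value
# recorded at module close was created later, so it cannot be sent by name.
function needs_full_serialization(name::AbstractString, closed_counter::Integer)
    c = anon_counter(name)
    return c !== nothing && c > closed_counter
end

# Assumed value for the Foo module in the session at the top of this issue.
closed_counter = 4
by_name = needs_full_serialization("##3#4", closed_counter)
in_full = needs_full_serialization("##5#6", closed_counter)
```

Under that assumed counter value, `##3#4` (created while the module body ran) would still be sent by name, while `##5#6` (created later by `eval`) would be serialized in full.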

@amitmurthy
Contributor Author

> However I much more strongly encourage packages not to do this at all.

OK. Let's document this guideline then. Leaving the issue open until then.

> It also breaks precompilation, for example.

How? Does eval'ing in a module trigger a complete recompilation? It would be good to document this too.

@JeffBezanson
Sponsor Member

> How?

Because the contents of the module vary from run to run, so the precompiled version of the module might not contain everything you need.
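
One possible way for a package to follow this advice, sketched under assumptions: when the generated function only differs in the values it captures, an ordinary closure written in the module source can stand in for run-time `eval`. The module `FooClosures` and the function `make_const` are hypothetical names. The closure type is created when the module itself is defined, so the module contents do not change from run to run, and any worker that has loaded the module should already have the type.

```julia
# Hypothetical module: builds functions with closures instead of eval.
module FooClosures
    # The anonymous function's type is created when this definition is
    # lowered, i.e. as part of the module itself, not at call time.
    make_const(v) = () -> v
end

f = FooClosures.make_const(1)
g = FooClosures.make_const(2)

# Calling make_const again reuses the same closure type; only the captured
# value differs, so no new type is added to the module at run time.
same_type = typeof(f) === typeof(g)
owner = parentmodule(typeof(f))
println(f(), " ", g(), " ", same_type)
```

This does not cover code whose structure is only known at run time, which is the Mamba.jl case that prompted the issue.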

@vtjnash added the docs label (This change adds or pertains to documentation) on Aug 31, 2017

3 participants