Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return callback module in supervisor format_status #1000

Merged
merged 1 commit into from
Jun 8, 2016

Conversation

josevalim
Copy link
Contributor

The previous implementation of supervisor:get_callback_module/1
used sys:get_status/1 to get the supervisor inner state and
retrieve the callback module. Such implementation forbids any
other supervisor implementation that has an internal state
different than the #state{} record in supervisor.erl.

This patch allows supervisors to return the callback module
as part of the sys:get_status/1 data, no longer coupling the
callback module implementation with the inner #state{} record.

Notice we have kept the clause matching the previous
sys:get_status/1 reply for backwards compatibility purposes.

@josevalim josevalim force-pushed the jv-get-callback-module branch from 7c0733f to 9ef5502 Compare March 31, 2016 09:07
@OTP-Maintainer
Copy link

Patch has passed first testings and has been assigned to be reviewed


I am a script, I am not human


@sirihansen
Copy link
Contributor

The addition of supervisor:format_status/2 in this patch makes the patch backwards incompatible in the sense that sys:get_status(SomeSupervisor) will no longer return the same as in previous releases. Is this addition really necessary for the purpose of patch?

@josevalim
Copy link
Contributor Author

@sirihansen that makes sense.

I can definitely change this but before I would like to discuss if the format I am currently adopting is our best choice. I can think of at least three:

  1. Returning a tuple in the state (the one I used in this PR):

    [{data, [{"State", {Module, State}}]}]
    
  2. Introducing another entry to data:

    [{data, [{"State", State}, {"Supervisor", Module}]}]
    
  3. Introducing another entry beyond data:

    [{data, [{"State", State}]}, {supervisor, [{"Module", Module}]}]
    

In particular, 2 and 3 could be backwards compatible if the calling code is matching on [{"State", State} | _] instead of simply [{"State", State}] or similar.

Another alternative is to not rely on sys:get_status/1 but use proc_lib:initial_call/1:

{supervisor, Mod, _Args} = proc_lib:initial_call(Pid).

Any preference on how to move this forward? I am ready to work on this since this feature is quite important for us. :) Thank you!

@sirihansen
Copy link
Contributor

We have discussed this for a while, and still no conclusion :( Just wondering how you incorporate the "other supervisor implementation" in the system? Is the alternative with proc_lib:initial_call/1 really possible?

Except for this, our thoughts right now are something like

  • Alternative 2. and 3. are better than 1. , but we are a bit afraid of this change, mainly because the code is so old and we don't really understand why the recommended data structure is as it is. It might, however, be a good idea to allow behaviors to return some properties here.
  • proc_lib:initial_call/1 will gain us some performance, and this solution is backwards compatible. But the returned value for supervisors is a bit strange - the documentation says that the function shall return {Mod,Func,Args}, but for supervisors it returns {Mod,SupervisorMod,Args} - which is not documented anywhere. The value comes from proc_lib:trans_init/3, and this is where I wonder if it will really work in your case??

@sirihansen
Copy link
Contributor

We also thought about the possibility of just doing a call to the supervisor, instead of using sys or proc_lib. But I don't think we can guarantee that there is no suspended supervisor anywhere, so it seems too dangerous to do it this way.

@josevalim
Copy link
Contributor Author

Alternative 2. and 3. are better than 1. , but we are a bit afraid of this change, mainly because the code is so old and we don't really understand why the recommended data structure is as it is. It might, however, be a good idea to allow behaviors to return some properties here.

You are probably aware but, if it helps, both 2 and 3 are done by gen_event: https://github.com/erlang/otp/blob/maint/lib/stdlib/src/gen_event.erl#L772-L775

proc_lib:initial_call/1 will gain us some performance, and this solution is backwards compatible. But the returned value for supervisors is a bit strange

It would work in my case because I am setting :"$initial_call" to the same format the supervisor would. Even though it is not documented, I decided to use the same format because tools may rely on it.

Since some context may be helpful, the supervisor I am implementing is just the simple_one_for_one bits of the regular supervisor, so internally it behaves pretty much the same as the built-in one.

That said, although proc_lib:initial_call/1 is not ideal, I believe it is better than the approach we have today because it at least gives those implementing supervisors very similar to the built-in ones some leeway.

Thank you for the follow up!

@josevalim
Copy link
Contributor Author

I have sent a new PR with the proc_lib approach to facilitate the discussion: #1079.

@sirihansen
Copy link
Contributor

I'm sorry for the slow progress, but finally we have come to a decision - we will use alternative 3, slightly modified:

[{data, [{"State", State}]}, {supervisor, [{"Callback", Module}]}]

And we will keep your new format_status implementation in supervisor - so we will flag it as incompatible in the release note. This means we have to go through all calls to sys:get_status in all applications to see if there are any bad matches.

How soon would you be able to do this, @josevalim ?

@josevalim
Copy link
Contributor Author

I'm sorry for the slow progress, but finally we have come to a decision

No problem! I am on it right now. :)

@josevalim josevalim force-pushed the jv-get-callback-module branch from 9ef5502 to 552d389 Compare June 3, 2016 12:53
@josevalim
Copy link
Contributor Author

josevalim commented Jun 3, 2016

@sirihansen I have pushed the new code. Some decisions I took:

  1. I used lists:keyfind/3 in the new implementation so we are not order dependent
  2. I still match on the previous return value for backwards compatibility
  3. I have tested it also works on my custom supervisor implementation

Regarding the usage of sys:get_status/2, very little code in src invokes it. The only place where we would potentially need to change is in a module called si_sasl_supp but I could not make the si functionality work on master nor in the previous Erlang 18 release.

It is also worth pointing out many places in tests use sys:get_status/2 but all of them traverse Misc instead of matching when necessary. Example: https://github.com/erlang/otp/blob/maint/lib/ssl/test/ssl_test_lib.erl#L1040.

@sirihansen
Copy link
Contributor

Great, thanks! I have now included the branch in our nightly tests. I'll have a closer look at si_sasl_sup etc as soon as possible.

@sirihansen
Copy link
Contributor

Ah - there is at least one place in release_handler_SUITE where this will fail.. It calls sys:get_status via rpc:call, so that might be a reason why you did not see it... Don't know if there are other places where this is done, or maybe apply/3 is used somewhere??

@sirihansen
Copy link
Contributor

And please have a look at observer_procinfo.erl as well.

@josevalim josevalim force-pushed the jv-get-callback-module branch from 552d389 to fac9af0 Compare June 3, 2016 16:33
@josevalim
Copy link
Contributor Author

@sirihansen great catch! I have run a search for "sys, get_status" now and I have analyzed the new results. It seems we don't parse the release_handler_SUITE value in anyway, so the current usage in there is OK?

Other than that, I have fixed observer_procinfo.erl and verified my changes, it was indeed broken before but now it works again as expected). I have also found another bug where a supervisor is reported as a gen_server behaviour because of the weird proc_lib:initial_call behaviour we discussed. I will sent another PR for this though.

@sirihansen
Copy link
Contributor

Great, I'll update the branch with your changes, but there is indeed a problem in release_handler_SUITE. From yesterdays test - failed on all test machines:

=== Ended at 2016-06-03 22:41:59
=== Location: [{release_handler_SUITE,upgrade_supervisor,1373},
              {test_server,ts_tc,1529},
              {test_server,run_test_case_eval1,1045},
              {test_server,run_test_case_eval,977}]
=== Reason: no match of right hand side value 
                 {status,<15680.79.0>,
                     {module,gen_server},
                     [[{'$initial_call',{supervisor,a_sup,1}},
                       {'$ancestors',[<15680.78.0>]}],
                      running,<15680.78.0>,[],
                      [{header,"Status for generic server a_sup"},
                       {data,
                           [{"Status",running},
                            {"Parent",<15680.78.0>},
                            {"Logged events",[]}]},
                       {data,
                           [{"State",
                             {state,
                                 {local,a_sup},
                                 one_for_all,
                                 [{child,<15680.80.0>,a,
                                      {a,start_link,[]},
                                      permanent,brutal_kill,worker,
                                      [a]}],
                                 undefined,4,3600,[],0,a_sup,[]}}]},
                       {supervisor,[{"Callback",a_sup}]}]]}
  in function  release_handler_SUITE:upgrade_supervisor/1 (release_handler_SUITE.erl, line 1373)
  in call from test_server:ts_tc/3 (test_server.erl, line 1529)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1045)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 977)

The previous implementation of supervisor:get_callback_module/1
used sys:get_status/1 to get the supervisor inner state and
retrieve the callback module. Such implementation forbids any
other supervisor implementation that has an internal state
different than the #state{} record in supervisor.erl.

This patch allows supervisors to return the callback module
as part of the sys:get_status/1 data, no longer coupling the
callback module implementation with the inner #state{} record.

Notice we have kept the clause matching the previous
sys:get_status/1 reply for backwards compatibility purposes.
@josevalim josevalim force-pushed the jv-get-callback-module branch from fac9af0 to e257753 Compare June 4, 2016 21:00
@josevalim
Copy link
Contributor Author

josevalim commented Jun 4, 2016

Ouch, the reason I did not find this one is because I searched for "sys, get_status" and this one is using "sys,get_status". 😨 I have fixed it and force pushed once more! Thanks for all the feedback!

@sirihansen
Copy link
Contributor

ok, let's hope it was the last one :) I've included the latest in the tests now.

@OTP-Maintainer
Copy link

Patch has passed first testings and has been assigned to be reviewed


I am a script, I am not human


@sirihansen sirihansen merged commit e257753 into erlang:master Jun 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants