Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current use of Return::MultiLevel silently and unexpectedly breaks program flow #1125

Closed
ribasushi opened this issue Feb 3, 2016 · 34 comments

Comments

@ribasushi
Copy link

While on its surface #485 is a good idea, it comes with a huge footgun as can be seen here

The real-life implication of this is that anyone following the advice in Dancer2::Plugin::LogReport completely breaks their DBIC retry logic and possibly other parts of their model, as the internals are not prepared for longjmp's from the middle of its exception stacks.

While detecting the problem is relatively easy, protecting a large project from this is extremely involved (and expensive performance-wise).

I strongly recommend that the use of Return::MultiLevel is re-thought in light of this.

Highlighting @markov2 and @abeverley in case they decide to change D2::P::LR ahead of D2 itself.

@ribasushi
Copy link
Author

Also highlighting @melo (who had the safer more conservative idea back then #432 (comment)) and @mauke (in case he has any thoughts on how R::ML itself can improve to aid with this)

@xsawyerx
Copy link
Member

xsawyerx commented Feb 3, 2016

@ribasushi You missed a later comment.

@ribasushi
Copy link
Author

@xsawyerx Indeed I have. Thoughts on the rest of the ticket?

@xsawyerx
Copy link
Member

xsawyerx commented Feb 3, 2016

Honestly, I don't think I understand it.
Is it that "Return::MultiLevel causes a return from a situation that would have been handled by DBIx::Class"?

I'm not that fast on my feet today. :/

@ribasushi
Copy link
Author

"Flow control based on recoverable errors" is a thing we all use a lot. Let's take an example from Dancer2 itself. Now imagine that someone happened to do:

$SIG{__DIE__} = sub { ... panic (as in D2::P::LR) }

Now your soft-failure handler will never be reached, because we just sent the user to the homepage. You (correctly I must add) did not expect that the eval{} could never return. Dancer2 broke this implicit contract by using R::ML.

One might argue that the user hooking $SIG{__DIE__} is a moron, and that's half-true. But often there is no alternative. The DBIC-specific hook for this functionality also fired on all errors just like $SIG{__DIE__} did (both recoverable and non-recoverable). I have a fix (listed at the beginning of the ticket) but it's neither elegant, nor simple, nor viable to implement in any and all codebases correctly.

Does this explanation help?

@markov2
Copy link

markov2 commented Feb 4, 2016

I agree that $SIG{DIE} is very dangerous, as all side effects are. It is not needed for D2::P::LR when used with Log::Report::try() because that will decide (accidental) dies into exception objects. You should never need the given example.

For me, the hardest thing in programming is the program flow for handling/recovering error conditions. How can you implement that correctly when the program itself can escape from logic via Return::MultiLevel?

I do not known much about Dancer2 internals. Isn't it possible to let 'forward' come back to the place where it is called? return $app->forward($new_route, $request)

@abeverley
Copy link
Contributor

In terms of the plugin, I will look and see if we can change its behaviour to return a rendered error page instead of executing a redirect.

Nonetheless, the problems highlighted by this issue do seem rather undesirable. We are effectively inserting goto statements into people's code, without them realising it.

To give another (probably related) example, the forking code at the following link is sometimes leaving unreaped zombies. It is so intermittent that I haven't been able to find the cause, but strace tells me that the zombie is sat there waiting for web requests, so I can only assume that a redirect is causing a code jump and never allowing the _exit to execute.

https://github.com/ctrlo/GADS/blob/e4e4e87ec0ddb0b156b57371fd25f977c4d3f3bc/lib/GADS/Record.pm#L929-L930

@xsawyerx
Copy link
Member

xsawyerx commented Feb 4, 2016

@markov2 What Return::MultiLevel provides is the removal of the return statement. This gives the DSL a smooth experience. Also, it prevents misunderstands ("why doesn't send_error work?" "Because you need to call return" <- that doesn't exist anymore).

@kentfredric
Copy link

Also, it prevents misunderstands ("why doesn't send_error work?" "Because you need to call return" <- that doesn't exist anymore).

IME, there are just some problems where coding around the users lack of understanding is not the right choice.

You're constrained by the design of the language and the mechanics of that language, if the user doesn't understand why their programming language doesn't do what they expect it to do, they should change their expectations.

Radically changing the language ( and breaking the assumptions made by all code written in that language ) to cater to bad expectations seems like a bad trade-off.

@hvoers
Copy link

hvoers commented Feb 4, 2016

On Thu, 4 Feb 2016, Sawyer X wrote:

@markov2 What Return::MultiLevel provides is the removal of the return statement. This gives the DSL a smooth experience. Also, it prevents misunderstands ("why doesn't send_error work?" "Because you need to call return" <- that doesn't exist anymore).

[FWIW]
In my code I love the 'return'. It shows where the execution stops.

return redirect '/foo/bar' if some_condition;
return template( ... ) if some_other_condition;
return template( other ...);

@veryrusty
Copy link
Member

Separation of concerns is also just as important (IME). Logging your apps' (model) interactions should be independent of any HTTP request. It looks like D2:P:LR tangles up these concerns, doing a HTTP redirect inside a logging handler. The use of Return::MultiLevel implicitly adds to that spaghetti.

I haven't had a chance to verify if the existing engine.log.after and/or core.error.after hooks can be leveraged to do the redirect/forward after the log handler has returned. Better docs for plugin authors about our use of Return::MultiLevel may help make the footguns more obvious too.

@xsawyerx
Copy link
Member

xsawyerx commented Feb 5, 2016

Other than the fact that Dancer2's DSL is DSL (and is supposed to be more than just syntax, but a mini language onto its own), and ignoring the fact that Dancer2 actually uses its shiny (for different values of "shiny", I suppose) syntax to attract non-Perl developers (relatively successfully, IMHO), removing Return::MultiLevel or any other mechanism for jumping out of the calling subroutine will effectively break a lot of application. This is not likely to happen, merely by that clause, whether we have opinions in other ways or not.

We cannot just break a substantial amount of applications.

@mauke
Copy link
Contributor

mauke commented Feb 5, 2016

Here's some random thoughts of mine:

I'm not familiar with Dancer2 or DBIx::Class. I've read bits of this discussion and some of the documentation.

DBIC's exception_action must throw an exception. If D2::P::LR doesn't do that, then it's breaking the contract. So on the face of it, this seems like a (documentation?) bug in D2::P::LR.

Here are some other things that the exception_action callback could do:

  • Return normally. This also breaks the contract but is trivial to detect and handle.
  • Throw an exception. This is what it's supposed to do.
  • Do a non-local return via goto/next/last/redo or XS shenanigans such as unwind from Scope::Upper. This is easy to detect using a scope guard but there's not much you can do about it.
  • Exit the process with exit, POSIX::_exit, or kill, or load another program into the process with exec. Not much you can do about that either. exit will process END handlers but by that time it's far too late.
  • Simply never return, just call the app's main loop again (this is my favorite). Impossible to handle because the handler code doesn't get to run, yet the stack isn't unwound so scope guards remain dormant. (Hmm. How does DBIC handle being called from within an exception_action, possibly recursively?)

None of these are "radically changing the language"; they're just what the mechanics of Perl allow.

That said, I don't understand what the purpose of exception_action is. It seems to be designed analoguously to $SIG{__DIE__}, getting invoked whenever an exception is thrown, no matter whether it is caught/handled or not. In particular, from this thread it seems like exception_action is used both for internal exceptions (flow control within DBIC, doesn't reach user code) and external exceptions (thrown out from DBIC to user code).

What's the point of using exception_action for internal exceptions? Why should I as a user care about what exception class is used internally by DBIC if I never see those exceptions? I would've expected DBIC to use its own exceptions internally and only call exception_action to propagate errors to the user, at which point you don't care whether it calls die/goto/exit.

(It's probably not that easy. Feel free to tell me exactly why that won't work. :-) I haven't actually looked at any code.)

@ribasushi
Copy link
Author

What's the point of using exception_action for internal exceptions? Why should I as a user care about what exception class is used internally by DBIC if I never see those exceptions? I would've expected DBIC to use its own exceptions internally and only call exception_action to propagate errors to the user

This is a valid expectation and will be the case as of the next major release.

If D2::P::LR doesn't do that, then it's breaking the contract. So on the face of it, this seems like a (documentation?) bug in D2::P::LR.
...
None of these are "radically changing the language"; they're just what the mechanics of Perl allow.

The point of this ticket is that there are many many more ways for unexpected behavior to creep in, just as demonstrated here.

@abeverley
Copy link
Contributor

DBIC's exception_action must throw an exception. If D2::P::LR doesn't do that, then it's breaking the contract.

D2::P::LR is throwing an exception. One of the several things that are done when that exception is "actioned" (including logging to syslog etc) is to redirect the user to a "safe" page. The problem is that the redirect is then effectively goto-ing out of that exception block, so it never gets caught by DBIC.

I'm not saying that the plugin's behaviour is correct, nor that it can't be fixed. I'm sure it can, and I'll be looking at solutions to that. The point of this issue is that it has happened, it has caused some rather strange bugs, and has required a lot of debugging.

@abeverley
Copy link
Contributor

removing Return::MultiLevel or any other mechanism for jumping out of the calling subroutine will effectively break a lot of application. This is not likely to happen, merely by that clause, whether we have opinions in other ways or not.

We cannot just break a substantial amount of applications.

Indeed, the decision was made a couple of years ago. Whether it is a good idea or not is now largely academic.

Nonetheless, I do think some sort of warning system is needed to catch this happening again, as it surely will. Some sort of guard that will carp when it detects the situation. I would offer to write such a patch, but I do not currently have the required knowledge, but I think the issue should remain open until someone is able to.

Any other similar situations should also be caught and warned about, as it does worry me that I or others may have made similar assumptions elsewhere.

@kentfredric
Copy link

Dancer2's DSL is DSL (and is supposed to be more than just syntax, but a mini language onto its own), and ignoring the fact that Dancer2 actually uses its shiny (for different values of "shiny", I suppose) syntax to attract non-Perl developers (relatively successfully, IMHO),

This custom flow control system, "Special DSL for Dancer" or not, in practice means that Dancer cannot be expected to safely mix-and-match with arbitrary CPAN Libraries without first having to independently vet them for this specific issue first.

Given Dancer has a stated goal of bringing people to Perl, I see a significant conflict of interest by having a design that prevents non-Perl developers from using CPAN and Dancer together with expectations of safety.

I can appreciate the feature as-is being removed will cause headaches, but I'd rather we discouraged what is effectively a "goto $label hidden in a pretty API" from being further proliferated.

@xsawyerx
Copy link
Member

xsawyerx commented Feb 5, 2016

Considering changing this behavior will break user applications in very subtle and horrible ways, it is clear that it cannot simply be changed and possibly cannot be changed at all.

This means that this is not the forum to discuss whether this is effective outreach or whether we should generally discourage this practice. That does not change anything at hand and I don't want this ticket to become a place to debate this behavior, only how to solve the problem we have at hand.

I think @mauke gives interesting ideas that we should reflect on, understand, and possibly apply. That's what I would like to focus on.

@mauke
Copy link
Contributor

mauke commented Feb 5, 2016

Incidentally, "goto $label hidden in a pretty API" is exactly what I think exceptions are.

@shadowcat-mst
Copy link
Contributor

@mauke right, and Dancer 1 used exceptions for this rather than RML, and that was just a different sort of footgun.

The trick will be finding a way to warn the user where they're pointing it. I'll have a think about it.

@shadowcat-mst
Copy link
Contributor

(in 20/20 hindsight I think we should've called the immediate one redirect_immediate or redirect_now, but it totally didn't occur to me at the time so shrug)

@veryrusty
Copy link
Member

There is no reason we can't reverse @shadowcat-mst's suggestion and add a redirect_after keyword (or some other appropriate keyword. naming things is hard).

@veryrusty
Copy link
Member

@abeverley is there a public source repository for D2::P::LR ?

@markov2
Copy link

markov2 commented Feb 6, 2016

There are public sources, why would you like to have a public

repository?

greetz,
MarkOv


drs Mark A.C.J. Overmeer MARKOV Solutions
Mark@Overmeer.net solutions@overmeer.net
http://Mark.Overmeer.net http://solutions.overmeer.net

@veryrusty
Copy link
Member

@markov2 history. You can read a changelog and you can download several versions and diff them; but its 'metadata' like commit messages and references to issues (that are otherwise not in comments) that help to understand why somethings were done the way they were. I don't have any specific examples for Log::Report, but I do find git clone easier than "curl some.tarball + commit into a local repo" so as to leverage other tools.

@abeverley
Copy link
Contributor

@abeverley is there a public source repository for D2::P::LR ?

There's not I'm afraid Rusty. Do you have any specific questions I can help with?

@racke
Copy link
Member

racke commented Feb 6, 2016

Why not, @abeverley ? It makes a lot of sense for open source software.

@markov2
Copy link

markov2 commented Feb 6, 2016

In this case, that's not a choice made by Andy, but by me: at the moment,
the extension is part of the larger Log-Report distribution. During the
development, that was practical for us.

There is a major misconception about Open Source in relation to GitHub.
"Open Source" means that you have access to the sources you run, and that
you can contribute to it. Very welcome.

"GitHub" means that your development process is visible. When I develop,
I tend to break things for some time --on purpose. Between releases, I
take my freedom to change the interface (for instance method and parameter
names) I do not want anyone to look at it or try it until it is ready.

Release often with ChangeLog has proven to be a very good development

model for me. Totally Open Source.

Regards,
MarkOv


drs Mark A.C.J. Overmeer MARKOV Solutions
Mark@Overmeer.net solutions@overmeer.net
http://Mark.Overmeer.net http://solutions.overmeer.net

@xsawyerx
Copy link
Member

xsawyerx commented Feb 6, 2016

What "makes sense" for open source doesn't always fit what needs to be done in practice. Open Source can be developed openly or privately, and that's up to the person writing it or in charge of it.

@markov2
Copy link

markov2 commented Feb 6, 2016

  • Russell Jenkins (notifications@github.com) [160206 13:17]:

    @markov2 history. You can read a changelog and you can download several
    versions and diff them; but its 'metadata' like commit messages and
    references to issues (that are otherwise not in comments) that help to
    understand why somethings were done the way they were.

I use git(hub) for some projects, and do not use it for other projects.
That depends on the kind of contributions/cooperation in the code.

For complex libraries, I do not get any correct fix so why should I
give the impression that I accept pull-requests? It's much preferred
that people contact me by email about some problem they experience,
than getting a piece of code.

I don't have any specific examples for Log::Report, but I do find git clone easier than "curl some.tarball + commit into a local repo" so
as to leverage other tools.

For me, it is either git or backups to do my version control.

Regards,

           MarkOv

   Mark Overmeer MSc                                MARKOV Solutions
   Mark@Overmeer.net                          solutions@overmeer.net

http://Mark.Overmeer.net http://solutions.overmeer.net

@xsawyerx
Copy link
Member

Return::MultiValue is currently a core piece of the DSL. We cannot implement what is currently available with it if we remove the module. I hope (and expect) the people on this thread to not advocate for clear core DSL compatibility breakage. Because of this, I don't think just removing Return::MultiLevel will actually happen.

If anyone has an implementation which will work, or a way to implement the same thing, or a way to minimize the possible side-effects that caused this ticket to be raised, I would be more than happy. As it stands, just "remove this module" will not happen.

I vote for closing this issue.

@ribasushi
Copy link
Author

Given my non-involvement in Dancer2 I do not have anything further to add besides the initial investigation/diagnosis/recommendation.

I vote "whatever" ;)

@abeverley
Copy link
Contributor

or a way to minimize the possible side-effects that caused this ticket to be raised,

I would say that as a minimum, the documentation needs some warnings. Ideally, the code should also warn appropriately.

I vote for closing this issue.

I think the issue should remain open until the above has happened. I am happy to draft appropriate documentation updates when I get a moment.

dbsrgits-sync pushed a commit to Perl5/DBIx-Class that referenced this issue Mar 23, 2016
Pointed out by Lukas Mai in the last (fifth) bullet point of
PerlDancer/Dancer2#1125 (comment)

In addition add extra testing making sure that we will not inadvertently
silence $SIG{__DIE__} when the error is *not* transient

Hopefully this is the last piece of the "clean transient exceptions" puzzle,
it's already been way too much faffing: 7cb3585, ddcc02d and 5c33c8b :(
@xsawyerx
Copy link
Member

@abeverley and I sat to discuss this and decided that the plugin in question could use a different mechanism or document its current behavior.

There were two options and both were considered "consistent" for different considerations. It is up to the author to decide which consistency is desired and to document it appropriately. One was that anything in the web environment redirects (which cannot be controlled), even from within another plugin, or that only the Dancer2 plugin redirects. Both of these options are possible for the author to choose.

I don't agree that having a module return from an upper frame without you being able to trap it is necessarily incorrect[1], and that here there was, IMHO, a design decision on the plugin's side - to which a solution was found - I am closing this issue, to the content of both myself and the user (@abeverley).

[1] Sorry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants