descendant processes of my termbox-go program hang #121

Closed
clementino opened this issue Feb 10, 2016 · 8 comments

@clementino

I apologize if this is not a termbox-go problem, but it seems to be related and I'd like to understand how that is possible.

I run Linux, go1.5.3 and the latest termbox-go.
At some point my termbox-go program calls termbox.Close() and spawns an interactive shell with exec.Command/Run. Using such a shell, I noticed that some unrelated programs I use hang indefinitely when run from there, but otherwise work fine.

The hanging programs are all compiled with gcj from gcc-java-4.5.2, and gcj itself hangs too. I know it is quite an old version, but I've never seen them hang when run from a different shell. How can they be influenced by my termbox-go program?

I've done hundreds of tests. For each shell instance run from my termbox-go program, gcj programs either always hang or never do (about half of the time), and this generally applies to descendants too.
E.g. if [my-go-prog -> bash -> gcj-prog] hangs, then [my-go-prog -> bash(same instance) -> xterm -> bash -> xterm -> bash -> gcj-prog] hangs too, and vice versa.
But strangely if I add another instance of "my-go-prog" in the chain it may change things.
E.g. if [my-go-prog -> bash -> gcj-prog] hangs, then [my-go-prog -> bash(same instance) -> my-go-prog -> bash -> gcj-prog] may not hang anymore.

I was able to reproduce it with a minimal termbox-go program, but much less frequently, and it never occurred without termbox-go.

gcj strace ends with:

access("/usr/libexec/gcc/x86_64-slackware-linux/4.5.2/ecj1", X_OK) = 0
vfork()                                 = 3081
wait4(3081,

Another program's strace:

read(3, "9\0\30\0\0\0\0\0\0\0\0\0\244\201\376\315k\0gnu/javax/cryp"..., 4096) = 4096
tgkill(2170, 2171, SIGPWR)              = 0
futex(0x7ffbb2751d20, FUTEX_WAIT_PRIVATE, 0, NULL

They can only be killed with SIGKILL.

I would appreciate any clue, thanks for the great library!

@nsf
Owner

nsf commented Feb 10, 2016

Yes, closing termbox was never implemented quite properly, and never well tested. Bugs like that are possible, but I'm not sure what causes them.

So, I don't know. I need somebody to investigate the issue further, or I need to do it myself. There are similar issues: #82

It's always a pain to dig into those, because you have to dig not only into the details of how termbox operates on a particular OS (Mac OS X can behave completely differently), but also into Go and its quirks. Go isn't the best language to launch child processes from.

@clementino
Author

I was trying to understand the O_ASYNC/SIGIO issue when I found this comment:

Honestly there are issues which prevent multiple Init/Close calls within the same program anyway

That comment was removed in 2012 but, if some known issues still remain (besides my problem here), can you be more specific about them? My program uses multiple Init/Close calls all the time.

Otherwise, instead of looking into this problem, I would be better off looking for a suitable alternative, or moving the termbox-go part of my program into a coprocess and using RPC or something similar for communication, so that I can shut down the coprocess at termbox.Close() without losing state and start another one when I need it. But that would complicate things quite a bit.

If someone with similar issues has an idea, please let me know!
Thanks.

@clementino
Author

I narrowed this down and it is not a termbox-go issue.
I reproduced it with this program:

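package main

import (
        "os"
        "os/exec"
        "os/signal"
        "syscall"
)

func main() {
        // Any Notify call is enough to trigger the problem,
        // even if it is undone by Reset right afterward.
        signal.Notify(make(chan os.Signal, 1), syscall.SIGUSR1)
        signal.Reset()

        // Spawn an interactive shell attached to this terminal.
        cmd := exec.Command("bash")
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.Run()
}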

Sometimes the subprocess is normal, and other times it is set to ignore all of these signals: 1-3, 6, 12-15, 17, 18, 20-26 and 28-64.
This makes some programs hang.
Normally only signals 17, 18, 23 and 28 are ignored.
If you just import _ "os/signal" without calling Notify it doesn't happen, but any Notify seems to make it happen, even if you Reset afterward.
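
One way to check which signals a given process has inherited, on Linux, is to read the signal masks from /proc/<pid>/status. The following is a minimal sketch under that assumption and is not part of the original report; the helper names maskField and decodeSigMask are hypothetical. It prints both the blocked (SigBlk) and ignored (SigIgn) masks, since either one can keep a signal from reaching a program.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// maskField extracts a hexadecimal signal mask (for example "SigBlk" or
// "SigIgn") from text in the format of Linux's /proc/<pid>/status.
func maskField(status, field string) (uint64, error) {
	prefix := field + ":"
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, prefix) {
			value := strings.TrimSpace(line[len(prefix):])
			return strconv.ParseUint(value, 16, 64)
		}
	}
	return 0, fmt.Errorf("field %q not found", field)
}

// decodeSigMask returns the signal numbers set in mask.
// Bit n-1 of the mask corresponds to signal number n.
func decodeSigMask(mask uint64) []int {
	var sigs []int
	for n := 1; n <= 64; n++ {
		if mask&(1<<uint(n-1)) != 0 {
			sigs = append(sigs, n)
		}
	}
	return sigs
}

func main() {
	// Assumption: Linux with procfs mounted; elsewhere this just reports an error.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read status:", err)
		return
	}
	for _, field := range []string{"SigBlk", "SigIgn"} {
		mask, err := maskField(string(data), field)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s: %v\n", field, decodeSigMask(mask))
	}
}
```

For instance, a mask of 0x8430000 decodes to signals 17, 18, 23 and 28, the set listed above as normal. Running the program from an affected shell and from an unaffected one should make the difference in the inherited masks visible.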

I will investigate further.
Thanks.

@nsf
Owner

nsf commented Feb 13, 2016

Well, I mentioned this issue here: #82

And that's what I said there:

Will definitely investigate. But I'm afraid it could be a Go problem and the way Go handles signals.

Because I have been following Go development closely since the beginning, I know a bit of the history behind Go signal handling. While their solution works in the end, it may cause problems like this. That was a gut feeling though, based on the number of times I had problems with signal handling during the pre-1.0 Go era.

Tricky issue, what can you do about it.

@clementino
Author

This issue seems to be fixed in go1.6rc2.
golang/go#13164

The go1.6rc2 build of my test program behaved correctly for thousands of iterations, while with the go1.5.3 build the subprocess would start ignoring signals within one to five iterations.
Also, in many manual tests with my termbox-go program, I couldn't make the problem happen with the go1.6rc2 build, while I can easily with the go1.5.3 build.

Thanks.

@nsf
Owner

nsf commented Feb 14, 2016

I see, nice. Best case for me, I have to do nothing. Let's wait for the Go 1.6 release and see then.

@arthurnn

Was this fixed in 1.6? Should we close it then?

@clementino
Author

Sorry, I haven't logged into GitHub since then.
Yes, from go1.6 onward the problem was solved for me.
And I'm still using termbox-go a lot, thanks!
