
notification like "Failed launch debugger for child process xxxx". #712

Open
liheyi360 opened this issue Aug 31, 2021 · 18 comments
Labels
bug Something isn't working
Milestone

Comments

@liheyi360

liheyi360 commented Aug 31, 2021

I encounter the same situation as issue #303 with multi-process debugging: VS Code pops up a notification like "Failed launch debugger for child process xxxx". Sometimes the debugger can't acquire call stack info either.
Screenshot from 2021-08-31 18-34-15
log.zip

import multiprocessing


def funcProcess(idx):
    return idx + 1

class myClass:
    def __init__(self):
        self.data = []

    def func(self):
        with multiprocessing.Pool() as myPool:
            self.data = myPool.map(funcProcess, range(1000))
        print(self.data)

tmp = myClass()
tmp.func()

Originally posted by @liheyi360 in #709 (comment)

@fabioz
Collaborator

fabioz commented Aug 31, 2021

I still have to investigate this, but as a note, creating 1000 processes and subsequently 1000 debugs on VSCode may be stretching the limits (my guess is that the computer is getting too much to do and the timeouts end up reaching their limits).

Do you get this in a real-world use case?

@liheyi360
Author

liheyi360 commented Aug 31, 2021

I use this code to do some heavy work. And I don't think Pool can create 1000 processes; the number of processes is tied to the number of CPU cores. Does the Python multiprocessing module really attempt to create 1000 processes?

import multiprocessing as mp
from os import getpid

def func(idx):
    # record the process id (just creates an empty file)
    obj = open('{}.log'.format(getpid()), 'w')
    obj.close()
    return idx+1

class myClass:
    def __init__(self):
        self.data = []

    def test(self):
        with mp.Pool(mp.cpu_count()) as mypool:
            ans = mypool.map(func, range(100))
        self.data = ans

a = myClass()
a.test()

@fabioz
Collaborator

fabioz commented Aug 31, 2021

You're right, it should cap at your CPU count for simultaneous processes (then it'll start to reuse processes)... If you use fewer cores, so that it doesn't sit at 100% CPU utilization on all cores, does the issue still occur for you (say, with cpu_count()/2)?

@liheyi360
Author

liheyi360 commented Aug 31, 2021

I followed your advice, but the notification still happens. It only happens when the subprocesses end too quickly.
Screenshot from 2021-08-31 21-07-22
log.zip

import multiprocessing as mp
from os import getpid

def initFunc():
    obj = open('{}.log'.format(getpid()), 'w')
    obj.close()

def func(idx):
    return idx+1

class myClass:
    def __init__(self):
        self.data = []

    def test(self):
        with mp.Pool(int(mp.cpu_count() / 2), initFunc) as mypool:
            # shows the notification
            ans = mypool.map(func, range(10000))
            # no notification
            # ans = mypool.map(func, range(100000000))
        self.data = ans

a = myClass()
a.test()

@int19h
Contributor

int19h commented Aug 31, 2021

Does it make any difference if you do mp.set_start_method('spawn')?
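[For reference, a minimal sketch of trying the 'spawn' start method suggested above; the func worker here is an illustrative stand-in, not code from this issue. set_start_method must be called once, under the __main__ guard, before any pool is created.]

```python
import multiprocessing as mp

def func(idx):
    return idx + 1

if __name__ == "__main__":
    # Must be called at most once, before any Pool/Process is created.
    # The default start method is 'fork' on Linux and 'spawn' on Windows/macOS.
    mp.set_start_method("spawn")
    with mp.Pool() as pool:
        data = pool.map(func, range(100))
    print(data[:5])  # prints [1, 2, 3, 4, 5]
```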

@int19h
Contributor

int19h commented Aug 31, 2021

The subprocesses should be paused until a client connects to them. But, for some reason, they do indeed exit early - in fact, most of them exit before the client even gets to request attach!

It's not clear to me why this is happening. The logs don't show anything unusual. Also, so far as I can tell, this repros on Linux (regardless of start method), but not on Windows.

@liheyi360
Author

liheyi360 commented Sep 1, 2021

It was a misunderstanding on my part.
I had just set the breakpoint at a line that the subprocess had already ended before reaching; the debugger can't launch into a process that no longer exists.
The code itself works fine; the notification is a little annoying, but warranted.
Thank you for your help!

@int19h
Contributor

int19h commented Sep 1, 2021

Can you clarify? It still looks like a bug to me in my local repros. When not debugging, everything works as expected. But when debugging, only a few processes spawn successfully; the rest exit before they even get to initFunc() (one can see that execution didn't get there by looking at the <pid>.log files the function creates). This is the weird part: when a subprocess is spawned, it should not do anything until some client connects to it, so it's unclear how or why it's exiting.

(My suspicion is that this has something to do with multiprocessing itself - maybe there's some timeout there? But I couldn't find anything obvious from a quick glance at the source.)

@liheyi360
Author

My last comment was confusing and not verified, so please disregard it.
I'm sorry, but I don't have enough time or knowledge to discuss this in depth.
Thank you for your help.

@fabioz fabioz added the bug Something isn't working label Sep 23, 2021
@fabioz
Collaborator

fabioz commented Sep 23, 2021

I just investigated this... apparently, adding a small delay before the pool is finished makes the error go away,
i.e.: something like:

import time

with multiprocessing.Pool() as myPool:
    self.data = myPool.map(funcProcess, range(1000))
    time.sleep(1)

Alternatively, just making more happen in the pool (say, change range(1000) to range(100000)) also makes the error go away.

So, what seems to be happening is:

  • The pool is created
  • Functions are scheduled to run on the pool
  • A few processes are created and start running the functions scheduled
  • When a process is started, a notification is sent to vscode to connect to it
  • The process/pool finishes and the processes are killed
  • VSCode didn't connect to debugpy fast enough for some of the processes, which have now been killed, and complains about it (either with a Failed to launch debugger for child process: <pid> or with a Server disconnected unexpectedly message, depending on how far it got into the connection process).

As a note, the execution itself seems to work fine (i.e.: self.data is properly assigned all the expected values) even when that message is shown, so the main issue is just that those messages appear because the connection to the client wasn't completed before the process was killed.

Given that this is in the connection management layer (vscode <-> debugpy), would you like to take a look at that @int19h?

@int19h
Contributor

int19h commented Sep 23, 2021

Thing is, child processes aren't supposed to start running anything until there's a debugger connection to them that has gone through the entire initialization stage (otherwise breakpoints might be skipped etc). So either the pool killing processes due to some kind of timeout, before they even had a chance to run anything; or there's some bug in how subprocesses are resumed.

@fabioz
Collaborator

fabioz commented Sep 23, 2021

Thing is, child processes aren't supposed to start running anything until there's a debugger connection to them that has gone through the entire initialization stage (otherwise breakpoints might be skipped etc)

We're on the same page there.

So either the pool killing processes due to some kind of timeout, before they even had a chance to run anything

Exactly, but not due to a timeout: it happens as the pool's regular operation once everything scheduled to run has already finished.

i.e.: the multiprocessing pool decides to start 8 processes; 4 start up and begin receiving the methods to execute while the other 4 are still initializing. All the methods end up finishing in those first 4 processes, and the remaining 4, which were never ready to run, are killed when the multiprocessing pool context manager exits in that example (while the debugger is still in the connection phase for those processes).
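[A note on why the context manager kills those workers: in Python 3, exiting a Pool's with-block calls terminate(), which stops workers immediately rather than letting them wind down. As a hedged sketch, not code from this issue, explicitly calling close() and join() instead waits for every worker to exit cleanly, which may give the debugger time to finish connecting to late-starting workers.]

```python
import multiprocessing as mp

def func(idx):
    return idx + 1

if __name__ == "__main__":
    pool = mp.Pool()
    try:
        data = pool.map(func, range(10000))
    finally:
        pool.close()  # stop accepting work; workers exit once the queue drains
        pool.join()   # block until every worker process has shut down
    print(len(data))  # prints 10000
```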

@judej judej modified the milestones: Dev 17.x, Dev 17.1 Oct 20, 2021
@ShorelLee

I also encountered this problem, but in my case I set a breakpoint in the program and the program returned before reaching the breakpoint, which is when this problem occurs.

@david-waterworth

I just encountered this as well: three quarters of the way through a machine learning task, VS Code started displaying "Failed to launch debugger for child process" message boxes, and the main script terminated with no exceptions printed to the console.

@int19h
Contributor

int19h commented Mar 8, 2022

Note that in the original issue, the computation itself completes successfully with the expected result - the error messages are essentially spurious (they're technically correct, just irrelevant).

If you're actually seeing different results with and without debugger, can you please file a separate issue?

@tierriminator

I am having the same issue and for me it is very invasive since the Server disconnected unexpectedly message needs to be actively cancelled. I am using Pool.map from multiprocessing rather frequently, which makes it effectively impossible for me to debug anything that isn't right at the beginning of my program.

@BraveDrXuTF

Similar problem, mine is "invalid message session is already started"

@BraveDrXuTF

Similar problem, mine is "invalid message session is already started"

How to fix it?

@judej judej assigned debonte and unassigned debonte Oct 14, 2024