cannot wait for child xxx as it does not exist #218
Comments
FYI, this is a public GitHub forum; I've removed your attached R code for you. |
Thank you. I hope you can reproduce the issue. When I have removed the %<-% it worked without warnings. |
Hi. Your code is very long. You need to come up with a much smaller (= minimal) reproducible example and explain what troubleshooting you've attempted to narrow this issue down. You're the first one reporting this type of issue. |
Hi. I was afraid to change the code and lose the issue. I have tried with simpler code, but I was unable to reproduce it. As I remove some parts with %<-% from the original code, the number of warnings decreases. Warning messages: Please don't worry about my case, because it seems to be running correctly despite the warnings. I just reported it because the only change was the new R version, from yesterday. Thank you for your support. |
I see. Thanks for clarifying that those warnings seem harmless. I'll keep the issue open for a while in case other macOS users start to see these as well. |
I get the same warnings on macOS 10.13.4 and R 3.5.0 using the following test code: library(future)
plan(multiprocess)
testList <- vector(mode = "list", length = 10)
for (i in c(1:length(testList))) {
testList[[i]] <- future({i * 4})
}
testList <- resolve(testList)
testList <- values(testList) Once the loop completes there are 50 warnings. |
I don't have access to macOS, so I need your help to troubleshoot. From the warning details by @brainprint, the warning appears to come from the parallel package, so I believe this is independent of the future package. If you run the following in a fresh R session: jobs <- lapply(1:10, FUN = parallel::mcparallel)
values <- parallel::mccollect(jobs)
unlist(values) I'd expect that you'd also get those warnings - is that the case? |
Quick comment: the warnings on |
Interestingly the little test you posted above produces no errors or warnings when run (see attached output from R CMD BATCH). EDIT: Including output here /HB: R version 3.5.0 (2018-04-23) -- "Joy in Playing"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.5.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> jobs <- lapply(1:10, FUN = parallel::mcparallel)
> values <- parallel::mccollect(jobs)
> unlist(values)
3805 3806 3807 3808 3809 3810 3811 3812 3813 3814
1 2 3 4 5 6 7 8 9 10
> warnings()
>
> proc.time()
user system elapsed
0.322 0.073 0.340 |
Thanks. It could be that it needs to be hit harder with more tasks, or it could be something I do in the future package. I'll add it to the list of things to investigate. |
Same here, no warnings. 12474 12475 12476 12477 12478 12479 12480 12481 12482 12483 |
And from @rps13 example, the summary is:
|
Thanks. Ah... not just macOS; by coincidence I just stumbled upon this on a Linux cluster. Here's a minimal example that I can work with: > library("future")
> plan(multicore, workers = 2L)
> fs <- lapply(1:2L, FUN = future)
> values(fs)
[[1]]
[1] 1
[[2]]
[1] 2
Warning messages:
1: In selectChildren(job, timeout = timeout) : #<== produced by future::resolved.MulticoreFuture()
cannot wait for child 362508 as it does not exist
2: In selectChildren(job, timeout = timeout) : #<== produced by future::resolved.MulticoreFuture()
cannot wait for child 362506 as it does not exist
3: In selectChildren(pids[!fin], -1) : #<== produced by parallel::mccollect()
cannot wait for child 362508 as it does not exist
4: In selectChildren(pids[!fin], -1) : #<== produced by parallel::mccollect()
cannot wait for child 362508 as it does not exist This shouldn't happen, so I'll flag this as a bug (which has probably been there before but only reveals itself in R (>= 3.5.0)). |
@HenrikBengtsson using your example
And I agree with your assessment ("been there before"). Question: is the cause R 3.5.0 or the future package? |
I'm leaning toward 'future' now. The simplest explanation would be that the future framework polls the workers one time too many, even after the results have already been collected and the forked child process is gone. Just a guess for now. I'll try to find time to investigate and fix (or report upstream to R core if that's where the error is). As you've observed, these warnings are harmless; inspecting the R core code confirms that. |
Thanks for looking into this. Judging by the comments in the source of |
…w_bug.cgi?id=17413 Discovered while troubleshooting Issue #218
I'm suppressing these warnings for now since they are quite annoying, while acknowledging that the long-term solution is to fully understand what's going on so it can be fixed. I'm going to do a quick future 1.8.1 release, so the long-term fix will come in a later release. |
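For readers who want a narrower workaround in their own code, here is a minimal sketch of muffling only this specific warning while letting all others through. The helper name `muffle_child_warnings` and the regular expression are assumptions made for illustration; this is not how the future package implements its suppression, and the warning is simulated with `warning()` rather than produced by a forked child.

```r
# Hypothetical helper (not part of future or parallel): muffle only the
# "cannot wait for child <pid> as it does not exist" warnings.
muffle_child_warnings <- function(expr) {
  # Assumed pattern, based on the warning text quoted in this thread
  pattern <- "cannot wait for child [0-9]+ as it does not exist"
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w))) {
        # Silence this warning and resume evaluation of 'expr'
        invokeRestart("muffleWarning")
      }
      # Any other warning falls through to the usual handlers
    }
  )
}

# A simulated warning stands in for the one raised inside selectChildren()
result <- muffle_child_warnings({
  warning("cannot wait for child 362508 as it does not exist")
  42
})
```

Unlike a blanket `suppressWarnings()`, this keeps unrelated warnings visible, at the cost of depending on the exact wording of the message.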
I know it is a closed issue, but since I was investigating it and have some findings I would like to share them. Please check the following snippet: # Define job factory
jobFactory <- function() {
parallel::mcparallel({
Sys.getpid()
})
}
# Example 1: trigger warning
job1 <- jobFactory()
parallel::mccollect(job1, wait = FALSE)
# No warnings
job2 <- jobFactory()
parallel::mccollect(job2, wait = FALSE)
# Warning message:
# In selectChildren(jobs, timeout) :
# cannot wait for child [pid of job1] as it does not exist
# Restart R session
rstudioapi::restartSession()
# Example 2: no warning, manual kill of processes
job1 <- jobFactory()
parallel::mccollect(job1, wait = FALSE)
parallel:::rmChild(job1)
# No warnings
job2 <- jobFactory()
parallel::mccollect(job2, wait = FALSE)
parallel:::rmChild(job2)
# No warnings
# Restart R session
rstudioapi::restartSession()
# Example 3: no warnings, call mccollect twice
job1 <- jobFactory()
job2 <- jobFactory()
parallel::mccollect(wait = FALSE)
# $`23428`
# [1] 23428
#
# $`23427`
# [1] 23427
parallel::mccollect(wait = FALSE)
# $`23428`
# NULL
#
# $`23427`
# NULL I think the warning in title is triggered by |
Thanks for this. I'm on a phone now so haven't tried to reproduce but these are useful findings. So, it looks independent of the future package and specific to R and the parallel package. We should report upstream to get this fixed. Importantly, can you reproduce this outside of RStudio in a fresh R terminal session? If so, would you mind reporting this to the R-devel mailing list? Then the R core developers will see it. |
FYI, I can reproduce this in a pure R session on Linux; job1 <- parallel::mcparallel(Sys.getpid())
parallel::mccollect(job1, wait = FALSE)
### $`16223`
### [1] 16223
job2 <- parallel::mcparallel(Sys.getpid())
parallel::mccollect(job2, wait = FALSE)
### Warning in selectChildren(jobs, timeout) :
### cannot wait for child 16223 as it does not exist
### $`16247`
### [1] 16247 And now, in front of a real screen (was on my phone before), I see that the purpose of your comment might have been to suggest that we should fix this in the future package by making sure to also call job1 <- parallel::mcparallel(Sys.getpid())
parallel::mccollect(job2, wait = FALSE)
### $`16441`
### [1] 16441
parallel:::rmChild(job1)
### [1] FALSE
job2 <- parallel::mcparallel(Sys.getpid())
parallel::mccollect(job2, wait = FALSE)
### $`16444`
### [1] 16444
parallel:::rmChild(job2)
### [1] TRUE I'll try to add this ... |
The following - "Fix uninitialized variable in a cleanup mark (parallel/fork)" - was just committed to R-devel /src/library/parallel/src/fork.c: index 3fe779474d..d2c6788b0f 100644
--- a/src/library/parallel/src/fork.c
+++ b/src/library/parallel/src/fork.c
[...]
@@ -288,6 +288,8 @@ SEXP mc_prepare_cleanup()
ci->waitedfor = 1;
ci->detached = 1;
ci->pid = -1; /* a cleanup mark */
+ ci->pfd = -1;
+ ci->sifd = -1; /* set fds to -1 to simplify close */
ci->ppid = getpid();
ci->next = children;
children = ci; Not sure, but it could be related to this issue. |
UPDATE: It looks like the underlying issue has been fixed in R-devel r75467 - "Fix mc_select_children warning about non-existent children to wait for". The problem is still there in R 3.5.1 patched: $ R
R version 3.5.1 Patched (2018-10-20 r75479) -- "Feather Spray"
[...]
> job1 <- parallel::mcparallel(Sys.getpid())
> parallel::mccollect(job1, wait = FALSE)
$`287758`
[1] 287758
> job2 <- parallel::mcparallel(Sys.getpid())
> parallel::mccollect(job2, wait = FALSE)
$`288075`
[1] 288075
Warning message:
In selectChildren(jobs, timeout) :
cannot wait for child 287758 as it does not exist but is indeed fixed in R devel: $ R
R Under development (unstable) (2018-10-21 r75476) -- "Unsuffered Consequences"
[...]
> job1 <- parallel::mcparallel(Sys.getpid())
> parallel::mccollect(job1, wait = FALSE)
$`289242`
[1] 289242
> job2 <- parallel::mcparallel(Sys.getpid())
> parallel::mccollect(job2, wait = FALSE)
NULL
## wait a bit longer ...
> parallel::mccollect(job2, wait = FALSE)
$`328590`
[1] 328590 It's only if we call it again after already having collected the value that we get the warning: > parallel::mccollect(job2, wait = FALSE)
NULL
Warning message:
In selectChildren(jobs, timeout) :
cannot wait for child 328590 as it does not exist |
I can also confirm that future 1.8.0, the last version before the package started suppressing those warnings manually, produces the warnings when running in R 3.5.1 patched (and before): > library(future); plan(multicore, workers = 2L); fs <- lapply(1:2, FUN = future); values(fs)
[[1]]
[1] 1
[[2]]
[1] 2
Warning messages:
1: In selectChildren(job, timeout = timeout) :
cannot wait for child 375577 as it does not exist
2: In selectChildren(job, timeout = timeout) :
cannot wait for child 375576 as it does not exist
3: In selectChildren(pids[!fin], -1) :
cannot wait for child 375577 as it does not exist
4: In selectChildren(pids[!fin], -1) :
cannot wait for child 375577 as it does not exist but not when running R-devel ("3.6.0"), e.g. > library(future); plan(multicore, workers = 2L); fs <- lapply(1:2, FUN = future); values(fs)
[[1]]
[1] 1
[[2]]
[1] 2 From this I conclude we can drop the |
…ed only in R 3.5.0 and R 3.5.1 which is where they occur [#218]
This has now also been fixed in R 3.5.1 patched, which means the warnings will not appear in R 3.5.2 (if that is ever released). I can confirm that I don't see those warnings using R version 3.5.1 Patched (2018-11-06 r75555) and future 1.8.0. I've updated the develop code to suppress warnings only when running R 3.5.0 and R 3.5.1. I ignore older versions of R 3.5.1 patched, so running the develop version of future there will produce those warnings. |
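A version-gated suppression of this kind could be sketched as follows. The names `affected_r_version` and `maybe_suppress` are hypothetical and are not taken from the future source code; the sketch only assumes that the affected releases are exactly R 3.5.0 and R 3.5.1, as stated above.

```r
# Hypothetical predicate: TRUE only for the releases named in this thread.
# Note that getRversion() also reports "3.5.1" for R 3.5.1 patched, so a
# check like this cannot tell patched builds apart from the release.
affected_r_version <- function(v = getRversion()) {
  v <- as.numeric_version(v)
  v >= "3.5.0" && v <= "3.5.1"
}

# Hypothetical wrapper: suppress warnings only on affected versions.
# suppressWarnings() silences ALL warnings raised by 'expr', not just
# the "cannot wait for child" ones.
maybe_suppress <- function(expr, v = getRversion()) {
  if (affected_r_version(v)) suppressWarnings(expr) else expr
}

# Usage: the version is passed explicitly here only to show both branches
quiet <- maybe_suppress({ warning("simulated"); 1 }, v = "3.5.1")
```

On any other R version the expression is evaluated as-is, so warnings reach the user unchanged.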
Using suppressWarnings because, when Rv3.5.1 is used, the package issues loads of warnings of the form "cannot wait for child xxx as it does not exist". For more info on this (known) issue, see, e.g., <HenrikBengtsson/future#218>.
Hi,
First of all, thank you for the package.
After updating to R 3.5.0, the same code, on the same machine (Mac, macOS Sierra), with no other changes, started producing warnings like: cannot wait for child 15641 as it does not exist. There is a Monte Carlo simulation and I am calculating the net present value and internal rate of return for each cashflow.
The number of warnings is close to 50, whether for 100 or 10k simulations.
Best Regards, Rogério Normand.
PS: As the original code contains classified info, I altered the values/metrics/results, but the logic remains intact. Please keep the code confidential.