Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Survival censoring weights #897

Merged
merged 20 commits into from
Mar 15, 2023
Merged

Survival censoring weights #897

merged 20 commits into from
Mar 15, 2023

Conversation

topepo
Copy link
Member

@topepo topepo commented Mar 1, 2023

Computations based on Graf (1999).

It:

  • computes the time when the censoring probabilities should be evaluated
  • calculates the Graf weights
  • uses a small convenience function to better test when tensorflow was installed.

(That last one is out of scope but I was getting massive failures with newer versions of tensorflow. )

If you want some code to play around with:

# pak::pak(c("tidymodels/yardstick#347"), ask = FALSE)
library(tidymodels)
library(censored)

# ------------------------------------------------------------------------------

tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)

# ------------------------------------------------------------------------------

lungs <- lung %>% select(time, status, age, ph.ecog)
lungs <- lungs[complete.cases(lungs), ]
lungs$surv <- Surv(lungs$time, lungs$status)
lungs <- lungs %>% select(-time, -status)


set.seed(7181)
split <- initial_split(lungs)
lung_tr <- training(split)
lung_te <- testing(split)

# ------------------------------------------------------------------------------

pred_times <- (1:100) * 4

set.seed(6589)
fit <-
  survival_reg() %>%
  fit(surv ~ ., data = lung_tr)

test_pred <-
  predict(fit, lung_te, type = "survival", time = pred_times) %>%
  bind_cols(lung_te %>% select(surv)) %>%
  add_rowindex()

# ------------------------------------------------------------------------------

test_pred_with_weights <-
  unnest(test_pred, cols = .pred) %>%
  rename(eval_time = .time) %>%
  full_join(
    .censoring_weights_graf(fit, test_pred, eval_time = pred_times),
    by = c(".row", "eval_time")
  ) %>%
  group_by(eval_time)

brier_survival(test_pred_with_weights, surv, .pred_survival, .weight_cens, eval_time) %>%
  ggplot(aes(eval_time, .estimate)) +
  geom_point() +
  geom_hline(yintercept = (1 - 1/2)^2, col = "red", alpha = 1/2)

@topepo topepo marked this pull request as ready for review March 1, 2023 23:40
Copy link
Member

@hfrick hfrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor correction in the docs about the definition of the Graf categories, otherwise just suggestions 👍

R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
tests/testthat/test-ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
DESCRIPTION Outdated Show resolved Hide resolved
@EmilHvitfeldt
Copy link
Member

EmilHvitfeldt commented Mar 13, 2023

I'm not able to run the above code. I am however able to R-CMD-CHECK the branch with no problem.

# pak::pak(c("tidymodels/yardstick#347"), ask = FALSE)
library(tidymodels)
library(censored)
#> Loading required package: survival

lungs <- lung %>% select(time, status, age, ph.ecog)
lungs <- lungs[complete.cases(lungs), ]
lungs$surv <- Surv(lungs$time, lungs$status)
lungs <- lungs %>% select(-time, -status)

set.seed(7181)
split <- initial_split(lungs)
lung_tr <- training(split)
lung_te <- testing(split)
pred_times <- (1:100) * 4

set.seed(6589)
fit <-
  survival_reg() %>%
  fit(surv ~ ., data = lung_tr)

test_pred <-
  predict(fit, lung_te, type = "survival", time = pred_times) %>%
  bind_cols(lung_te %>% select(surv)) %>%
  add_rowindex()
.censoring_weights_graf(fit, test_pred, eval_time = pred_times)
#> Error in `dplyr::mutate()`:
#> ℹ In argument: `.prob_cens = predict(object$censor_probs, time =
#>   weight_time, as_vector = TRUE)`.
#> Caused by error in `purrr::map_dbl()` at parsnip/R/censoring_probs.R:77:4:
#> ℹ In index: 1.
#> Caused by error in `UseMethod()`:
#> ! no applicable method for 'predict' applied to an object of class "try-error"

#> Backtrace:
#>      ▆
#>   1. ├─parsnip::.censoring_weights_graf(fit, test_pred, eval_time = pred_times)
#>   2. ├─parsnip:::.censoring_weights_graf.model_fit(fit, test_pred, eval_time = pred_times) at parsnip/R/ipcw.R:164:2
#>   3. │ └─... %>% ... at parsnip/R/ipcw.R:210:2
#>   4. ├─dplyr::select(., .row, eval_time, .prob_cens, .weight_cens)
#>   5. ├─dplyr::mutate(...)
#>   6. ├─dplyr:::mutate.data.frame(...)
#>   7. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   8. │   ├─base::withCallingHandlers(...)
#>   9. │   └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>  10. │     └─mask$eval_all_mutate(quo)
#>  11. │       └─dplyr (local) eval()
#>  12. ├─stats::predict(object$censor_probs, time = weight_time, as_vector = TRUE)
#>  13. ├─parsnip:::predict.censoring_model_reverse_km(...)
#>  14. │ └─purrr::map_dbl(time, ~predict(object$fit, times = .x, type = "surv")) at parsnip/R/censoring_probs.R:77:4
#>  15. │   └─purrr:::map_("double", .x, .f, ..., .progress = .progress)
#>  16. │     ├─purrr:::with_indexed_errors(...)
#>  17. │     │ └─base::withCallingHandlers(...)
#>  18. │     ├─purrr:::call_with_cleanup(...)
#>  19. │     └─parsnip (local) .f(.x[[i]], ...)
#>  20. │       └─stats::predict(object$fit, times = .x, type = "surv")
#>  21. └─base::.handleSimpleError(...)
#>  22.   └─purrr (local) h(simpleError(msg, call))
#>  23.     └─cli::cli_abort(...)
#>  24.       └─rlang::abort(...)

fit$censor_probs
#> $formula
#> surv ~ .
#> <environment: 0x11af31568>
#> 
#> $fit
#> [1] "Error in EventHistory.frame(formula = formula, data = data, unspecialsDesign = FALSE,  : \n  formula is NOT a proper survival formula,\nwhich must have a `Surv' or `Hist' object as response.\n"
#> attr(,"class")
#> [1] "try-error"
#> attr(,"condition")
#> <simpleError in EventHistory.frame(formula = formula, data = data, unspecialsDesign = FALSE,     specials = c("Strata", "strata", "factor", "NN", "cluster"),     stripSpecials = c("strata", "cluster", "NN"), stripAlias = list(strata = c("Strata",         "factor")), stripArguments = list(strata = NULL, NN = NULL,         cluster = NULL), specialsDesign = FALSE, check.formula = TRUE): formula is NOT a proper survival formula,
#> which must have a `Surv' or `Hist' object as response.>
#> 
#> $label
#> [1] "reverse_km"
#> 
#> $required_pkgs
#> [1] "prodlim"
#> 
#> attr(,"class")
#> [1] "censoring_model_reverse_km" "censoring_model"

sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Monterey 12.6
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Los_Angeles
#>  date     2023-03-12
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.4.1      2021-12-13 [2] CRAN (R 4.2.0)
#>  broom        * 1.0.3      2023-01-25 [1] CRAN (R 4.2.0)
#>  censored     * 0.1.1.9001 2023-03-13 [1] Github (tidymodels/censored@f8d2dc1)
#>  class          7.3-20     2022-01-16 [2] CRAN (R 4.2.1)
#>  cli            3.6.0      2023-01-09 [1] CRAN (R 4.2.0)
#>  codetools      0.2-18     2020-11-04 [2] CRAN (R 4.2.1)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.2.0)
#>  dials        * 1.1.0.9000 2023-03-09 [1] Github (tidymodels/dials@ec3cd51)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.2.0)
#>  digest         0.6.31     2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr        * 1.1.0      2023-01-29 [1] CRAN (R 4.2.0)
#>  evaluate       0.20       2023-01-17 [1] CRAN (R 4.2.0)
#>  fansi          1.0.4      2023-01-22 [1] CRAN (R 4.2.0)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.2.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.2.0)
#>  fs             1.6.1      2023-02-06 [1] CRAN (R 4.2.0)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.2.0)
#>  future         1.32.0     2023-03-07 [1] CRAN (R 4.2.0)
#>  future.apply   1.10.0     2022-11-05 [1] CRAN (R 4.2.1)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2      * 3.4.1      2023-02-10 [1] CRAN (R 4.2.0)
#>  globals        0.16.2     2022-11-21 [1] CRAN (R 4.2.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.2.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.2.0)
#>  gtable         0.3.1      2022-09-01 [1] CRAN (R 4.2.0)
#>  hardhat        1.2.0.9000 2023-02-06 [1] Github (tidymodels/hardhat@c2c896c)
#>  htmltools      0.5.4      2022-12-07 [1] CRAN (R 4.2.0)
#>  infer        * 1.0.4      2022-12-02 [1] CRAN (R 4.2.1)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.2.1)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.2.0)
#>  knitr          1.42       2023-01-25 [1] CRAN (R 4.2.0)
#>  lattice        0.20-45    2021-09-22 [2] CRAN (R 4.2.1)
#>  lava           1.7.2.1    2023-02-27 [1] CRAN (R 4.2.0)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.2.0)
#>  lifecycle      1.0.3      2022-10-07 [1] CRAN (R 4.2.0)
#>  listenv        0.9.0      2022-12-16 [1] CRAN (R 4.2.0)
#>  lubridate      1.9.2      2023-02-10 [1] CRAN (R 4.2.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  MASS           7.3-57     2022-04-22 [2] CRAN (R 4.2.1)
#>  Matrix         1.5-3      2022-11-11 [1] CRAN (R 4.2.0)
#>  modeldata    * 1.1.0.9000 2023-03-08 [1] local
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  nnet           7.3-17     2022-01-16 [2] CRAN (R 4.2.1)
#>  parallelly     1.34.0     2023-01-13 [1] CRAN (R 4.2.0)
#>  parsnip      * 1.0.4.9002 2023-03-13 [1] local
#>  pillar         1.8.1      2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  prodlim        2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#>  purrr        * 1.0.1      2023-01-10 [1] CRAN (R 4.2.0)
#>  R.cache        0.16.0     2022-07-21 [2] CRAN (R 4.2.0)
#>  R.methodsS3    1.8.2      2022-06-13 [2] CRAN (R 4.2.0)
#>  R.oo           1.25.0     2022-06-12 [2] CRAN (R 4.2.0)
#>  R.utils        2.12.2     2022-11-11 [1] CRAN (R 4.2.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp           1.0.10     2023-01-22 [1] CRAN (R 4.2.0)
#>  recipes      * 1.0.5.9000 2023-03-08 [1] local
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang          1.0.6      2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown      2.20       2023-01-19 [1] CRAN (R 4.2.0)
#>  rpart          4.1.16     2022-01-24 [2] CRAN (R 4.2.1)
#>  rsample      * 1.1.1      2022-12-07 [1] CRAN (R 4.2.0)
#>  rstudioapi     0.14       2022-08-22 [1] CRAN (R 4.2.0)
#>  scales       * 1.2.1      2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo    1.2.2      2021-12-06 [2] CRAN (R 4.2.0)
#>  styler         1.9.0      2023-01-15 [1] CRAN (R 4.2.0)
#>  survival     * 3.5-3      2023-02-12 [1] CRAN (R 4.2.0)
#>  tibble       * 3.2.0      2023-03-08 [1] CRAN (R 4.2.1)
#>  tidymodels   * 1.0.0.9000 2023-03-06 [1] local
#>  tidyr        * 1.3.0      2023-01-24 [1] CRAN (R 4.2.0)
#>  tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.2.0)
#>  timechange     0.2.0      2023-01-11 [1] CRAN (R 4.2.0)
#>  timeDate       4022.108   2023-01-07 [1] CRAN (R 4.2.1)
#>  tune         * 1.0.1.9003 2023-03-02 [1] local
#>  utf8           1.2.3      2023-01-31 [1] CRAN (R 4.2.0)
#>  vctrs          0.5.2      2023-01-23 [1] CRAN (R 4.2.0)
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  workflows    * 1.1.3      2023-02-22 [1] CRAN (R 4.2.0)
#>  workflowsets * 1.0.0      2022-07-12 [1] CRAN (R 4.2.0)
#>  xfun           0.37       2023-01-31 [1] CRAN (R 4.2.0)
#>  yaml           2.3.7      2023-01-23 [1] CRAN (R 4.2.0)
#>  yardstick    * 1.1.0.9000 2023-03-13 [1] Github (tidymodels/yardstick@c8049ca)
#> 
#>  [1] /Users/emilhvitfeldt/Library/R/arm64/4.2/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Created on 2023-03-12 with reprex v2.0.2

@EmilHvitfeldt
Copy link
Member

filter_eval_time() needs more guarding here.

if (fail && identical(eval_time, numeric(0))) {

as it stops zero lengths doubles but not zero length integers

.filter_eval_time(integer(0))
#> integer(0)
.filter_eval_time(numeric(0))
#> Error:
#> ! There were no usable evaluation times (finite, non-missing, and >= 0).

#> Backtrace:
#>     ▆
#>  1. └─global .filter_eval_time(numeric(0))
#>  2.   └─rlang::abort(...)

R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
R/ipcw.R Outdated Show resolved Hide resolved
Copy link
Member

@EmilHvitfeldt EmilHvitfeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A couple suggestions and thoughts

hfrick added a commit that referenced this pull request Mar 14, 2023
* check type for all modes

* bump version in anticipation of #897 being merged first

* update NEWS

* update to merge PR
topepo and others added 3 commits March 15, 2023 07:53
Co-authored-by: Hannah Frick <hfrick@users.noreply.github.com>
Co-authored-by: Emil Hvitfeldt <emilhhvitfeldt@gmail.com>
@topepo topepo merged commit fa741d9 into main Mar 15, 2023
@topepo topepo deleted the survival-censoring-weights branch March 15, 2023 15:47
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants