Performance: profvis(prof_input = large_file) #104

Open
wlandau opened this issue Jan 6, 2019 · 2 comments

wlandau commented Jan 6, 2019

I am profiling drake on a large test case. profvis(prof_output = "make_rprof_4096_64.Rprof") generates the raw output just fine, but when I open it with profvis(prof_input = "make_rprof_4096_64.Rprof") in the RStudio IDE on a mid-range Linux desktop, the interactive view responds too slowly to be usable.

Are there plans to enhance the scalability of profvis? When I convert the files to proto format for pprof and then run pprof -http=:8080 make_rprof_4096_64.proto, the interactive flame graph is super responsive.

# Profile the overhead incurred by drake on a large example.

n <- 4096
max_deps <- floor(sqrt(n))

# remotes::install_github("ropensci/drake")
library(drake)
library(fs)
library(profile)

# Let n be the number of targets.
# For i = 2, ..., n, target i depends on at most max_deps of its
# immediate predecessors (the last max_deps of targets 1 through i - 1).
# If max_deps is Inf, target i depends on all of targets 1 through i - 1,
# giving n * (n - 1) / 2 dependency connections (the maximum possible edges).
create_plan <- function(n, max_deps = sqrt(n)) {
  plan <- drake_plan(target_1 = 1)
  for (i in seq_len(n - 1) + 1){
    target <- paste0("target_", i)
    dependencies <- paste0("target_", tail(seq_len(i - 1), max_deps))
    command <- paste0("max(", paste0(dependencies, collapse = ", "), ")")
    plan <- rbind(plan, data.frame(target = target, command = command))
  }
  plan
}

config_rprof <- function(n, max_deps = sqrt(n)) {
  paste0("config_rprof_", n, "_", max_deps, ".Rprof")
}

make_rprof <- function(n, max_deps = sqrt(n)) {
  paste0("make_rprof_", n, "_", max_deps, ".Rprof")
}

overhead <- function(n, max_deps = sqrt(n)) {
  plan <- create_plan(n = n, max_deps = max_deps)
  cache <- new_cache(tempfile())
  profvis::profvis(
    config <- drake_config(plan = plan, cache = cache, verbose = 0L),
    prof_output = config_rprof(n, max_deps)
  )
  profvis::profvis(
    make(config = config),
    prof_output = make_rprof(n, max_deps)
  )
}

overhead(n, max_deps) # Could take a long time

# Convert profiling results to pprof-friendly format.
for (path in c(config_rprof(n, max_deps), make_rprof(n, max_deps))) {
  proto <- path_ext_set(path, "proto")
  data <- read_rprof(path)
  write_pprof(data, proto)
}

wch (Member) commented Jan 10, 2019

I agree that the slowness is a problem for large amounts of profile data. When I looked at this a while back (by profiling the JavaScript code :) ), I found that the main bottleneck was D3 rendering lots of SVG objects.

We currently don't have a timeline for working on profvis's rendering performance. I suspect that to make a significant improvement, the rendering code would have to be completely revamped.
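
Since the bottleneck described above scales with the number of rendered objects, one possible stopgap is to give profvis fewer samples to draw. This is not proposed anywhere in this thread and is only a sketch. Re-recording with a larger `interval` argument to profvis() is one way to do that. For a file that already exists, the sketch below thins it instead. `thin_rprof` and `keep_every` are hypothetical names, not part of profvis, and the code assumes the usual Rprof layout: a header line containing `sample.interval=<microseconds>`, followed by one line per sample, plus `#File N: ...` lines when line profiling is on. Thinning discards data, so short-lived calls may disappear from the result.

```r
# Hypothetical helper (not part of profvis): keep every `keep_every`-th
# sample of an Rprof file and scale the recorded sampling interval to match.
thin_rprof <- function(path, out, keep_every = 10L) {
  lines <- readLines(path)
  header <- lines[1]
  body <- lines[-1]

  # "#File N: ..." lines map file indices to paths when line profiling is on.
  # They are metadata rather than samples, so all of them are kept.
  is_meta <- grepl("^#File [0-9]+: ", body)
  sample_idx <- which(!is_meta)
  pick <- (seq_along(sample_idx) - 1L) %% keep_every == 0L
  keep <- is_meta
  keep[sample_idx[pick]] <- TRUE

  # Scale the interval so that total time stays approximately the same.
  interval <- as.numeric(
    sub(".*sample\\.interval=([0-9.e+]+).*", "\\1", header)
  )
  header <- sub(
    "sample\\.interval=[0-9.e+]+",
    paste0(
      "sample.interval=",
      format(interval * keep_every, scientific = FALSE)
    ),
    header
  )

  writeLines(c(header, body[keep]), out)
  invisible(out)
}

# Possible usage (the output file name is made up for illustration):
# thin_rprof("make_rprof_4096_64.Rprof", "make_rprof_thinned.Rprof", 10L)
# profvis::profvis(prof_input = "make_rprof_thinned.Rprof")
```

The thinned file is still a sampling profile, just a coarser one, so the overall shape of the flame graph should be similar while far fewer elements are drawn.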

wlandau (Author) commented Jan 17, 2019

cc @thomasp85
