Add qs as a format option #1121
Comments
Implementation would be easy (perhaps too easy). But in the interest of controlling the number of formats we add so things do not get out of hand, I would first like to understand more about the behavior of qs. My own benchmarks so far are not that impressive. Perhaps we need larger and more complicated data.
library(microbenchmark)
library(qs)
#> qs v0.20.1: better serialization of S4 objects, see 'ChangeLog'
x <- 1
microbenchmark(
  wb = writeBin(x, tempfile()),
  rf = saveRDS(x, tempfile(), compress = FALSE),
  qs = qsave(x, tempfile())
)
#> Unit: microseconds
#>  expr    min      lq     mean median      uq     max neval
#>    wb 28.759 30.8775 32.60451 32.296 33.7700  47.332   100
#>    rf 29.086 30.6370 33.01477 31.796 32.6530 116.532   100
#>    qs 45.168 47.2715 53.92454 48.364 50.1735 541.282   100
x <- runif(1e8)
microbenchmark(
  wb = writeBin(x, tempfile()),
  rf = saveRDS(x, tempfile(), compress = FALSE),
  qs = qsave(x, tempfile()),
  times = 1
)
#> Unit: milliseconds
#>  expr       min        lq      mean    median        uq       max neval
#>    wb  623.1701  623.1701  623.1701  623.1701  623.1701  623.1701     1
#>    rf  850.3506  850.3506  850.3506  850.3506  850.3506  850.3506     1
#>    qs 1827.9655 1827.9655 1827.9655 1827.9655 1827.9655 1827.9655     1
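One way to probe the "larger and more complicated data" question using base R only is to compare how much two vectors of the same length shrink under compression. This is a minimal sketch, not part of the benchmarks above: the two test vectors and the helper `compressed_size()` are illustrative assumptions.

```r
# Base R only: compare compressed size for two numeric vectors of equal length.
n <- 1e5
random_x <- runif(n)                              # random doubles: little redundancy to exploit
repeated_x <- rep(c(1, 2, 3, 4), length.out = n)  # highly repetitive values

# Hypothetical helper: write x with gzip compression and return the file size in bytes.
compressed_size <- function(x) {
  path <- tempfile()
  on.exit(unlink(path))
  saveRDS(x, path, compress = TRUE)
  file.size(path)
}

raw_bytes <- 8 * n  # a double takes 8 bytes
ratio_random <- compressed_size(random_x) / raw_bytes
ratio_repeated <- compressed_size(repeated_x) / raw_bytes
print(c(random = ratio_random, repeated = ratio_repeated))
```

If the repetitive vector compresses far better than the `runif()` vector, that suggests `runif()` data is a hard case for any compressor, and that timings on it may understate what a format with fast compression can offer.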
I will note that part of the benefit of qs is the fast compression. I don't think [...]. Perhaps we could step back and consider an option that leverages [...].
Yeah, compression matters.
library(qs)
#> qs v0.20.1: better serialization of S4 objects, see 'ChangeLog'
library(pryr)
#> Registered S3 method overwritten by 'pryr':
#> method from
#> print.bytes Rcpp
x <- runif(1e7)
object_size(x)
#> 80 MB
rf <- tempfile()
rt <- tempfile()
ql <- tempfile()
qz <- tempfile()
system.time(saveRDS(x, rf, compress = FALSE))
#> user system elapsed
#> 0.101 0.032 0.132
system.time(saveRDS(x, rt, compress = TRUE))
#> user system elapsed
#> 10.290 0.008 10.339
system.time(qsave(x, ql, algorithm = "lz4"))
#> user system elapsed
#> 0.187 0.052 0.239
system.time(qsave(x, qz, algorithm = "zstd"))
#> user system elapsed
#> 0.193 0.040 0.233
file.size(rf) / 1e6
#> [1] 80.00003
file.size(rt) / 1e6
#> [1] 53.25956
file.size(ql) / 1e6
#> [1] 41.80436
file.size(qz) / 1e6
#> [1] 41.80436
Created on 2019-12-21 by the reprex package (v0.3.0)
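The lz4 and zstd files above have identical sizes, which may mean the `algorithm` argument was not applied. As I read the qs documentation, `algorithm` is ignored unless `preset = "custom"`. Here is a minimal sketch under that assumption. It is guarded so that it does nothing when qs is not installed, and the smaller vector is chosen only to keep it quick.

```r
# Assumption (from my reading of the qs docs): `algorithm` only takes effect
# when preset = "custom". The guard makes this a no-op if qs is unavailable.
if (requireNamespace("qs", quietly = TRUE)) {
  x <- runif(1e6)
  ql <- tempfile()
  qz <- tempfile()

  qs::qsave(x, ql, preset = "custom", algorithm = "lz4")
  qs::qsave(x, qz, preset = "custom", algorithm = "zstd")

  # File sizes in MB for each algorithm.
  sizes_mb <- c(lz4 = file.size(ql), zstd = file.size(qz)) / 1e6
  print(sizes_mb)

  # Round trip: the object read back should match what was written.
  x_back <- qs::qread(qz)
  stopifnot(identical(x_back, x))

  unlink(c(ql, qz))
}
```

If the two sizes now differ, the earlier comparison was probably measuring the default preset twice rather than lz4 against zstd.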
Prework
drake's code of conduct.
Proposal
This may or may not get implemented in storr as a backend for all files, but it may be worth just doing this directly in drake as an option, format = "qs", if it's easy enough? richfitz/storr#104