Add support for disk.frame #1004
Comments
Yeah, let's definitely explore this. disk.frame looks really promising at a glance, but I will need to try it out and learn more about it. As we go forward, we might also be able to use this approach to reduce the memory consumption of the split transformation, i.e. drake_plan(x = target(..., transform = split(...))).
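For readers unfamiliar with the split transformation mentioned above, here is a minimal sketch of what such a plan might look like. The target names `large_data` and `row_count`, the use of `mtcars`, and the choice of four slices are illustrative assumptions, not taken from this thread.

```r
library(drake)

# Hypothetical plan: `large_data` and `row_count` are illustrative names.
plan <- drake_plan(
  large_data = mtcars,
  row_count = target(
    nrow(large_data),
    # Ask drake to partition `large_data` into 4 slices and
    # generate one downstream target per slice.
    transform = split(large_data, slices = 4)
  )
)

# Inspect the expanded plan: one row for `large_data`
# plus one row per slice.
print(plan)
```

Each generated target loads its own slice of `large_data`, which is why in-memory size matters here and why an on-disk representation could help.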
After reading the quick start guide and the article on ingesting data, and I think a

library(nycflights13)
library(dplyr)
library(disk.frame)
library(data.table)
library(drake)
plan <- drake_plan(
  flights.df = target(
    as.disk.frame(
      flights,
      outdir = file.path(tempdir(), "tmp_flights.df"),
      overwrite = TRUE
    ),
    format = "disk.frame"
  )
)

But there are some big caveats:
Is all this reasonable?
I probably need time to digest this. Apologies if I appear slow when I mull over this. Really appreciate your responsiveness here.
Sure, we can take our time. Let me know if you need clarification.
A benefit of
If
Would it work that when you run create a disk.frame inside a
Seems like the most expedient approach we have. A couple thoughts:
By the way, do you have ideas on how we disable
Possible option is to have a
Automatic metaprogramming is achievable if the affected commands in the plan contain

Personally, I am more concerned about clarity than automation. As long as people know how to use the

To be clear, the only consequence of misspecifying
Implemented in #1022.
drake has recently added support for fst. It would be great to add disk.frame as a storage medium as well. E.g. make it possible to do the below.
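As background for the request above, the following is a minimal sketch of how the existing fst format is used in a drake plan. The target name `small_table`, the toy data frame, and the temporary cache are illustrative assumptions, and the sketch assumes the fst package is installed.

```r
library(drake)

# Hypothetical plan: `small_table` is an illustrative target name.
plan <- drake_plan(
  small_table = target(
    data.frame(id = 1:3, label = c("a", "b", "c")),
    # Store the target's return value with fst instead of
    # drake's default serialization.
    format = "fst"
  )
)

# Use a throwaway cache so the example does not write
# a .drake/ folder into the working directory.
cache <- new_cache(tempfile())
make(plan, cache = cache)

# Read the stored target back as a data frame.
out <- readd(small_table, cache = cache)
```

A disk.frame format would presumably follow the same pattern, with the `format` argument selecting how the target is stored.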