Support saving quasi-inverses on disk #29
Comments
I've been thinking about this for some time and here's my idea so far. Using zero-copy deserialization, it would be possible to deserialize
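I was thinking we could implement zero-copy deserialization with the [`rkyv`](https://docs.rs/rkyv/0.6.4/rkyv/) crate. The hurdle so far is that mutexes aren't supported, but that's coming in version 0.7. Does that seem like a good way to proceed?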
On Thu, May 20, 2021 at 11:13:28PM -0700, Joey Beauvais-Feisthauer wrote:
> I was thinking we could implement zero-copy deserialization with the [`rkyv`](https://docs.rs/rkyv/0.6.4/rkyv/) crate. The hurdle so far is that mutexes aren't supported, but that's coming in version 0.7. Does that seem like a good way to proceed?
In my experience the largest problem is supporting Arc. The heavy
inter-dependence means the deserialization function needs to accept
custom arguments as "auxiliary data".
One option is to go even more low level, and write a custom mmap-backed
allocator. For our purposes this can simply be a bump allocator on a
mmapped region. We then need to modify all our objects to support custom
allocators. The problem with this is that the code will likely become
much harder to maintain, since this mmap option will result in different
types, which have to be chosen at compile time.
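
To make the bump-allocator idea concrete, here is a minimal sketch. It is not the project's code: `BumpRegion` and its methods are hypothetical names, and a plain `Vec<u8>` stands in for the mmapped region, since the standard library has no mmap API. It only hands out offsets into the region and does not implement Rust's allocator traits.

```rust
/// Hypothetical bump allocator over a fixed-size region.
/// Assumption: a `Vec<u8>` stands in for the mmapped region.
pub struct BumpRegion {
    region: Vec<u8>,
    /// Offset of the first unused byte.
    next: usize,
}

impl BumpRegion {
    pub fn with_capacity(capacity: usize) -> Self {
        Self { region: vec![0u8; capacity], next: 0 }
    }

    /// Reserve `size` bytes at an offset that is a multiple of `align`
    /// (which must be a power of two). Returns the offset, or `None`
    /// if the region is exhausted. Nothing is ever freed individually.
    /// Note: offsets are aligned relative to the start of the region;
    /// a real mmapped region is page-aligned, a `Vec<u8>` need not be.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.region.len() {
            return None;
        }
        self.next = end;
        Some(start)
    }

    /// Mutable view of a previously reserved range.
    pub fn slice_mut(&mut self, offset: usize, size: usize) -> &mut [u8] {
        &mut self.region[offset..offset + size]
    }

    /// Number of bytes consumed so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.next
    }
}
```

Making the existing objects generic over such an allocator is the part that would spread through the codebase, which is the maintenance concern raised above.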
I don't think that should be a big issue. Since everything would be zero-copy, the right data structures would already be in place before deserialization starts, and in any case we can write our own deserializer that uses an arbitrary
Implementing a new allocator would be feasible, but that's quite low level, and I'm not sure it would solve everything. If I understand correctly, allocation would have to depend on an mmap that could be either file-backed (with a dynamically known file handle) or anonymous. We would then have to interact with it once the computation for a given t has finished, and I'm not even sure Rust allows explicitly interacting with the allocator that way. Otherwise, the allocator would have to somehow figure out whether a quasi-inverse can be safely swapped out based on the other data it has access to, and only when Rust happens to call it.
Here's another thought. If |
@dalcde I think your latest PR solves this as well, right?
Currently, we store all quasi-inverses in memory, and they account for the bulk of our memory consumption (e.g. up to the 140th stem, we need 25.8 MiB to store the differentials and 40 GiB for the quasi-inverses). If we want to resolve to larger degrees, we will run out of memory on any reasonable system.
Since these quasi-inverses are used at very predictable times, we ought to be able to store them on disk and load them on demand. Perhaps a custom allocator could be used for this purpose.
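
As a rough illustration of "store on disk, load on demand", here is a minimal sketch using only the standard library. It is a plain copying format, not the zero-copy or allocator-based designs discussed in the comments, and everything in it is an assumption: the function names, the `qi_{s}_{t}.bin` file naming, and the representation of a quasi-inverse as a flat slice of `u64` words are all hypothetical.

```rust
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::{Path, PathBuf};

/// Hypothetical file layout: one file per bidegree (s, t).
fn qi_path(dir: &Path, s: u32, t: i32) -> PathBuf {
    dir.join(format!("qi_{}_{}.bin", s, t))
}

/// Write the words as a little-endian length prefix followed by the data.
pub fn save_quasi_inverse(dir: &Path, s: u32, t: i32, words: &[u64]) -> io::Result<()> {
    let mut out = BufWriter::new(File::create(qi_path(dir, s, t))?);
    out.write_all(&(words.len() as u64).to_le_bytes())?;
    for w in words {
        out.write_all(&w.to_le_bytes())?;
    }
    out.flush()
}

/// Read the words back; the caller drops the `Vec` when done with it,
/// so only the quasi-inverses currently in use occupy memory.
pub fn load_quasi_inverse(dir: &Path, s: u32, t: i32) -> io::Result<Vec<u64>> {
    let mut input = BufReader::new(File::open(qi_path(dir, s, t))?);
    let mut buf = [0u8; 8];
    input.read_exact(&mut buf)?;
    let len = u64::from_le_bytes(buf);
    // Grow as we read instead of trusting the length prefix for preallocation.
    let mut words = Vec::new();
    for _ in 0..len {
        input.read_exact(&mut buf)?;
        words.push(u64::from_le_bytes(buf));
    }
    Ok(words)
}
```

Because the access pattern is predictable, a caller could save each quasi-inverse as soon as its computation finishes, drop it from memory, and reload it only at the step that needs it.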