-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a persistent vector instead of an Rc<[..]>
#2057
Conversation
Bencher Report
Click to view all benchmark results
|
I'll see what it gives on the private benchmark |
I tried out imbl, but the performance is worse than rpds. |
The current version uses a custom re-implementation of persistent vectors, and it seems to be a performance win across the board. |
Rc<[..]>
@jneem Have you tried on the private bench? |
Yes, I forgot to mention that. Performance is basically identical on all three sizes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be nice to add a description to the vector
crate.
I'm not too intimate with bitmapped vector tries, so I can't say I'm 100% sure that the whole implementation is flawless, but the general approach looks sane, the testing is also solid, and it's been thoroughly benchmarked.
core/src/transform/free_vars.rs
Outdated
let new_ts = ts | ||
.iter() | ||
.cloned() | ||
.map(|mut t| { | ||
t.collect_free_vars(free_vars); | ||
t | ||
}) | ||
.collect(); | ||
*ts = new_ts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my pet peeve, but I feel like this should be an imperative for, as we're just walking a structure and applying a mutation. Or is it that you just don't want to bother implementing iter_mut()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I went ahead and did mutable iteration. It has a bit more copy-paste from the other iterators, unfortunately, but maybe that will be more motivation to figure out a generic version...
//! [`Vector`] is a persistent vector (also known as a "bitmapped vector trie") | ||
//! with cheap clones and efficient copy-on-write modifications. [`Slice`] | ||
//! backs the implementation of arrays in Nickel. It's basically a [`Vector`] | ||
//! with support for slicing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you followed a particular paper or source to implement them? Or took inspiration from another crate? If yes, it could be good to link it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly just https://hypirion.com/musings/understanding-persistent-vector-pt-1, which I linked a little later. I looked at rpds
to see what choices they were making, but I didn't really imitate their implementation otherwise.
vector/src/vector.rs
Outdated
} | ||
} | ||
|
||
/// [`Vector`] is a persistent vector (also known as a "bitmapped vector trie"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I feel like this should be the module's documentation, and not Vector
.
By the way, I had another side question: could this representation take advantage of the in-place modification when a value is 1-RC ? I think the answer is yes, from what I remember reading Clojure's persistent array blog post, but just to make sure. |
Yep, that should be the case already. We use |
The last commit fails some test on Windows only it seems (but I think there is property-based testing, so the Windows part might be a red herring and it's just that some random path leading to the panic):
|
This is an experiment regarding array performance. Our current array representation (as essentially a
Rc<[RichTerm]>
) is problematic because it makes many common operations unnecessarily quadratic. This PR replaces it with arpds::Vector<RichTerm>
in reverse order, and the initial benchmarks look promising.In more detail, the current arrays have a few performance characteristics that we might like to keep:
The main problem is that there's no constant-time "cons" operation, and the concatenation operator
xs @ ys
isO(xs.len() + ys.len())
. This makes many functional-style list functions (like the stdlib implementations ofreverse
andfilter
) quadratic in the length of their input.The
Vector
implementation in therpds
crate is a "persistent vector" aka "bitmapped vector trie", which offers persistence/sharing, fast random access, and fast appends. We can do the same slicing trick that we're current using forRc<[RichTerm]>
to also add fast slicing. Thanks to fast appends, we can do concatenationxs @ ys
inO(ys.len())
time (provided that there are no contracts that need to be applied toxs
; I'm also ignoring logarithmic terms). This is backwards from the more common concatenation pattern in functional languages, so we store arrays backwards in order to get timeO(xs.len())
instead. (We could achieve the minimum of the two by storing an array as twoVectors
, a backwards one followed by a forwards one.)There are a few ways in which
rpds::Vector
isn't a perfect fit:Arc
orRc
, which we don't want becauseRichTerm
already has a shared pointerDespite these, this PR gives a 35% improvement in
random normal
, and a few other improvements between 10 and 20%. I'd like to also try benchmarkingim
and/orimbl
.