
Introduce avector crate at v0.0.0 #1399

Closed
wants to merge 50 commits into from

Conversation

@indietyp (Member) commented Nov 15, 2022

🌟 What is the purpose of this PR?

This PR introduces a new crate, avector: a read-optimized vector that can be accessed concurrently (for reads) and can only be pushed to.

This crate:

  • will be used for hooks in deer
  • will probably be used for no-std compatible serde and defmt hooks for error-stack
  • may be used to enable Debug hooks on no-std for error-stack.

Regarding error-stack Debug: the "problem" is that append-vec only provides read access to already-pushed items, while hooks in error-stack can be removed and overwritten with a new hook. This can be solved by using arc-swap (or similar), or by simply ignoring already-set hooks and "wasting" space.

Reasoning

This crate originates from deer's need for a thoroughly tested and verified append-only vector that is available on no-std systems. Existing crates such as scc, appendlist, intrusive-collections, sento, or append-only-vec do not fit those requirements: they are only available on std, are not concurrently accessible, or are poorly tested. This crate tries to remedy all of those problems.

🔗 Related links

🚫 Blocked by

🔍 What does this change?

  • introduce a new crate, avector

📜 Does this require a change to the docs?

Documentation is still needed, as this is a completely new crate.

⚠️ Known issues

Currently append-vec has no std option and uses a spinlock either way. One could remove the spinlock and replace it with a full Mutex on std, though benchmarks have shown that, because push is fast, the impact of spinning is small.
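
For illustration, a minimal sketch of what such a std/no-std split could look like (the `std` feature name, the `Lock` alias, and the use of the spin crate are assumptions, not code from this PR):

```rust
// Hypothetical: pick the lock implementation at compile time.
// On `std`, use the OS-backed mutex; on no-std, fall back to a spinlock
// (e.g. from the `spin` crate).
#[cfg(feature = "std")]
type Lock<T> = std::sync::Mutex<T>;

#[cfg(not(feature = "std"))]
type Lock<T> = spin::Mutex<T>;
```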

🐾 Next steps

  • Documentation (next PR)
  • Tests
  • Infrastructure (CI)
  • README (next PR)

The commits are cherry-picked from #1396 via git format-patch -k --stdout bm/deer/error-visitor...bm/deer/error-hooks -- append-vec | git am -3 -k
See https://stackoverflow.com/a/63745988/9077988 for more information.

📹 Demo

Coming Soon™

@github-actions github-actions bot added the area/libs > deer Affects the `deer` crate (library) label Nov 15, 2022
@github-actions github-actions bot added the area/infra Relates to version control, CI, CD or IaC (area) label Nov 15, 2022
packages/libs/append-vec/CHANGELOG.md (outdated, resolved)
packages/libs/append-vec/.gitignore (outdated, resolved)
packages/libs/append-vec/Cargo.toml (outdated, resolved)
packages/libs/append-vec/Makefile.toml (outdated, resolved)
packages/libs/append-vec/rust-toolchain.toml (outdated, resolved)
Co-authored-by: Tim Diekmann <21277928+TimDiekmann@users.noreply.github.com>
Co-authored-by: Tim Diekmann <21277928+TimDiekmann@users.noreply.github.com>
@indietyp indietyp requested a review from TimDiekmann November 16, 2022 22:00
@indietyp (Member Author) commented Nov 16, 2022

note: coverage is failing because --all-targets includes benches, which should be excluded. This was also a problem with loom and miri, but there I was able to disable benches via cfg flags.

It turns out cargo-llvm-cov sets a coverage cfg flag, which I was able to exploit.
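
For reference, a minimal sketch of how a bench target can be compiled out under coverage (the file name and criterion usage are assumptions; the source above only confirms that cargo-llvm-cov sets a `coverage` cfg flag):

```rust
// benches/push.rs (hypothetical) — assumes `harness = false` in Cargo.toml.
// cargo-llvm-cov passes the `coverage` cfg, so the benchmark is compiled out
// when collecting coverage, while the target still builds with --all-targets.
#[cfg(not(coverage))]
use criterion::{black_box, criterion_group, criterion_main, Criterion};

#[cfg(not(coverage))]
fn bench_push(c: &mut Criterion) {
    c.bench_function("push", |b| {
        b.iter(|| {
            let mut v = Vec::new();
            for i in 0..1_000u64 {
                v.push(black_box(i));
            }
            v
        })
    });
}

#[cfg(not(coverage))]
criterion_group!(benches, bench_push);
#[cfg(not(coverage))]
criterion_main!(benches);

// The bench binary still needs an entry point when everything above is cfg'd out.
#[cfg(coverage)]
fn main() {}
```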

@@ -0,0 +1,4 @@
[toolchain]
# Please also update the badges in `README.md`, `src/lib.rs`, and `macros/` when changing this
Member
The readme is empty, we don't mention the toolchain in lib.rs, and we also don't have a macros folder.
I think it makes sense to add at least a small README that also contains the Rust version.

Member Author
fixed in 8cf1cf5 and a49ed41

codecov bot commented Nov 22, 2022

Codecov Report

Merging #1399 (6055f86) into main (6b438fe) will increase coverage by 4.88%.
The diff coverage is 98.49%.

@@            Coverage Diff             @@
##             main    #1399      +/-   ##
==========================================
+ Coverage   40.31%   45.19%   +4.88%     
==========================================
  Files         309      316       +7     
  Lines       16328    17447    +1119     
  Branches      813      813              
==========================================
+ Hits         6582     7885    +1303     
+ Misses       9741     9557     -184     
  Partials        5        5              
| Flag | Coverage Δ |
| --- | --- |
| avector | 98.49% <98.49%> (?) |
| backend-integration-tests | 3.20% <ø> (ø) |
| deer | 67.80% <ø> (+67.14%) ⬆️ |
| error-stack | 90.44% <ø> (ø) |
| hash-graph | 66.59% <ø> (+3.42%) ⬆️ |
| unit-tests | 1.17% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| packages/libs/avector/src/lib.rs | 98.49% <98.49%> (ø) |
| packages/libs/deer/src/number.rs | 0.00% <0.00%> (ø) |
| packages/libs/deer/src/error.rs | |
| packages/libs/deer/src/error/value.rs | 98.62% <0.00%> (ø) |
| packages/libs/deer/src/error/tuple.rs | 80.00% <0.00%> (ø) |
| packages/libs/deer/src/error/location.rs | 100.00% <0.00%> (ø) |
| packages/libs/deer/src/error/type.rs | 99.10% <0.00%> (ø) |
| packages/libs/deer/src/error/extra.rs | 98.88% <0.00%> (ø) |
| packages/libs/deer/src/error/mod.rs | 60.71% <0.00%> (ø) |
| packages/libs/deer/src/error/unknown.rs | 99.11% <0.00%> (ø) |

... and 12 more


@TimDiekmann (Member) left a comment

I'm not really sure about the implementation of the vector. I'm not sold on the idea of basically mixing a linked list with a vector approach and wrapping everything in a spin lock. Why not just wrap a vector in a spinlocked mutex?

packages/libs/avector/src/lib.rs (resolved)
packages/libs/avector/src/lib.rs (resolved)
packages/libs/avector/src/lib.rs (resolved)
@@ -0,0 +1,551 @@
//! Concurrent Read Optimized Vector
Member

What do you mean by "read optimized"?
As far as I can see from the description below, you have a linked list with N elements per bucket. Read-optimized to me means that both index operations and iteration are fast, but neither of them is if N is small (size_of::<T>() * N < cache line). For an index operation you need to follow i / N pointers to reach index i mod N; for iterating you have to dereference length / N pointers. Compared to a vector, where N is effectively infinite, this does not sound read-optimized.

@indietyp (Member Author), Nov 22, 2022

Yes, the description isn't accurate; I will replace it. As you outlined, this type won't ever be as fast as Vec for read-only operations (this also depends on factors such as whether an Arc, a RwLock, or a Mutex is used). What I mean is that AVec is optimized for read access rather than write access: AVec never re-allocates, and writes take a lock while reads do not.
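
For readers following along, a minimal sketch of the layout being discussed (names, the bucket size, and the lock detail are assumptions, not the code in this PR):

```rust
// Hypothetical layout: fixed-size buckets chained like a linked list.
// `push` takes a spinlock and only ever appends (no re-allocation, so
// references to existing elements stay valid); reads take no lock at all.
use core::cell::UnsafeCell;
use core::mem::MaybeUninit;
use core::sync::atomic::{AtomicBool, AtomicPtr, AtomicUsize, Ordering};

const N: usize = 32; // elements per bucket (assumed)

struct Bucket<T> {
    items: [UnsafeCell<MaybeUninit<T>>; N],
    next: AtomicPtr<Bucket<T>>, // allocated lazily by `push`
}

struct AVec<T> {
    head: AtomicPtr<Bucket<T>>,
    len: AtomicUsize,    // published with `Release` after a slot is written
    writing: AtomicBool, // spinlock guarding `push` only
}

impl<T> AVec<T> {
    fn get(&self, index: usize) -> Option<&T> {
        // `Acquire` pairs with the `Release` in `push`: every slot below
        // `len` is fully initialized, so reading it needs no lock.
        if index >= self.len.load(Ordering::Acquire) {
            return None;
        }
        // Walk `index / N` bucket pointers, then read slot `index % N`.
        // This pointer chasing is exactly why indexing is slower than on
        // a plain `Vec`, as pointed out above.
        todo!("follow `next` pointers and read the slot")
    }
}
```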

packages/libs/avector/src/lib.rs (outdated, resolved)
}
}

// SAFETY: We use `UnsafeCell`, the referred `Bucket` is `Send` if `T` is `Send`
Member

This is not a safety comment; a safety comment should argue why it is safe to access the struct from two locations at the same time.

packages/libs/avector/src/lib.rs (outdated, resolved)
packages/libs/avector/src/lock.rs (outdated, resolved)
packages/libs/avector/src/lock.rs (outdated, resolved)
packages/libs/avector/src/lock.rs (outdated, resolved)
packages/libs/avector/README.md (outdated, resolved)
packages/libs/avector/README.md (outdated, resolved)
packages/libs/avector/README.md (outdated, resolved)
indietyp and others added 4 commits November 22, 2022 12:00
Co-authored-by: Tim Diekmann <21277928+TimDiekmann@users.noreply.github.com>
Co-authored-by: Tim Diekmann <21277928+TimDiekmann@users.noreply.github.com>
packages/libs/avector/README.md (outdated, resolved)
packages/libs/avector/src/lock.rs (outdated, resolved)
packages/libs/avector/src/lock.rs (outdated, resolved)
Comment on lines +1 to +4
//! Benchmark usecases against:
//! * (alloc) built-in Vec
//! * (std) intrusive-collections
//! * (std) im
Member

I think these benchmarks are missing the most important operation, the one where this crate could shine: index read operations. The benchmarks assume you have a batch of data to insert or iterate over, so globally a Mutex/RwLock is used to lock the data and perform the operations as if they were single-threaded. The larger the batch of data, the smaller the impact of the atomic operations.
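
A hypothetical shape for such a benchmark, for illustration only (criterion, the container choice, and the access pattern are assumptions, not code from this PR):

```rust
// Hypothetical: measure indexed reads on an already-filled, shared container,
// which is the access pattern the comment above says is missing.
use std::sync::{Arc, RwLock};

use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_index_read(c: &mut Criterion) {
    let data: Arc<RwLock<Vec<u64>>> = Arc::new(RwLock::new((0..10_000).collect()));

    c.bench_function("rwlock_vec/index_read", |b| {
        b.iter(|| {
            let guard = data.read().unwrap();
            let mut sum = 0u64;
            // Strided indexing instead of a bulk iterate/collect, so the
            // per-access cost (lock + pointer chasing) is actually measured.
            for i in (0..guard.len()).step_by(7) {
                sum += guard[black_box(i)];
            }
            black_box(sum)
        });
    });
}

criterion_group!(benches, bench_index_read);
criterion_main!(benches);
```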

start.wait();

let lock = slab.read().unwrap();
let items: Vec<_> = lock.iter().copied().collect();
Member

Did you measure the difference when passing the actual data to black_box instead of a collected vector? I wonder whether collect() has an impact on the benchmarks.

@@ -0,0 +1,430 @@
//! Benchmark usecases against:
//! * (alloc) built-in Vec
Member

The crate claims to be efficient for read operations, but the only operations that perform better in these benchmarks are write operations in a multi-threaded context. For all other operations, alloc::Vec is faster.

indietyp and others added 2 commits November 22, 2022 13:14
Co-authored-by: Tim Diekmann <21277928+TimDiekmann@users.noreply.github.com>
@indietyp (Member Author)

After talking with @TimDiekmann over DMs, we decided that the benefits of avector are not enough to outweigh the costs. Primarily, benchmarks showed that AVec performed worse than RwLock<Vec> in most situations.

We might want to resume this in the future if there are any clear benefits.

For now, spin can be used instead. Should we find a use or benefit for avector in the future, we might want to reopen this PR.

On a personal note: while it is sad to see this PR closed, not everything is lost! This experiment and exploration was a great opportunity to learn and get more comfortable with unsafe Rust and atomics.
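
For completeness, a minimal sketch of the fallback mentioned above, i.e. simply wrapping a Vec in a lock from the spin crate on no-std (the type and method names here are illustrative assumptions, not code from this PR):

```rust
// Hypothetical: the simpler alternative that won out in the benchmarks —
// an ordinary Vec behind a read/write lock (spin::RwLock works on no-std).
use spin::RwLock;

struct Hooks<T> {
    inner: RwLock<Vec<T>>,
}

impl<T> Hooks<T> {
    fn new() -> Self {
        Self { inner: RwLock::new(Vec::new()) }
    }

    fn push(&self, value: T) {
        self.inner.write().push(value);
    }

    fn for_each(&self, mut f: impl FnMut(&T)) {
        // Readers block writers (and vice versa) for the duration of the
        // iteration, which AVec avoided — but in practice this was faster.
        let guard = self.inner.read();
        for item in guard.iter() {
            f(item);
        }
    }
}
```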

@indietyp indietyp closed this Nov 22, 2022
@TimDiekmann (Member)

In case we see benefits for avector in the future, it's probably also worth looking at this request for review.

@TimDiekmann TimDiekmann deleted the bm/av/init branch November 22, 2022 13:54
@vilkinsons (Member)

@TimDiekmann Fun old Stack Overflow question of yours! Might it be worth cross-linking from the original post to this closed PR? Not sure if that contravenes some rule, but it may provide useful inspiration for somebody in the future.

Labels
area/infra Relates to version control, CI, CD or IaC (area)

3 participants