V4 #81
Apologies if this is a known issue (it's a pre-release build, after all), but mentioning it just in case it's not. I've been using:
use dashmap::DashMap;
use std::iter::FromIterator; // needed in edition 2018 for DashMap::from_iter
use std::time::Duration;

#[tokio::main]
async fn main() {
    let size = 50;
    let delay = Duration::from_millis(500);
    let iter = (0..size).map(|i| (i, i * 2));
    let map = DashMap::<usize, usize>::from_iter(iter);
    let spawned = tokio::spawn(async move {
        for i in 0..size {
            // tokio 0.2's sleep API
            tokio::time::delay_for(delay).await;
            assert_eq!(*map.get(&i).unwrap().value(), i * 2);
            println!("{}", i);
        }
    });
    spawned.await.unwrap();
}
[package]
name = "dashmap-example"
version = "0.1.0"
edition = "2018"

[dependencies]
dashmap = "4.0.0-rc3"
tokio = { version = "0.2.21", features = ["rt-threaded", "macros", "time"] }

On my machine (a Mac with an i5, 4 CPUs) it seems to deadlock about half the time within the first 5 iterations.
I'm going to take a look, but this is really odd given that get doesn't take any locks. Thanks for reporting.
I can reproduce it; going to try to find the issue. It seems to only happen under tokio, though.
This seems to be tokio using some API that parks a thread while that thread is registering itself with the GC. Not really sure how to solve it.
@walfie Thanks a ton for the bug report. It's super useful to have people testing this in pre-release so that this doesn't happen later. I've diagnosed the issue and released a new RC with the fix.
I wasn't expecting such a quick turnaround, thanks! I've confirmed that it also fixes the deadlock issue I was having in my real code (which the example was based on).
For context, this issue was not caught earlier because it would only occur if a resize happened before the first garbage collection cycle. At that point the GC would not yet have registered the runner thread with itself, and registration locks a global mutex. The garbage collector would lock that mutex during the collect cycle and then, while executing a deferred function, run code that triggered the runner thread to try to register with the GC, thus deadlocking.
The fix was simply to force-register the runner thread with the GC before starting the collect cycle.
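To make the ordering fix concrete, here is a minimal, self-contained sketch of the shape of the bug and fix described above. The types and names (Gc, register_runner, collect) are illustrative stand-ins, not DashMap's actual internals:

```rust
use std::sync::Mutex;
use std::sync::atomic::{AtomicBool, Ordering};

// Toy stand-in for the garbage collector's global registry.
struct Gc {
    threads: Mutex<Vec<String>>,
    runner_registered: AtomicBool,
}

impl Gc {
    fn register_runner(&self) {
        // First-time registration locks the global mutex; later calls are no-ops.
        if !self.runner_registered.swap(true, Ordering::SeqCst) {
            self.threads.lock().unwrap().push("runner".into());
        }
    }

    fn collect(&self, deferred: impl Fn(&Gc)) {
        // The fix described above: force-register the runner thread *before*
        // the collect cycle takes the mutex. Previously, a deferred function
        // running under this lock could trigger a first-time registration,
        // which would try to re-lock the already-held mutex and deadlock.
        self.register_runner();
        let _guard = self.threads.lock().unwrap();
        deferred(self); // registration inside here is now a no-op
    }
}

fn main() {
    let gc = Gc {
        threads: Mutex::new(Vec::new()),
        runner_registered: AtomicBool::new(false),
    };
    gc.collect(|g| g.register_runner());
    assert_eq!(gc.threads.lock().unwrap().len(), 1);
    println!("collect finished without deadlock");
}
```

The key point is only the call ordering: registration happens before the collect lock is taken, so deferred work can no longer recurse into a first-time registration.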
That makes sense -- it explains why I only saw the issue when I started initializing the map with more than a certain number of elements, and never saw it when the map started off empty. |
The map would deadlock if we initialized it with more than a certain number of items on startup (on my machine, more than 15 or 16 items). The latest RC fixes this issue: xacrimon/dashmap#81 (comment)
Thanks for all the awesome work so far, @xacrimon! Will the |
It and a Java-like compute_if_present will be exposed instead.
Awesome, thanks for clarifying! That should still work for my own use cases. Your quick response is much appreciated.
Hey! I'm currently using DashMap 4.0.0-rc6 and it is an absolute pleasure. However, I'm still quite concerned about how this is memory safe. I assume that you are atomically reference counting with the ElementGuard, which is backed by ABox, but I just wanted to be sure. Could you give a quick-and-dirty rundown of the new implementation?
Hey, thanks for using a prerelease. DashMap V4 is currently quite cutting-edge stuff, even in academia. Quite a few parts are involved in making it safe to use, and the design isn't exactly simple, but there are a few core parts.

First, the public table state is just a bunch of AtomicPtrs, each containing a pointer to an element plus some tag data used to speed things up dramatically. This means operations that change how the table is viewed come down to atomic CAS operations.

Entries are indeed single-layer reference counted, but this opens a problem: what if the reference count is decremented to 0 and the element dropped in between another thread reading the pointer and incrementing the refcount itself? To solve this we use an epoch-based memory reclaimer to synchronize and defer destructive operations until a safe point in time. The exact details of how this works can be found in Keir Fraser's paper "Practical lock-freedom".
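To illustrate the "table state is atomic pointers, updates are CAS" idea, here is a hedged single-slot sketch. Slot and try_insert are made-up names for illustration, not DashMap's real code:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// A single table slot: the publicly visible state is one atomic pointer,
// so publishing an entry is a single compare-and-swap.
struct Slot<V> {
    ptr: AtomicPtr<V>,
}

impl<V> Slot<V> {
    fn new() -> Self {
        Slot { ptr: AtomicPtr::new(ptr::null_mut()) }
    }

    // Attempt to publish a value into an empty slot with one CAS;
    // returns false if another thread won the race.
    fn try_insert(&self, value: V) -> bool {
        let boxed = Box::into_raw(Box::new(value));
        match self.ptr.compare_exchange(
            ptr::null_mut(),
            boxed,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => true,
            Err(_) => {
                // Lost the race: our allocation was never published, so we
                // can free it directly. Freeing *published* entries is the
                // hard part, and is what the epoch reclaimer defers.
                unsafe { drop(Box::from_raw(boxed)) };
                false
            }
        }
    }
}

fn main() {
    let slot = Slot::new();
    assert!(slot.try_insert(42));
    assert!(!slot.try_insert(7)); // slot already occupied
    println!("published once");
}
```

The sketch deliberately leaks the published value at exit; in the real design, removal would decrement the refcount and hand the drop to the epoch-based reclaimer.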
Since probing is very expensive, we use a few techniques to narrow down which keys we check. One of them is storing a partial hash in the pointer's upper bits: if the partial hash values do not match, we don't have to incur a memory access to check the entry. There are some additional optimizations to make yet, but those will come later.
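The partial-hash trick can be sketched with plain bit arithmetic. This is an assumption-laden illustration (the constants and helper names are mine, not DashMap's): on x86-64, user-space pointers leave the upper 16 bits free, so a 16-bit slice of the key's hash can ride along in the pointer word:

```rust
// Pack a 16-bit partial hash into the unused upper bits of a 48-bit
// address. The probe loop compares tags first and only dereferences
// the pointer on a tag match, skipping a memory access otherwise.
const TAG_SHIFT: u32 = 48;
const ADDR_MASK: u64 = (1u64 << TAG_SHIFT) - 1;

fn pack(addr: u64, partial_hash: u16) -> u64 {
    (addr & ADDR_MASK) | ((partial_hash as u64) << TAG_SHIFT)
}

fn tag_of(packed: u64) -> u16 {
    (packed >> TAG_SHIFT) as u16
}

fn addr_of(packed: u64) -> u64 {
    packed & ADDR_MASK
}

fn main() {
    let packed = pack(0x7fff_dead_beef, 0xABCD);
    assert_eq!(tag_of(packed), 0xABCD);
    assert_eq!(addr_of(packed), 0x7fff_dead_beef);
    // A mismatched tag means the entry can be skipped without touching memory.
    assert_ne!(tag_of(packed), 0x1234);
    println!("tag filter ok");
}
```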
Hello, I was trying out the new version. This snippet compiles on "3.11.5" but fails on "4.0.0-rc6" with 3 issues:

use dashmap::DashMap;
use std::ops::Deref;

type X<'a> = &'a ();

#[derive(Default, Debug)]
struct Foo<'a>(DashMap<u8, X<'a>>);

fn bar<'a>(foo: &'a Foo<'a>) -> impl Deref<Target = X<'a>> + 'a {
    foo.0.get(&3).unwrap()
}

fn foo() {
    let foo = Foo::default();
    foo.0.insert(3, &());
    println!("foo={:?}", &*bar(&foo));
}

Issues 1 and 2 are that
Issue 3 is that the function
…vent garbage being queued somewhere invalid due to a delayed epoch increment
…ack to an acquire load when creating an iterator as that may happen from other threads
…ons in src/thread_local/table.rs
I'm very much looking forward to V4, but I understand it might be a while before it's ready for use. In particular, my application needs
Hi, I'm trying out the following:

// This is an example from my game server where I need to iterate over all
// the players on a map and do an operation like removing the current
// character from their screen.
pub async fn clear(&self) -> Result<(), Error> {
    debug!("Clearing Screen..");
    let me = self.owner.character().await?;
    for character in self.characters.iter() {
        let observer = character.owner();
        let observer_screen = observer.screen().await?;
        observer_screen.remove_character(me.id()).await?;
        self.remove_character(character.id()).await?;
    }
    self.characters.clear();
    Ok(())
}

and when I use this method in another async function, the compiler shows this error:

error: future cannot be sent between threads safely
--> server\game\src\packets\msg_action.rs:143:34
|
143 | ) -> Result<(), Self::Error> {
| __________________________________^
144 | | let ty = self.action_type.into();
145 | | match ty {
146 | | ActionType::SendLocation => {
... |
287 | | Ok(())
288 | | }
| |_____^ future returned by `__process` is not `Send`
|
= help: the trait `std::marker::Send` is not implemented for `(dyn std::iter::Iterator<Item = dashmap::ElementGuard<u32, world::character::Character>> + 'static)`
note: future is not `Send` as this value is used across an await
--> server\game\src\systems\screen.rs:40:13
|
35 | for character in self.characters.iter() {
| ----------------------
| | |
| | `self.characters.iter()` is later dropped here
| has type `dashmap::Iter<u32, world::character::Character>` which is not `Send`
...
40 | self.remove_character(id).await?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ await occurs here, with `self.characters.iter()` maybe used later
= note: required for the cast to the object type `dyn std::future::Future<Output = std::result::Result<(), errors::Error>> + std::marker::Send`

Is there a workaround for this?
Unfortunately there is no workaround at the moment. Lockfree code typically relies heavily on bookkeeping about what each thread is referencing, and moving an iterator across threads would break it. I have an idea for implementing migration of such bookkeeping state, but it's a while away.
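That said, for the specific snippet above, the iterator only needs to be held across the awaits because the data is read lazily. A common pattern (my suggestion, not from this thread, and shown with std's HashMap standing in for DashMap) is to snapshot the owned data first so no non-Send iterator or guard lives across an await point:

```rust
use std::collections::HashMap;

// "Snapshot first" sketch: copy out the owned ids you need, end the
// borrow, then do the per-item work. In the async version, the future
// stays Send because no map iterator/guard is held across an `.await`.
fn main() {
    let characters: HashMap<u32, String> = HashMap::from([
        (1, "alice".to_string()),
        (2, "bob".to_string()),
    ]);

    // Snapshot: owned ids only; the iterator is dropped immediately.
    let ids: Vec<u32> = characters.keys().copied().collect();

    for id in ids {
        // In the async version, the `.await` calls would go here.
        println!("processing {}", id);
    }
}
```

This trades one extra allocation (the Vec of ids) for a Send future, and also avoids iterating the map while removing from it.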
Hi, I am also trying
DashMap v4 doesn't lock internally, for flexibility reasons. If you need &mut access to values, wrap them in your lock of choice. Regarding keys, you'd probably be best off storing some state in the value, using enums for different states if needed.
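The "wrap values in your lock of choice" advice looks roughly like this. A hedged sketch, with std's HashMap standing in for DashMap and a per-value Mutex as the chosen lock:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Wrap each value in its own Mutex so it can be mutated through a
// shared reference to the map, since the map itself no longer hands
// out exclusive access.
fn main() {
    let map: HashMap<&str, Mutex<u32>> =
        HashMap::from([("hits", Mutex::new(0u32))]);

    // &map is enough; the per-value Mutex provides the &mut access.
    *map.get("hits").unwrap().lock().unwrap() += 1;
    *map.get("hits").unwrap().lock().unwrap() += 1;

    assert_eq!(*map.get("hits").unwrap().lock().unwrap(), 2);
    println!("hits = 2");
}
```

Any interior-mutability primitive (RwLock, atomics for plain counters) works the same way; the map only needs shared access.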
@xacrimon Where, perhaps a branch or a tag or something else, is the current source code for v4? I've identified a memory leak in the latest published version and I'd like to help track it down. |
v4 is currently in limbo and pending a major rework. I do not recommend using the rc versions anywhere important. I'll rework it when I have time. |
Is the task list at the top up to date? |
Not very. V4 is pending a major rewrite at the moment which I have not had time for. |
Which branch are you currently working off of? |
This iteration and V4 featureset will currently be paused due to a lack of time and the need for a maintenance major release. |
OK, thanks for letting us know! |
V4 is a major rewrite of DashMap which switches the map from a locked, sharded internal design to a mostly lockfree table. This has improved performance significantly, simplified the API, removed the deadlock bugs, and reduced the number of gotchas.
Remaining tasks

- Iterators
- Serde support
- API guideline trait implementations
- Documentation
- cas / compute if present
- upsert
- insert_if_not_exists
- drain
- dashset
- custom allocator support
- Benchmarks
- Blog post explaining the switch and the new internals