
[Merged by Bors] - Fix AssetServer::get_asset_loader deadlock #2395


vgel
Contributor

@vgel vgel commented Jun 25, 2021

Objective

Fixes a possible deadlock between AssetServer::get_asset_loader / AssetServer::add_loader

A thread in get_asset_loader could take the extension_to_loader_index read lock,
and then another thread in add_loader could take the server.loaders write lock
before the first thread acquires its server.loaders read lock. add_loader then
can't take the extension_to_loader_index write lock, and the program deadlocks.

To be more precise:

Step 1: Thread 1 grabs the extension_to_loader_index lock on lines 138..139:

    fn get_asset_loader(
        &self,
        extension: &str,
    ) -> Result<Arc<Box<dyn AssetLoader>>, AssetServerError> {
        self.server
            .extension_to_loader_index
            .read()
            .get(extension)
            .map(|index| self.server.loaders.read()[*index].clone())
            .ok_or_else(|| AssetServerError::MissingAssetLoader {
                extensions: vec![extension.to_string()],
            })
    }

Step 2: Thread 2 grabs the server.loaders write lock on line 107:

    pub fn add_loader<T>(&self, loader: T)
    where
        T: AssetLoader,
    {
        let mut loaders = self.server.loaders.write();
        let loader_index = loaders.len();
        for extension in loader.extensions().iter() {
            self.server
                .extension_to_loader_index
                .write()
                .insert(extension.to_string(), loader_index);
        }
        loaders.push(Arc::new(Box::new(loader)));
    }

Step 3: Deadlock, since Thread 1 wants to grab the server.loaders read lock on line 141...:

    fn get_asset_loader(
        &self,
        extension: &str,
    ) -> Result<Arc<Box<dyn AssetLoader>>, AssetServerError> {
        self.server
            .extension_to_loader_index
            .read()
            .get(extension)
            .map(|index| self.server.loaders.read()[*index].clone())
            .ok_or_else(|| AssetServerError::MissingAssetLoader {
                extensions: vec![extension.to_string()],
            })
    }

... and Thread 2 wants to grab the extension_to_loader_index write lock on lines 111..112:

    pub fn add_loader<T>(&self, loader: T)
    where
        T: AssetLoader,
    {
        let mut loaders = self.server.loaders.write();
        let loader_index = loaders.len();
        for extension in loader.extensions().iter() {
            self.server
                .extension_to_loader_index
                .write()
                .insert(extension.to_string(), loader_index);
        }
        loaders.push(Arc::new(Box::new(loader)));
    }
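
The circular wait described in Steps 1 to 3 can be shown deterministically, without hanging, using non-blocking lock attempts. This is only a sketch under assumptions: it substitutes `std::sync::RwLock` for whatever lock type `AssetServer` uses, simulates both "threads" on one thread, and `circular_wait_demo` and the simplified element types are hypothetical names, not Bevy code.

```rust
use std::collections::HashMap;
use std::sync::{RwLock, TryLockError};

/// Simulates the state after Step 1 and Step 2, then checks whether each
/// side could take the lock it needs next.
/// Returns (thread 1 is blocked, thread 2 is blocked).
pub fn circular_wait_demo() -> (bool, bool) {
    // Stand-ins for the two locks in the PR (assumed, simplified types).
    let extension_to_loader_index: RwLock<HashMap<String, usize>> =
        RwLock::new(HashMap::new());
    let loaders: RwLock<Vec<&'static str>> = RwLock::new(Vec::new());

    // Step 1: "Thread 1" (get_asset_loader) holds the read lock on the map.
    let _thread_1_guard = extension_to_loader_index.read().unwrap();
    // Step 2: "Thread 2" (add_loader) holds the write lock on the loaders.
    let _thread_2_guard = loaders.write().unwrap();

    // Step 3a: Thread 1 now needs a read lock on `loaders`, but a writer
    // holds it, so a blocking `read()` would wait here.
    let thread_1_blocked =
        matches!(loaders.try_read(), Err(TryLockError::WouldBlock));
    // Step 3b: Thread 2 now needs a write lock on the map, but a reader
    // holds it, so a blocking `write()` would wait here as well.
    let thread_2_blocked = matches!(
        extension_to_loader_index.try_write(),
        Err(TryLockError::WouldBlock)
    );

    (thread_1_blocked, thread_2_blocked)
}
```

Both values come back `true`: each side holds the lock the other needs, which is exactly the condition under which the blocking calls in the real code would never return.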

Solution

Fixed by narrowing the scope of the extension_to_loader_index read lock:
get_asset_loader only needs that lock long enough to copy a usize index out of
the extensions map, not for the whole lookup. The block might not be needed;
I think I could have gotten away with just inserting a copied()
call into the chain, but I wanted to make the reasoning clear for
future maintainers.
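
A minimal sketch of that scoping idea, not the actual diff: `LoaderRegistry`, `get_loader`, the `String` loader stand-in, and `concurrent_demo` are hypothetical names, and `std::sync::RwLock` is swapped in for the lock type used in `asset_server.rs`. The read guard on the map lives only inside the inner block, so the lookup never holds both locks at once.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the two locked fields on the asset server.
#[derive(Default)]
pub struct LoaderRegistry {
    extension_to_loader_index: RwLock<HashMap<String, usize>>,
    loaders: RwLock<Vec<Arc<String>>>,
}

impl LoaderRegistry {
    /// Same lock order as `add_loader` in the PR: loaders first, then the map.
    pub fn add_loader(&self, name: &str, extensions: &[&str]) {
        let mut loaders = self.loaders.write().unwrap();
        let loader_index = loaders.len();
        for extension in extensions {
            self.extension_to_loader_index
                .write()
                .unwrap()
                .insert(extension.to_string(), loader_index);
        }
        loaders.push(Arc::new(name.to_string()));
    }

    /// Copies the index out while holding only the map lock, drops that
    /// guard, and only then takes the read lock on the loaders.
    pub fn get_loader(&self, extension: &str) -> Option<Arc<String>> {
        let index = {
            let map = self.extension_to_loader_index.read().unwrap();
            map.get(extension).copied()
        }; // the map's read guard is dropped here
        index.map(|i| self.loaders.read().unwrap()[i].clone())
    }
}

/// Runs lookups and registrations concurrently; returns true if every
/// lookup for "png" succeeded.
pub fn concurrent_demo() -> bool {
    let registry = Arc::new(LoaderRegistry::default());
    registry.add_loader("image", &["png"]);

    let writer = {
        let registry = Arc::clone(&registry);
        thread::spawn(move || {
            for _ in 0..1_000 {
                registry.add_loader("audio", &["ogg"]);
            }
        })
    };
    let reader = {
        let registry = Arc::clone(&registry);
        thread::spawn(move || (0..1_000).all(|_| registry.get_loader("png").is_some()))
    };

    writer.join().unwrap();
    reader.join().unwrap()
}
```

Because `add_loader` holds the loaders write lock until after the push, an index read from the map is always valid by the time the lookup obtains the loaders read lock.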

@github-actions github-actions bot added the S-Needs-Triage This issue needs to be labelled label Jun 25, 2021
@NathanSWard NathanSWard added A-Assets Load files from disk to use for things like images, models, and sounds C-Bug An unexpected or incorrect behavior and removed S-Needs-Triage This issue needs to be labelled labels Jun 25, 2021
Contributor

@NathanSWard NathanSWard left a comment


This is a great catch!! Nice job 😄
This issue also probably hints at us needing to run some sort of better testing for parallel code.
E.g. some sort of fuzz testing or thread-sanitizer.
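
As a far cheaper stopgap than fuzzing or a thread sanitizer, a test can at least fail instead of hanging when a workload deadlocks. The sketch below is an assumption-laden illustration, not an existing Bevy helper: `completes_within` is a hypothetical name, and a timeout only detects hangs that actually occur in a given run; it does not explore interleavings.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `workload` on a separate thread and reports whether it finished
/// before `timeout` elapsed. A deadlocked (or panicking) workload yields
/// `false` instead of hanging the caller.
pub fn completes_within<F>(workload: F, timeout: Duration) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        workload();
        // The receiver may already have given up; ignore the send error.
        let _ = done_tx.send(());
    });
    // Err covers both a timeout and a disconnected sender (workload panicked).
    done_rx.recv_timeout(timeout).is_ok()
}
```

A test would wrap the concurrent calls in the closure and assert that `completes_within` returns `true` for a generous timeout.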

@mockersf
Member

bors try

bors bot added a commit that referenced this pull request Jun 25, 2021
@NathanSWard NathanSWard added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jun 26, 2021
@bjorn3
Contributor

bjorn3 commented Jun 26, 2021

This issue also probably hints at us needing to run some sort of better testing for parallel code.
E.g. some sort of fuzz testing or thread-sanitizer.

Loom can test all possible interleavings of synchronization primitives like mutexes and atomics.

@NathanSWard
Contributor

(@mockersf I'd say this PR is safe to merge)

@cart cart added this to the Bevy 0.6 milestone Jul 1, 2021
@mockersf
Member

mockersf commented Jul 5, 2021

could you mention in #2373 if you're ok with changing the license for your contribution?

@vgel
Contributor Author

vgel commented Jul 6, 2021

could you mention in #2373 if you're ok with changing the license for your contribution?

Done!

@mockersf
Member

mockersf commented Jul 6, 2021

bors r+

bors bot pushed a commit that referenced this pull request Jul 6, 2021
# Objective

Fixes a possible deadlock between `AssetServer::get_asset_loader` / `AssetServer::add_loader`

A thread in `get_asset_loader` could take the `extension_to_loader_index` read lock,
and then another thread in `add_loader` could take the `server.loaders` write lock
before the first thread acquires its `server.loaders` read lock. `add_loader` then
can't take the `extension_to_loader_index` write lock, and the program deadlocks.

To be more precise:

## Step 1: Thread 1 grabs the `extension_to_loader_index` lock on lines 138..139:

https://github.com/bevyengine/bevy/blob/3a1867a92edc571b8f842bb1a96112dcbdceae4b/crates/bevy_asset/src/asset_server.rs#L133-L145

## Step 2: Thread 2 grabs the `server.loaders` write lock on line 107:

https://github.com/bevyengine/bevy/blob/3a1867a92edc571b8f842bb1a96112dcbdceae4b/crates/bevy_asset/src/asset_server.rs#L103-L116

## Step 3: Deadlock, since Thread 1 wants to grab the `server.loaders` read lock on line 141...:

https://github.com/bevyengine/bevy/blob/3a1867a92edc571b8f842bb1a96112dcbdceae4b/crates/bevy_asset/src/asset_server.rs#L133-L145

... and Thread 2 wants to grab the `extension_to_loader_index` write lock on lines 111..112:

https://github.com/bevyengine/bevy/blob/3a1867a92edc571b8f842bb1a96112dcbdceae4b/crates/bevy_asset/src/asset_server.rs#L103-L116


## Solution

Fixed by narrowing the scope of the `extension_to_loader_index` read lock:
`get_asset_loader` only needs that lock long enough to copy a usize index out of
the extensions map, not for the whole lookup. The block might not be needed;
I think I could have gotten away with just inserting a `copied()`
call into the chain, but I wanted to make the reasoning clear for
future maintainers.
@bors bors bot changed the title Fix AssetServer::get_asset_loader deadlock [Merged by Bors] - Fix AssetServer::get_asset_loader deadlock Jul 6, 2021
@bors bors bot closed this Jul 6, 2021
ostwilkens pushed a commit to ostwilkens/bevy that referenced this pull request Jul 27, 2021