Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use atomic instead of mutex in du function #13253

Closed
wants to merge 2 commits into from
Closed

Use atomic instead of mutex in du function #13253

wants to merge 2 commits into from

Conversation

BD103
Copy link
Contributor

@BD103 BD103 commented Jan 5, 2024

What does this PR try to resolve?

This optimizes the disk usage (du()) function in cargo-util to use an atomic to track the total size instead of a mutex.

The du function recursively checks the amount of bytes a folder. It uses a parallel iterator to read the size of multiple files at once. The variable total is a u64 that tracks this size. It was a mutex to prevent data races in the parallel iterator and make sure that the total size is accurately counted.

Because total is a simple number and is only being incremented in parallel, it can be turned into an AtomicU64. Atomics take up less bytes than mutexes. It also uses builtin CPU instructions to instrument locking, instead of using a manual mutex. This means it is generally faster, but results can vary.

Atomics require an Ordering to specify synchronization. After reading Rust Atomics and Locks, I chose Ordering::Relaxed. The code only needs to synchronize one atomic. By the time the walker.run returns, all the threads will be closed and the atomic's value will be final. If someone sees an issue with using Relaxed ordering, please say so now. The general goal is avoid data races. :)

How should we test and review this PR?

Data races are tricky to reproduce, which is why I would like the logic behind my change to be verified. Additionally, I noticed that du was being used in multiple other tests, like the global cache tracker, so I ran those as well.

I couldn't find any specific test that just tested du. I'd be willing to write one if it would be useful.

Additional information

Most of my knowledge about atomics and parallelism come from Rust Atomics and Locks. If recommend reading it if you want learn more about atomics and memory ordering.

@rustbot
Copy link
Collaborator

rustbot commented Jan 5, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @weihanglo (or someone else) soon.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 5, 2024
Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an intentional change from Atomic64 for compatiblity reason. See #12981 and rust-lang/rust#117916 (comment).

@BD103
Copy link
Contributor Author

BD103 commented Jan 6, 2024

My apologies, I did not realize it was originally an atomic. Would you like me to make the atomic conditional as mentioned in the related comment?

@weihanglo
Copy link
Member

Would you like me to make the atomic conditional as mentioned in the related comment?

If the shim doesn't bring too much code and is easy to understand, it could work I guess :)

Comment on lines +97 to +98
#[cfg(target_has_atomic = "64")]
return Ok(total.load(Ordering::Relaxed));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to use a return here so it would compile. Rust doesn't know that the two cfg blocks are mutually exclusive.

@BD103 BD103 requested a review from weihanglo January 8, 2024 15:12
@ehuss
Copy link
Contributor

ehuss commented Jan 8, 2024

Thanks for the PR! However, I'm having trouble understanding why we would make this change. It appears to be quite a lot more code from before, which adds maintenance burden and complexity. In my performance testing, I was not able to determine a noticeable difference. Are there some benchmarks that indicate that this is an issue?

@BD103
Copy link
Contributor Author

BD103 commented Jan 8, 2024

Thanks for the PR! However, I'm having trouble understanding why we would make this change. It appears to be quite a lot more code from before, which adds maintenance burden and complexity. In my performance testing, I was not able to determine a noticeable difference. Are there some benchmarks that indicate that this is an issue?

No, I just happened to be browsing the codebase and noticed that the Arc<Mutex<u64>> could be replaced with an atomic. I don't have any real-world numbers to justify this change. I was so excited to find some way I could contribute back to Cargo that I didn't try looking further into why the code used a mutex, nor how much it affected performance.

I'm going to close this, since the additional code complexity does not deem the small performance gain. I apologize for wasting your time.

@BD103 BD103 closed this Jan 8, 2024
@Eh2406
Copy link
Contributor

Eh2406 commented Jan 9, 2024

Can you please make a PR adding a comment explaining why it is a Arc<Mutex<u64>>?

@BD103
Copy link
Contributor Author

BD103 commented Jan 9, 2024

Can you please make a PR adding a comment explaining why it is a Arc<Mutex<u64>>?

This is a good idea. I created #13273.

bors added a commit that referenced this pull request Jan 10, 2024
Document why `du` function uses mutex

### What does this PR try to resolve?

After closing #13253, it [was suggested](#13253 (comment)) to document why the `du` function uses a `Mutex` instead of an `AtomicU64`. This will prevent others from making the same mistake I did. :)

### How should we test and review this PR?

N/A

### Additional information

N/A
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants