Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[skip changelog] pinmfs: mitigate slow mfs writes when it triggers #10623

Merged
merged 2 commits into from
Dec 17, 2024

Conversation

hsanjuan
Copy link
Contributor

This mitigates slow mfs writes when the pinmfs daemon calls mfs.RootNode()

When writing lots of files to MFS, this call triggers a mfs directory cache sync operations. The cache grows forever and is unbounded. The more files added to a directory, the longer it takes. In the meantime, writing to mfs is locked.

The pinmfs, even when no remote pinning services are configured, will trigger this issue. When RootNode() takes more than 30 seconds, the issue will be triggered continuously causing a write-deadlock onto MFS.

This commit does not fix the fact that if you write 1M items into an MFS directory, the first time that the directory is traversed it will still have to sync those 1M items. It does at least prevent writes from stalling after about ~6000 items.

This mitigates slow mfs writes when the pinmfs daemon calls mfs.RootNode()

When writing lots of files to MFS, this call triggers a mfs directory cache
sync operations. The cache grows forever and is unbounded. The more files
added to a directory, the longer it takes. In the meantime, writing to mfs is
locked.

The pinmfs, even when no remote pinning services are configured, will trigger
this issue. When RootNode() takes more than 30 seconds, the issue will be triggered
continuously causing a write-deadlock onto MFS.

This commit does not fix the fact that if you write 1M items into an MFS
directory, the first time that the directory is traversed it will still have
to sync those 1M items. It does at least prevent writes from stalling after
about ~6000 items.
@hsanjuan hsanjuan added the skip/changelog This change does NOT require a changelog entry label Dec 12, 2024
@hsanjuan hsanjuan self-assigned this Dec 12, 2024
@hsanjuan hsanjuan requested a review from a team as a code owner December 12, 2024 15:41
@hsanjuan
Copy link
Contributor Author

If we take ipfs/boxo#751, we will need something that triggers Flushing regularly or a way to keep the cache size under control before it is too big.

Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but might be good to have a test that would stall writes without this PR.

Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@gammazero gammazero merged commit 898f024 into master Dec 17, 2024
14 checks passed
@gammazero gammazero deleted the fix/mfs-getting-slow branch December 17, 2024 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip/changelog This change does NOT require a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants