fix: deadlock while racing ipfs dag import and ipfs repo gc #9755
This fixes a deadlock introduced in 1457b4f. We can't use the coreapi here because it will try to take the PinLock (RLock) again, so revert this small part of 1457b4f.

This used to cause a deadlock when running `ipfs dag import` concurrently with the GC.

The bug is that `ipfs dag import` takes an RLock with the PinLock. *The cars are imported, leaving a wide window of time.* Then the GC takes a Lock on that same RWMutex while taking the GC Lock (it blocks because it waits for the RLock to be released). Then the car imports finish and `ipfs dag import` tries to acquire the PinLock (doing an RLock) again in `Api().Pin`.

However, at this point the RWMutex is starved: the runtime puts a fence in front of new RLocks while a Lock is waiting (otherwise an endless stream of RLock / RUnlock could delay a Lock forever).

The issue is that `ipfs dag import`'s original RLock, which is blocking everyone, is only released once the command returns, and that only happens when `Api().Pin` completes.

So we have a deadlock (ABA kind?), because `ipfs dag import` waits on the GC Lock, which waits on `ipfs dag import`.

Calling the Pinner directly does not acquire the PinLock again, and thus does not have this issue.
8689881 to 74010a8
Makes sense to me as a fix.
IMHO though, the real fix would be to change CoreApi to allow more control over those things. The current API is quite limiting.
Without seeing code I think this is a bad idea. How will that manifest? I think it makes sense to have multiple APIs for internal and external things. There very well could be some better solution I'm missing here, though.
@Jorropo alternatively, that Fixing that would allow using the internal APIs properly, taking the lock properly ... and let me override the pinner and blockstore properly :-D
Makes sense, make it take a