-
Notifications
You must be signed in to change notification settings - Fork 51
Fix the deadlock between TimerExpireCheck() and PfcpXactDelete() #42
Fix the deadlock between TimerExpireCheck() and PfcpXactDelete() #42
Conversation
shugo-h
commented
Nov 4, 2021
- Move TimerBlk::expireFunc call out of the mutex critical section.
- Add a expiration check of PfcpXact::timerHolding in PfcpXactTimeout().
- Move TimerBlk::expireFunc call out of the mutex critical section. - Add a expiration check of PfcpXact::timerHolding in PfcpXactTimeout().
@@ -664,7 +664,7 @@ Status PfcpXactTimeout(uint32_t index, uint32_t event, uint8_t *type) { | |||
PfcpXactDelete(xact); | |||
return STATUS_ERROR; | |||
} | |||
} else if (event == globalHoldingEvent) { | |||
} else if (event == globalHoldingEvent && TimerIsExpired(xact->timerHolding)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PfcpXactTimeout()
is used to handle any timeout event, that is sent by UPF when the timer expired.
So, I think maybe you don't need to check this timer is expired or not again?
Tks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for commenting.
I think better that timer expiration should be checked again here.
The xact->timerHolding
during processing PfcpXactTimeout()
is possible to have been updated by just previous UPF_EVENT_N4_MESSAGE
event, for instance, receiving a retransmitted PFCP Session Establishment Request from SMF.
Such xact
shall be deleted in the next UPF_EVENT_N4_T3_HOLDING
event processing even though its timerHolding has already been updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shugo-h
Thanks for your explanation.
Do you have any suggestions on how to reappear the deadlock mentioned in issue 268?
now I have several ideas but I‘m hard to pick the better one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?;
- Make many PDU sessions (e.g. 30 sessions or more) established at almost same moment. This will cause almost simultaneous PfcpXact timer expirations.
- Aim at the expiration time of the PfcpXact timer started in step 1, and make all PDU sessions released.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shugo-h
That looks great!
I would try the method you provided for validation.
Thanks a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step 2 may not be necessary.
Step 1 will cause many UPF_EVENT_N4_T3_HOLDING
events almost simultaneously with holding mutex lock.
I think a PfcpXactTimeout()
raised by one of those events may fall into the deadlock in PfcpXactDelete()
.
But it's better also to do step 2 in order to make event queue full.