[ICD] Re-activate subscription when receiving check-in message if subscription timeouts and schedule a subscription #37481
src/app/ReadClient.cpp
// If we are in Idle state, it means previously device cannot be reached
// If we are not in the above, simply do nothing.
// If we are in the above, we should trigger immediate resubscription.
VerifyOrReturn(IsInactiveICDSubscription() || IsIdle());
So this is trying to handle the case where:
1. SendAutoResubscribeRequest was sent.
2. It was the version that takes a ScopedNodeId, not an existing session.
3. Our attempt to establish a session has not succeeded yet.
Right?
If that's accurate, it seems to me that we should have a state other than "idle" that describes this situation, and we should test for that state here.
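To make the suggestion concrete, here is a minimal sketch of what a dedicated state could look like. It is not the actual ReadClient state set: `ClientState`, `AwaitingSessionForResubscribe`, and `ShouldResubscribeOnCheckIn` are hypothetical names chosen for illustration only.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the client state; not the real ReadClient enum.
enum class ClientState : uint8_t
{
    Idle,                          // nothing in flight
    AwaitingSessionForResubscribe, // hypothetical: resubscribe requested by ScopedNodeId, session not yet established
    InactiveICDSubscription,       // subscription timed out because the LIT ICD went inactive
    SubscriptionActive,            // subscription is live
};

// Returns true when a check-in message should trigger an immediate resubscription.
// Plain Idle is deliberately excluded, so the check no longer relies on an overloaded state.
constexpr bool ShouldResubscribeOnCheckIn(ClientState state)
{
    return state == ClientState::InactiveICDSubscription || state == ClientState::AwaitingSessionForResubscribe;
}
```

With a state like this, the guard in the snippet under review could test the specific condition instead of `IsIdle()`.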
Yes, items 1/2/3 are a correct description. I thought through several situations, described below.
Assume the read client fails to create a subscription, either during CASE session establishment or during subscription priming, via the SendAutoResubscribeRequest API that takes a ScopedNodeId, and falls back to resubscribing with the back-off algorithm because the LIT ICD has powered off. Once the device becomes active and the controller receives the corresponding check-in message, the client's scheduled resubscription cannot be re-activated immediately. Intuitively, it would be better if a resubscription could be triggered immediately, right?
Another scenario concerns the liveness timeout for LIT. Currently, when the liveness timeout happens, we block resubscription with CHIP_ERROR_LIT_SUBSCRIBE_INACTIVE_TIMEOUT and schedule a resubscription only when a check-in message is received. But what if the home network is stuck or disrupted for a period, so that check-in messages cannot reach the client during that period? This blocking behavior may lengthen subscription recovery delays. Removing the blocking constraint is hypothesized to mitigate such delays.
Proposed Resolution:
A unified approach is proposed to optimize subscription recovery:
-- The blocking of resubscription logic during auto-subscription timeouts shall be eliminated.
-- Upon reception of a check-in message, immediate subscription activation shall be implemented.
-- Specifically, for LIT ICD devices experiencing timeouts:
   -- The existing subscription shall be terminated with the CHIP_ERROR_LIT_SUBSCRIBE_INACTIVE_TIMEOUT error code, thereby clearing the subscription.
   -- A resubscription process shall be scheduled.
   -- The client state shall be transitioned to InactiveICDSubscription.
-- Upon receipt of a check-in message, the OnActiveModeNotification event shall be triggered:
   -- The client shall check whether its current state is InactiveICDSubscription.
   -- If the client is in the InactiveICDSubscription state, the CASE session shall be cleared, and an immediate resubscription shall be initiated.
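The timeout and check-in steps listed above can be sketched as a small state machine. This is a toy model under stated assumptions, not the ReadClient implementation: `ToyReadClient`, its fields, and the log strings are hypothetical; only the names OnActiveModeNotification, InactiveICDSubscription, and CHIP_ERROR_LIT_SUBSCRIBE_INACTIVE_TIMEOUT come from the discussion.

```cpp
#include <string>
#include <vector>

// Toy model of the proposed flow; all members are illustrative assumptions.
struct ToyReadClient
{
    enum class State { SubscriptionActive, InactiveICDSubscription, Resubscribing };

    State state               = State::SubscriptionActive;
    bool resubscribeScheduled = false; // a back-off resubscription is pending
    bool hasCaseSession       = true;  // a CASE session to the peer is cached
    std::vector<std::string> log;      // records the actions taken, for inspection

    // Liveness timeout for a LIT ICD: close the subscription with the LIT timeout
    // error, schedule a resubscription, and remember that the peer went inactive.
    void OnLivenessTimeout()
    {
        log.push_back("close subscription: CHIP_ERROR_LIT_SUBSCRIBE_INACTIVE_TIMEOUT");
        resubscribeScheduled = true;
        state                = State::InactiveICDSubscription;
    }

    // Check-in received: only act if the subscription is in the inactive-ICD state.
    void OnActiveModeNotification()
    {
        if (state != State::InactiveICDSubscription)
        {
            return; // nothing to re-activate
        }
        hasCaseSession       = false; // clear the stale CASE session
        resubscribeScheduled = false; // replace the scheduled attempt with an immediate one
        state                = State::Resubscribing;
        log.push_back("resubscribe immediately");
    }
};
```

In this model a check-in that arrives while the subscription is still active is ignored, and one that arrives after a liveness timeout preempts the scheduled attempt.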
To accommodate dynamic transitions between SIT and LIT operating modes:
-- A dedicated attribute, mPeerOperatingMode, shall be introduced to accurately track the peer device's operating mode.
-- The existing mIsPeerLIT attribute shall not be modified during LIT to SIT transitions. This attribute shall solely indicate the device's registration with an ICD token and check-in capability.
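A minimal sketch of how the two attributes might be kept separate follows. Only the names mPeerOperatingMode and mIsPeerLIT come from the proposal; the `PeerOperatingMode` enum, the `PeerIcdInfo` struct, and both helper methods are assumptions made for illustration.

```cpp
// Hypothetical enum for the peer's current operating mode.
enum class PeerOperatingMode
{
    kSIT,
    kLIT,
};

// Hypothetical holder for the two attributes discussed above.
struct PeerIcdInfo
{
    // True once the peer is registered with an ICD token and can send check-ins.
    // Not touched when the peer moves between LIT and SIT.
    bool mIsPeerLIT = false;

    // Tracks the mode the peer is currently operating in.
    PeerOperatingMode mPeerOperatingMode = PeerOperatingMode::kSIT;

    // Mode transitions update only the operating mode.
    void OnOperatingModeChanged(PeerOperatingMode mode) { mPeerOperatingMode = mode; }

    // Hypothetical helper: wait for a check-in only if the peer is check-in
    // capable and is currently operating as a LIT.
    bool ShouldWaitForCheckIn() const { return mIsPeerLIT && mPeerOperatingMode == PeerOperatingMode::kLIT; }
};
```

Keeping the registration flag separate means a LIT to SIT transition does not erase the fact that the peer can still send check-in messages.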
I think the point of the "blocking resubscription" behavior was that, for a LIT ICD, the chance of a resubscribe attempt succeeding when it was not triggered by a check-in is 0, because the sleepy device is asleep. So such an attempt is just a waste of network packets....
PR #37481: Size comparison from 92f9f0b to c770997 Full report (10 builds for cc32xx, nrfconnect, qpg, stm32, tizen)
PR #37481: Size comparison from 92f9f0b to a103e4f Full report (72 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)