Skip to content

Commit

Permalink
Change criteria for deciding if discontinuity
Browse files Browse the repository at this point in the history
This fixes an intermittent crash ("Non-initial ATTACHED state change
with resumed == false should have a reason") in the integration tests
which happened because the manager sometimes processed the initial
ATTACHED contributor state change after the `attach()` call completed.
I’ve followed my first suggestion in [1] for how to address this.

Resolves #121.

[1] ably/specification#239
  • Loading branch information
lawrence-forooghian committed Nov 25, 2024
1 parent fe87127 commit 8b5aa47
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 9 deletions.
10 changes: 7 additions & 3 deletions Sources/AblyChat/RoomLifecycleManager.swift
Original file line number Diff line number Diff line change
Expand Up @@ -164,6 +164,7 @@ internal actor DefaultRoomLifecycleManager<Contributor: RoomLifecycleContributor
for (contributor, subscription) in subscriptions {
// This `@Sendable` is to make the compiler error "'self'-isolated value of type '() async -> Void' passed as a strongly transferred parameter; later accesses could race" go away. I don’t hugely understand what it means, but given the "'self'-isolated value" I guessed it was something vaguely to do with the fact that `async` actor initializers are actor-isolated and thought that marking it as `@Sendable` would sever this isolation and make the error go away, which it did 🤷. But there are almost certainly consequences that I am incapable of reasoning about with my current level of Swift concurrency knowledge.
group.addTask { @Sendable [weak self] in
// We intentionally wait to finish processing one state change before moving on to the next; this means that when we process an ATTACHED state change, we can be sure that the current `hasBeenAttached` annotation correctly reflects the contributor’s previous state changes.
for await stateChange in subscription {
await self?.didReceiveStateChange(stateChange, forContributor: contributor)
}
Expand Down Expand Up @@ -264,7 +265,7 @@ internal actor DefaultRoomLifecycleManager<Contributor: RoomLifecycleContributor
// TODO: Not clear whether there can be multiple or just one (asked in https://github.com/ably/specification/pull/200/files#r1781927850)
var pendingDiscontinuityEvents: [ARTErrorInfo] = []
var transientDisconnectTimeout: TransientDisconnectTimeout?
/// Whether a CHA-RL1f call to `attach()` on the contributor has previously succeeded.
/// Whether a state change to `ATTACHED` has already been observed for this contributor.
var hasBeenAttached: Bool

var hasTransientDisconnectTimeout: Bool {
Expand Down Expand Up @@ -384,6 +385,7 @@ internal actor DefaultRoomLifecycleManager<Contributor: RoomLifecycleContributor
///
/// A contributor state change is considered handled once the manager has performed all of the side effects that it will perform as a result of receiving this state change. Specifically, once:
///
/// - (if the state change is ATTACHED) the manager has recorded that an ATTACHED state change has been observed for the contributor
/// - the manager has recorded all pending discontinuity events provoked by the state change (you can retrieve these using ``testsOnly_pendingDiscontinuityEventsForContributor(at:)``)
/// - the manager has performed all status changes provoked by the state change (this does _not_ include the case in which the state change provokes the creation of a transient disconnect timeout which subsequently provokes a status change; use ``testsOnly_subscribeToHandledTransientDisconnectTimeouts()`` to find out about those)
/// - the manager has performed all contributor actions provoked by the state change, namely calls to ``RoomLifecycleContributorChannel.detach()`` or ``RoomLifecycleContributor.emitDiscontinuity(_:)``
Expand Down Expand Up @@ -456,8 +458,11 @@ internal actor DefaultRoomLifecycleManager<Contributor: RoomLifecycleContributor
await contributor.emitDiscontinuity(reason)
}
case .attached:
let hadAlreadyAttached = contributorAnnotations[contributor].hasBeenAttached
contributorAnnotations[contributor].hasBeenAttached = true

if hasOperationInProgress {
if !stateChange.resumed, contributorAnnotations[contributor].hasBeenAttached {
if !stateChange.resumed, hadAlreadyAttached {
// CHA-RL4b1
logger.log(message: "Recording pending discontinuity event for contributor \(contributor)", level: .info)

Expand Down Expand Up @@ -801,7 +806,6 @@ internal actor DefaultRoomLifecycleManager<Contributor: RoomLifecycleContributor
do {
logger.log(message: "Attaching contributor \(contributor)", level: .info)
try await contributor.channel.attach()
contributorAnnotations[contributor].hasBeenAttached = true
} catch let contributorAttachError {
let contributorState = await contributor.channel.state
logger.log(message: "Failed to attach contributor \(contributor), which is now in state \(contributorState), error \(contributorAttachError)", level: .info)
Expand Down
23 changes: 17 additions & 6 deletions Tests/AblyChatTests/DefaultRoomLifecycleManagerTests.swift
Original file line number Diff line number Diff line change
Expand Up @@ -1422,18 +1422,29 @@ struct DefaultRoomLifecycleManagerTests {
#expect(discontinuity === contributorStateChange.reason)
}

// @specOneOf(1/2) CHA-RL4b1 - Tests the case where the contributor has been attached previously
// @specPartial CHA-RL4b1 - Tests the case where the contributor has been attached previously (TODO: I have changed the criteria for deciding whether an ATTACHED status change represents a discontinuity, to be based on whether there was a previous ATTACHED state change instead of whether the `attach()` call has completed; see https://github.com/ably/specification/issues/239 and change this back to specOneOf(1/2) once we’re aligned with spec again)
@Test
func contributorAttachEvent_withResumeFalse_withOperationInProgress_withContributorAttachedPreviously_recordsPendingDiscontinuityEvent() async throws {
// Given: A DefaultRoomLifecycleManager, with a room lifecycle operation in progress, and which has a contributor for which a CHA-RL1f call to `attach()` has succeeded
// Given: A DefaultRoomLifecycleManager, with a room lifecycle operation in progress, and which has a contributor for which it has previously received an ATTACHED state change
let contributorDetachOperation = SignallableChannelOperation()
let contributor = createContributor(attachBehavior: .success, detachBehavior: contributorDetachOperation.behavior)
let manager = await createManager(
contributors: [contributor]
)

// This is to satisfy "a CHA-RL1f call to `attach()` has succeeded"
try await manager.performAttachOperation()
// This is to satisfy "for which it has previously received an ATTACHED state change"
let previousContributorStateChange = ARTChannelStateChange(
// `previous`, `reason`, and `resumed` are arbitrary, but for realism let’s simulate an initial ATTACHED
current: .attached,
previous: .attaching,
event: .attached,
reason: nil,
resumed: false
)

await waitForManager(manager, toHandleContributorStateChange: previousContributorStateChange) {
await contributor.channel.emitStateChange(previousContributorStateChange)
}

// This is to put the manager into the DETACHING state, to satisfy "with a room lifecycle operation in progress"
let roomStatusSubscription = await manager.onRoomStatusChange(bufferingPolicy: .unbounded)
Expand Down Expand Up @@ -1464,10 +1475,10 @@ struct DefaultRoomLifecycleManagerTests {
contributorDetachOperation.complete(behavior: .success)
}

// @specOneOf(2/2) CHA-RL4b1 - Tests the case where the contributor has not been attached previously
// @specPartial CHA-RL4b1 - Tests the case where the contributor has not been attached previously (TODO: I have changed the criteria for deciding whether an ATTACHED status change represents a discontinuity, to be based on whether there was a previous ATTACHED state change instead of whether the `attach()` call has completed; see https://github.com/ably/specification/issues/239 and change this back to specOneOf(2/2) once we’re aligned with spec again)
@Test
func contributorAttachEvent_withResumeFalse_withOperationInProgress_withContributorNotAttachedPreviously_doesNotRecordPendingDiscontinuityEvent() async throws {
// Given: A DefaultRoomLifecycleManager, with a room lifecycle operation in progress, and which has a contributor for which a CHA-RL1f call to `attach()` has not previously succeeded
// Given: A DefaultRoomLifecycleManager, with a room lifecycle operation in progress, and which has a contributor for which it has not previously received an ATTACHED state change
let contributor = createContributor()
let manager = await createManager(
forTestingWhatHappensWhenCurrentlyIn: .attachingDueToAttachOperation(attachOperationID: UUID()), // case and ID arbitrary, just care that an operation is in progress
Expand Down

0 comments on commit 8b5aa47

Please sign in to comment.