Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add back fast path for non-gappy syncs #17064

Merged
merged 3 commits into from
Apr 8, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/17064.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix various long-standing bugs which could cause incorrect state to be returned from `/sync` in certain situations.
28 changes: 28 additions & 0 deletions synapse/handlers/sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -1259,6 +1259,34 @@ async def _compute_state_delta_for_incremental_sync(
await_full_state = True
lazy_load_members = False

# For a non-gappy sync if the events in the timeline are simply a linear
# chain (i.e. no merging/branching of the graph), then we know the state
# delta between the end of the previous sync and start of the new one is
# empty.
#
# c.f. #16941 for an example of why we can't do this for all non-gappy
# syncs.
is_linear_timeline = all(len(e.prev_event_ids()) <= 1 for e in batch.events)
if is_linear_timeline and not batch.limited:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you considered how this behaves in longer-lived forks:

             E1
           ↗    ↖
          |      S2
          |      ↑
          E3     |
          ↑      |
        --|------|----   <- prev sync
          |      |
          E4     E5       
          ↑      |
        --|------|----   <- this sync
          |      |
           ↖    /
             E6                (the distant future)

E4 and E5 both have single prev events, but I'm not convinced it is safe to drop the state delta between the forks here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state returned is currently (timeline_start | timeline_end) - previous_timeline_end - timeline_contains. If we have a linear chain with no gaps, my assumption is that timeline_start == previous_timeline_end and timeline_end == timeline_start + timeline_contains, which then all cancels out.

But bleurghghghghghg you're right that we need to actually check that all the events are in the same chain and point to the previous timeline end. BLEURGH.

(Though does the current code sensibly work in this case?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Though does the current code sensibly work in this case?)

WHO KNOWS

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More constructively: it would be good to add a test for this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having thought about this over lunch:

If we have a linear chain with no gaps, my assumption is that timeline_start == previous_timeline_end

I think with long lived forks this may or may not be true, depending on if the the end and start are part of the same fork. In the present code, we'll come to the wrong answer if that is the case, and in the new code it will come to the wrong answer either way.

Given that, I'm somewhat tempted to take accept that and fix the performance regression. The other option is to do an extra DB hit to check if the event ID corresponding to previous_timeline_end matches the prev event of the start of the timeline.

I think if we want to remove all these edge cases we'll need to change all this to try and use the current state (based on the extremities at the time), though that's quite a big change (but perhaps we can do it for sliding sync?)

state_ids: StateMap[str] = {}
if lazy_load_members:
if members_to_fetch and batch.events:
# We're lazy-loading, so the client might need some more
# member events to understand the events in this timeline.
# So we fish out all the member events corresponding to the
# timeline here. The caller will then dedupe any redundant
# ones.

state_ids = await self._state_storage_controller.get_state_ids_for_event(
batch.events[0].event_id,
# we only want members!
state_filter=StateFilter.from_types(
(EventTypes.Member, member) for member in members_to_fetch
),
await_full_state=False,
)
return state_ids

if batch:
state_at_timeline_start = (
await self._state_storage_controller.get_state_ids_for_event(
Expand Down
Loading