Add experimental support for sharding event persister. #8170
Conversation
I didn't see anything glaringly incorrect, but did leave quite a few comments. I'm guessing documentation won't be updated until this is not experimental?
```diff
@@ -923,7 +923,8 @@ async def backfill(self, dest, room_id, limit, extremities):
                 )
             )

-        await self._handle_new_events(dest, ev_infos, backfilled=True)
+        if ev_infos:
+            await self._handle_new_events(dest, ev_infos, backfilled=True)
```
Is this if-statement just an optimization?
Oh err, somewhere along the line I think something got very unhappy about empty lists. I forget the details now, or whether it's even necessary 😕
```python
instance = self.config.worker.events_shard_config.get_instance(room_id)
if instance != self._instance_name:
```
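The routing pattern in the snippet above can be sketched as follows. This is a hypothetical, simplified stand-in for the real Synapse internals (`ShardConfig`, `persist_or_forward`, and the hash choice are all assumptions for illustration): each worker deterministically computes which instance owns a room, persists locally if it is the owner, and otherwise forwards over replication.

```python
import zlib


class ShardConfig:
    """Hypothetical stand-in for Synapse's events_shard_config."""

    def __init__(self, instances):
        self.instances = instances

    def get_instance(self, room_id: str) -> str:
        # Deterministic hash so every worker agrees on which writer owns a room.
        return self.instances[zlib.crc32(room_id.encode()) % len(self.instances)]


def persist_or_forward(config, current_instance, room_id):
    # Mirrors the diff: if another instance owns the room, forward the work to
    # it; otherwise persist the events locally.
    instance = config.get_instance(room_id)
    if instance != current_instance:
        return ("forward", instance)
    return ("persist", current_instance)
```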
Edit: I wrote the below before realizing that `instance` is also used in the `_send_events` call below, but I thought it was worth leaving in case it jiggles a good idea loose:

It looks like this pattern is used quite a bit; the comments for `ShardedWorkerHandlingConfig` say to prefer `should_handle`, which seems like it could be used here:

```python
if self.config.worker.events_shard_config.should_handle(self._instance_name, room_id):
```

Although I think some of this could be simplified more if `ShardedWorkerHandlingConfig` knew what the current instance was (maybe `should_handle` wouldn't need the `instance` passed in?)
*nods*. The problem with giving `ShardedWorkerHandlingConfig` the current instance name is that I don't think we know what the instance name is during config parsing (outside of the worker config parsing).
Actually, I think we do need to use `should_handle` first to technically conform to the docs of `ShardedWorkerHandlingConfig`.
Though I guess the fact that we go on to do an HTTP replication hit means that `get_instance` has to work.
I think it is OK to use `get_instance` here; that and `should_handle` should have matching logic, after all!
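The invariant agreed on above (that `should_handle` and `get_instance` can never disagree) can be sketched by defining one in terms of the other. This is a hypothetical minimal version for illustration; the class and method names are borrowed from the discussion, not from Synapse's actual implementation:

```python
import zlib


class ShardedWorkerHandlingConfig:
    """Hypothetical sketch: route each key to exactly one instance."""

    def __init__(self, instances):
        self.instances = instances

    def get_instance(self, key: str) -> str:
        # Stable hash of the key picks a single owning instance.
        return self.instances[zlib.crc32(key.encode()) % len(self.instances)]

    def should_handle(self, instance_name: str, key: str) -> bool:
        # Defined in terms of get_instance, so the two can never disagree.
        return self.get_instance(key) == instance_name
```

With this shape, every key is handled by exactly one instance, and a caller that checks `should_handle` before forwarding to `get_instance` gets consistent answers.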
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Yeah, I really, really don't want people using this at all. Not even in tests (I have a branch with working SyTests, but they run very slowly).
@erikjohnston So I think this looks good! Not sure if you're looking for a ✅ or want to add tests?
I think this is related to #7986.
So I was waiting for SyTests to land, but actually it turns out that with this current implementation they take forever. This is because of the

Given this is completely undocumented, I'm tempted to merge as is, and then require tests for the next phase (which is augmenting the
This reverts commit 82c1ee1.
This is *not* ready for production yet. Caveats:

1. We should write some tests...
2. The stream token that we use for events can get stalled at the minimum position of all writers. This means that new events may not be processed and e.g. sent down sync streams if a writer isn't writing or is slow.
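The second caveat above can be illustrated with a small sketch. This is not Synapse's actual token implementation, just a hypothetical model of why a single "current token" over multiple writers stalls at the slowest writer: readers may only advance to the minimum persisted position, since anything beyond it might still be in flight on a lagging (or idle) writer.

```python
def current_token(writer_positions: dict) -> int:
    # Readers can only safely advance to the minimum persisted position
    # across all writers; positions past it may not be fully persisted yet.
    return min(writer_positions.values())


positions = {"persister1": 110, "persister2": 42}
# persister1 has written up to 110, but sync streams can only be told about
# events up to 42 until persister2 advances (or reports that it is idle).
```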
* commit '0d4f614fd':
  * Refactor `_get_e2e_device_keys_for_federation_query_txn` (#8225)
  * Add experimental support for sharding event persister. (#8170)
  * Add /user/{user_id}/shared_rooms/ api (#7785)
  * Do not try to store invalid data in the stats table (#8226)
  * Convert the main methods run by the reactor to async. (#8213)
This is not ready for production yet. Caveats:
Probably worth looking at this commit by commit. There is a `FIXME` in `Implement config...` that is dealt with by the federation handler refactor.