-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Fix exception thrown when attempting to add an appservice sender
to user_directory
#11026
Conversation
Fix #11025 Before in [`user_directory_handler._handle_deltas`, we just checked for `is_support_user(user_id)`](https://github.com/matrix-org/synapse/pull/10960/files#diff-e02a9a371e03b8615b53c6b6552f76fc7d3ef58931ca64d28b3512caf305449fL232) which works just fine. Now with #10960, we [call `should_include_local_user_in_dir`](https://github.com/matrix-org/synapse/pull/10960/files#diff-e02a9a371e03b8615b53c6b6552f76fc7d3ef58931ca64d28b3512caf305449fR229) which does a [bunch of additional checks](https://github.com/matrix-org/synapse/blob/e79ee48313404abf8fbb7c88361e4ab1efa29a81/synapse/storage/databases/main/user_directory.py#L382-L398) which aren't all compatible with the main appservice sender (main bridge user/sender). More specifically, we can't check the `users` database table for whether the user is deactivated. In the `should_include_local_user_in_dir` checks, we should return early if we encounter a main appservice sender before the incompatible checks.
…rowing-exception-when-interacting-with-appservice-sender
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh dear. Thank you for spotting and fixing this!
I don't fully understand how this was triggered though---I think synapse must have been processing a membership event for an appservice sender? I'm not sure what that has to do with notifying the sender.
changelog.d/11026.bugfix
Outdated
@@ -0,0 +1 @@ | |||
Fix exception thrown when attempting to notify appservice `sender` of new messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth clarifying that this fixes a bug that's only shown up on develop
and not in any released builds.
I think if this exception was thrown then we would fail to make progress in the user directory background process, which is more user-visible. Maybe something like this?
Fix exception thrown when attempting to notify appservice `sender` of new messages. | |
Fix a bug in development builds where the user directory would stop updating after an appservice sender changed room membership. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bug has only shown up on develop and isn't affecting a release, then the changelog should be the same as the PR that introduced the bug.
Fix exception thrown when attempting to notify appservice `sender` of new messages. | |
Fix a long-standing bug where rebuilding the user directory wouldn't exclude support and disabled users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like if we just have the same changelog as the regression PR, we're missing out on explaining the additional behavioral change here, "Exclude application service sender membership from user directory." Although then this gets into the "what changed" vs "why changed"
Fix exception thrown when attempting to notify appservice `sender` of new messages. | |
Exclude application service sender membership from user directory to fix a bug stopping the user directory from being populated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like if we just have the same changelog as the regression PR, we're missing out on explaining the additional behavioral change here.
Agreed. I hadn't grokked that the application service sender might not be covered by get_if_app_services_interested_in_user
, and neither did the old code before I started mucking about with this!
Could I get another pair of eyes from @matrix-org/synapse-core to check my understanding of what's happened here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly I'm not sure the changes proposed are the right solution to the problem you're trying to fix.
changelog.d/11026.bugfix
Outdated
@@ -0,0 +1 @@ | |||
Fix exception thrown when attempting to notify appservice `sender` of new messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the bug has only shown up on develop and isn't affecting a release, then the changelog should be the same as the PR that introduced the bug.
Fix exception thrown when attempting to notify appservice `sender` of new messages. | |
Fix a long-standing bug where rebuilding the user directory wouldn't exclude support and disabled users. |
@@ -383,6 +383,10 @@ async def should_include_local_user_in_dir(self, user: str) -> bool: | |||
"""Certain classes of local user are omitted from the user directory. | |||
Is this user one of them? | |||
""" | |||
# The main app service sender isn't usually contactable, so exclude them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. What's the "main app service sender"? And why isn't it usually contactable? I initially thought it was the sender_localpart
of the application service but some bridges (e.g. the WhatsApp one) can be configured through a DM with that user, which doesn't fit into what I would consider not contactable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the comment to explain what the "appservice sender" is and the nuance around some bridges being contactable where others are not and our decision to opt to exclude them for now.
# We're opting to exclude the appservice sender (user defined by the
# `sender_localpart` in the appservice registration) even though
# technically it could be DM-able. In the future, this could potentially
# be configurable per-appservice whether the appservice sender can be
# contacted.
It's pretty hard to define this user succinctly, or at all and have it be clear and obvious. Any ideas?
This is the single user defined by sender_localpart
but it gets converted into a full MXID and set as the sender
on an ApplicationService
,
synapse/synapse/config/appservice.py
Lines 111 to 115 in 593eeac
localpart = as_info["sender_localpart"] | |
if urlparse.quote(localpart) != localpart: | |
raise ValueError("sender_localpart needs characters which are not URL encoded.") | |
user = UserID(localpart, hostname) | |
user_id = user.to_string() |
synapse/synapse/config/appservice.py
Lines 165 to 171 in 593eeac
return ApplicationService( | |
token=as_info["as_token"], | |
hostname=hostname, | |
url=as_info["url"], | |
namespaces=as_info["namespaces"], | |
hs_token=as_info["hs_token"], | |
sender=user_id, |
An "application service user" is different as that is any MXID that matches the exclusive regexes in the application service registration file.
sender
sender
to user_directory
sender
to user_directory
sender
to user_directory
sender
to user_directory
sender
to user_directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for umming and aahing on this one. It's been tricky to wrap my head around.
I think we should go ahead with these changes and plan to improve the situation in the future for app services.
I'd like to see some tweaks to the test coverage. I think we should ensure that the appservice sender created by the test doesn't match the appservice's regex (to make sure the test hits the new code path that @MadLittleMods added). I think I'd also like to see a test analogous to test_excludes_appservices_user
which joins the appservice sender to a room.
I think it might be easiest if I give that a go myself and chuck it on the end of this PR. I hope that's okay @MadLittleMods. And to reiterate, I'm very grateful to you for spotting this in the first place!
I don't think this actually improves coverage as such, but I'll sleep better at night having this case checked!
That's the current behaviour (I didn't realise they typically don't match their own user regex).
Arg. Two tests are failing. One in the handler, for reasons I don't understand yet. A second in the dir rebuild code. That one fails because I didn't update the test after I tried to make appservice users not excluded. ... But unfortunately they're not fully included either, because I made the room processing step only add user_directory entries for remote users. I thought that was safer, clearer, and more robust in terms of avoiding per-room name leaks. But appservice senders don't have entries in the users table, so they're missed. I can add another background process to explicitly add AS senders to the directory. I don't like adding this special case here but... I think I prefer that to muddling the already scary logic of the room and user rebuild steps. I admit it's a niche edge case too. (E.g. if an appservice is configured but belongs to 0 rooms and "search all users" is configured on, I'd expect to be able to search for the appservice in the directory.) |
This one got a bit confused when I was trying to get this done last night. I took it forward on #11053 . Sorry for muddling things here. |
Fix exception thrown when attempting to add an appservice
sender
to `user_directory``.Fix #11025
This regressed in #10960 where we call
should_include_local_user_in_dir
which does a bunch of additional checks which aren't all compatible with the main appservicesender
(main bridge user/sender). More specifically when we callget_user_deactivated_status
for an application service sender, we can't check theusers
database table for whether the user is deactivated.Before #10960, in
user_directory_handler._handle_deltas
, we just checked foris_support_user(user_id)
which works just fine.Exception thrown:
Dev notes
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.Pull request includes a sign off