This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

@MadLittleMods MadLittleMods commented Oct 7, 2021

Fix exception thrown when attempting to add an appservice sender to `user_directory`.

Fix #11025

This regressed in #10960, where we call should_include_local_user_in_dir, which does a bunch of additional checks that aren't all compatible with the main appservice sender (main bridge user/sender). More specifically, when we call get_user_deactivated_status for an application service sender, we can't check the users database table for whether the user is deactivated: the sender has no row there, so the lookup raises the StoreError shown below.

Before #10960, in user_directory_handler._handle_deltas, we just checked for is_support_user(user_id) which works just fine.
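
The failure mode can be reproduced in miniature. This is a minimal sketch, not Synapse's actual code: `FakeStore`, its in-memory `users` dict, the simplified `StoreError`, and `demo_regression` are stand-ins invented for illustration. Only the method names `get_user_deactivated_status` and `should_include_local_user_in_dir` come from the traceback below.

```python
import asyncio


class StoreError(Exception):
    """Simplified stand-in for synapse.api.errors.StoreError."""

    def __init__(self, code: int, msg: str) -> None:
        super().__init__(f"{code}: {msg}")
        self.code = code


class FakeStore:
    """Hypothetical in-memory store; not the real Synapse datastore."""

    def __init__(self) -> None:
        # Maps user ID -> deactivated flag, like a tiny `users` table.
        # The appservice sender is deliberately absent: it has no row.
        self.users = {"@alice:test": False, "@bob:test": True}

    async def get_user_deactivated_status(self, user_id: str) -> bool:
        if user_id not in self.users:
            raise StoreError(404, "No row found")
        return self.users[user_id]

    async def should_include_local_user_in_dir(self, user_id: str) -> bool:
        # Regressed shape: every local user is looked up in `users`,
        # including users that were never registered there.
        if await self.get_user_deactivated_status(user_id):
            return False
        return True


def demo_regression() -> str:
    """Ask about an appservice sender and report what happens."""
    store = FakeStore()
    try:
        asyncio.run(store.should_include_local_user_in_dir("@bridge-bot:test"))
    except StoreError as e:
        return str(e)
    return "no error"
```

Calling `demo_regression()` in this sketch surfaces the same kind of "no row" error as the real lookup, while ordinary users in the fake table are handled normally.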


Exception thrown:

2021-10-07 16:59:56,530 - synapse.metrics.background_process_metrics - 215 - ERROR - user_directory.notify_new_event-1147 - Background process 'user_directory.notify_new_event' threw an exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/metrics/background_process_metrics.py", line 213, in run
    return await maybe_awaitable(func(*args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/user_directory.py", line 119, in process
    await self._unsafe_process()
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/user_directory.py", line 170, in _unsafe_process
    await self._handle_deltas(deltas)
  File "/usr/local/lib/python3.8/site-packages/synapse/handlers/user_directory.py", line 229, in _handle_deltas
    ) or await self.store.should_include_local_user_in_dir(state_key)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/user_directory.py", line 389, in should_include_local_user_in_dir
    if await self.get_user_deactivated_status(user):
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1657, in _inlineCallbacks
    result = current_context.run(
  File "/usr/local/lib/python3.8/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/databases/main/registration.py", line 929, in get_user_deactivated_status
    res = await self.db_pool.simple_select_one_onecol(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 1414, in simple_select_one_onecol
    return await self.runInteraction(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 686, in runInteraction
    result = await self.runWithConnection(
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 791, in runWithConnection
    return await make_deferred_yieldable(
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/threadpool.py", line 254, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/compat.py", line 404, in reraise
    raise exception.with_traceback(traceback)
  File "/usr/local/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 786, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 554, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/synapse/storage/database.py", line 1467, in simple_select_one_onecol_txn
    raise StoreError(404, "No row found")
synapse.api.errors.StoreError: 404: No row found

Dev notes

SYNAPSE_TEST_LOG_LEVEL=INFO python -m twisted.trial tests.storage.test_user_directory

SYNAPSE_TEST_LOG_LEVEL=INFO python -m twisted.trial tests.handlers.test_user_directory

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
  • Pull request includes a sign off
  • Code style is correct (run the linters)

Fix #11025

Before in [`user_directory_handler._handle_deltas`, we just checked for `is_support_user(user_id)`](https://github.com/matrix-org/synapse/pull/10960/files#diff-e02a9a371e03b8615b53c6b6552f76fc7d3ef58931ca64d28b3512caf305449fL232) which works just fine.
Now with #10960, we [call `should_include_local_user_in_dir`](https://github.com/matrix-org/synapse/pull/10960/files#diff-e02a9a371e03b8615b53c6b6552f76fc7d3ef58931ca64d28b3512caf305449fR229) which does a [bunch of additional checks](https://github.com/matrix-org/synapse/blob/e79ee48313404abf8fbb7c88361e4ab1efa29a81/synapse/storage/databases/main/user_directory.py#L382-L398) which aren't all compatible with the main appservice sender (main bridge user/sender). More specifically, we can't check the `users` database table for whether the user is deactivated.

In the `should_include_local_user_in_dir` checks, we should return early if we encounter a main appservice sender before the incompatible checks.
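
As a rough illustration of that ordering, here is a sketch under stated assumptions rather than the actual patch. `FakeStore` and its `appservice_senders` set are hypothetical; the real store would have to derive the set of sender MXIDs from the configured application services.

```python
import asyncio
from typing import Dict, Set


class FakeStore:
    """Hypothetical store used only to illustrate the ordering of checks."""

    def __init__(self, users: Dict[str, bool], appservice_senders: Set[str]) -> None:
        self.users = users  # user ID -> deactivated flag
        self.appservice_senders = appservice_senders

    async def get_user_deactivated_status(self, user_id: str) -> bool:
        # Raises KeyError for users without a row, standing in for the
        # "404: No row found" StoreError.
        return self.users[user_id]

    async def should_include_local_user_in_dir(self, user_id: str) -> bool:
        # Return early for appservice senders, before any check that
        # needs a row in the `users` table.
        if user_id in self.appservice_senders:
            return False
        if await self.get_user_deactivated_status(user_id):
            return False
        return True
```

With the early return in place, asking about the sender never reaches the `users` lookup, so the missing row no longer matters.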
@MadLittleMods MadLittleMods added A-Application-Service Related to AS support T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. labels Oct 7, 2021
@MadLittleMods MadLittleMods requested a review from a team as a code owner October 7, 2021 23:06
@MadLittleMods MadLittleMods added the X-Regression Something broke which worked on a previous release label Oct 7, 2021
…rowing-exception-when-interacting-with-appservice-sender
@DMRobertson DMRobertson self-assigned this Oct 8, 2021
Contributor

@DMRobertson DMRobertson left a comment

Oh dear. Thank you for spotting and fixing this!

I don't fully understand how this was triggered, though. I think Synapse must have been processing a membership event for an appservice sender? I'm not sure what that has to do with notifying the sender.

@@ -0,0 +1 @@
Fix exception thrown when attempting to notify appservice `sender` of new messages.
Contributor

It might be worth clarifying that this fixes a bug that's only shown up on develop and not in any released builds.

I think if this exception was thrown then we would fail to make progress in the user directory background process, which is more user-visible. Maybe something like this?

Suggested change
- Fix exception thrown when attempting to notify appservice `sender` of new messages.
+ Fix a bug in development builds where the user directory would stop updating after an appservice sender changed room membership.

Contributor

If the bug has only shown up on develop and isn't affecting a release, then the changelog should be the same as the PR that introduced the bug.

Suggested change
- Fix exception thrown when attempting to notify appservice `sender` of new messages.
+ Fix a long-standing bug where rebuilding the user directory wouldn't exclude support and disabled users.

Contributor Author

I feel like if we just have the same changelog as the regression PR, we're missing out on explaining the additional behavioral change here: "Exclude application service sender membership from user directory." Although then this gets into "what changed" vs. "why changed".

Suggested change
- Fix exception thrown when attempting to notify appservice `sender` of new messages.
+ Exclude application service sender membership from user directory to fix a bug stopping the user directory from being populated.

Contributor

I feel like if we just have the same changelog as the regression PR, we're missing out on explaining the additional behavioral change here.

Agreed. I hadn't grokked that the application service sender might not be covered by get_if_app_services_interested_in_user, and neither did the old code before I started mucking about with this!

@DMRobertson DMRobertson removed their assignment Oct 8, 2021
@DMRobertson DMRobertson requested a review from a team October 8, 2021 10:00
@DMRobertson
Contributor

Could I get another pair of eyes from @matrix-org/synapse-core to check my understanding of what's happened here?

@babolivier babolivier self-assigned this Oct 8, 2021
Contributor

@babolivier babolivier left a comment

Mostly I'm not sure the changes proposed are the right solution to the problem you're trying to fix.

@@ -383,6 +383,10 @@ async def should_include_local_user_in_dir(self, user: str) -> bool:
        """Certain classes of local user are omitted from the user directory.
        Is this user one of them?
        """
        # The main app service sender isn't usually contactable, so exclude them
Contributor

I don't understand this comment. What's the "main app service sender"? And why isn't it usually contactable? I initially thought it was the sender_localpart of the application service but some bridges (e.g. the WhatsApp one) can be configured through a DM with that user, which doesn't fit into what I would consider not contactable.

Contributor Author

I've updated the comment to explain what the "appservice sender" is and the nuance around some bridges being contactable where others are not and our decision to opt to exclude them for now.

# We're opting to exclude the appservice sender (user defined by the
# `sender_localpart` in the appservice registration) even though
# technically it could be DM-able. In the future, this could potentially
# be configurable per-appservice whether the appservice sender can be
# contacted.

It's pretty hard to define this user succinctly, or at all and have it be clear and obvious. Any ideas?

This is the single user defined by sender_localpart, which gets converted into a full MXID and set as the sender on an ApplicationService:

localpart = as_info["sender_localpart"]
if urlparse.quote(localpart) != localpart:
    raise ValueError("sender_localpart needs characters which are not URL encoded.")
user = UserID(localpart, hostname)
user_id = user.to_string()

return ApplicationService(
    token=as_info["as_token"],
    hostname=hostname,
    url=as_info["url"],
    namespaces=as_info["namespaces"],
    hs_token=as_info["hs_token"],
    sender=user_id,
    # (the quoted excerpt ends here)


An "application service user" is different as that is any MXID that matches the exclusive regexes in the application service registration file.
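
To make the distinction concrete, here is a small sketch. The registration values, the `example.org` hostname, and the helper names `sender_mxid` and `is_namespace_user` are made up for illustration, and `re.match` is only an approximation of however the homeserver evaluates namespace regexes.

```python
import re
from typing import List
from urllib.parse import quote


def sender_mxid(sender_localpart: str, hostname: str) -> str:
    """Build the full MXID of the appservice sender from its localpart."""
    if quote(sender_localpart) != sender_localpart:
        raise ValueError("sender_localpart needs characters which are not URL encoded.")
    return f"@{sender_localpart}:{hostname}"


def is_namespace_user(user_id: str, user_regexes: List[str]) -> bool:
    """True if the MXID matches one of the registration's user regexes."""
    return any(re.match(pattern, user_id) for pattern in user_regexes)


# Hypothetical registration values, for illustration only.
registration = {
    "sender_localpart": "bridge-bot",
    "namespaces": {
        "users": [{"exclusive": True, "regex": r"@_bridge_.*:example\.org"}],
    },
}
regexes = [ns["regex"] for ns in registration["namespaces"]["users"]]
sender = sender_mxid(registration["sender_localpart"], "example.org")
```

In this example the sender `@bridge-bot:example.org` does not match the `@_bridge_.*` namespace, so a check that only looks at namespace regexes would not recognise it as belonging to the appservice, whereas `@_bridge_alice:example.org` would be recognised.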

tests/handlers/test_user_directory.py Outdated Show resolved Hide resolved
@MadLittleMods MadLittleMods changed the title Fix exception thrown when attempting to notify appservice sender Fix exception thrown when attempting to add appservice sender to user_directory Oct 8, 2021
@MadLittleMods MadLittleMods changed the title Fix exception thrown when attempting to add appservice sender to user_directory Fix exception thrown when attempting to adding an appservice sender to user_directory Oct 8, 2021
@MadLittleMods MadLittleMods changed the title Fix exception thrown when attempting to adding an appservice sender to user_directory Fix exception thrown when attempting to add an appservice sender to user_directory Oct 9, 2021
Contributor

@DMRobertson DMRobertson left a comment

Sorry for umming and aahing on this one. It's been tricky to wrap my head around.

I think we should go ahead with these changes and plan to improve the situation in the future for app services.

I'd like to see some tweaks to the test coverage. I think we should ensure that the appservice sender created by the test doesn't match the appservice's regex (to make sure the test hits the new code path that @MadLittleMods added). I think I'd also like to see a test analogous to test_excludes_appservices_user which joins the appservice sender to a room.

I think it might be easiest if I give that a go myself and chuck it on the end of this PR. I hope that's okay @MadLittleMods. And to reiterate, I'm very grateful to you for spotting this in the first place!

I don't think this actually improves coverage as such, but I'll sleep
better at night having this case checked!
@babolivier babolivier added the X-Release-Blocker Must be resolved before making a release label Oct 11, 2021
David Robertson added 2 commits October 11, 2021 18:20
That's the current behaviour (I didn't realise they typically don't
match their own user regex).
@DMRobertson
Contributor

Arg. Two tests are failing. One in the handler, for reasons I don't understand yet. A second in the dir rebuild code. That one fails because I didn't update the test after I tried to make appservice users not excluded.

... But unfortunately they're not fully included either, because I made the room processing step only add user_directory entries for remote users. I thought that was safer, clearer, and more robust in terms of avoiding per-room name leaks. But appservice senders don't have entries in the users table, so they're missed.

I can add another background process to explicitly add AS senders to the directory. I don't like adding this special case here but... I think I prefer that to muddling the already scary logic of the room and user rebuild steps.

I admit it's a niche edge case too. (E.g. if an appservice is configured but belongs to 0 rooms and "search all users" is configured on, I'd expect to be able to search for the appservice in the directory.)
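
A toy model of the rebuild may help show why the sender falls through the gap. This is a sketch only: `rebuild_directory` and its arguments are hypothetical, the real rebuild runs as separate background updates against the database, and the third step is just the special case floated above, not something this PR implements.

```python
from typing import Dict, Iterable, List, Set


def rebuild_directory(
    users_table: Iterable[str],
    room_members: Dict[str, List[str]],
    appservice_senders: Iterable[str],
    server_name: str,
) -> Set[str]:
    """Return the set of user IDs that end up in the toy directory."""
    directory: Set[str] = set()

    # Step 1: local users come from the `users` table.
    directory.update(users_table)

    # Step 2: rooms only contribute remote users.
    for members in room_members.values():
        for user_id in members:
            if not user_id.endswith(":" + server_name):
                directory.add(user_id)

    # Step 3 (the proposed special case): appservice senders are local
    # but have no `users` row, so steps 1 and 2 both miss them.
    directory.update(appservice_senders)
    return directory
```

If the third step is given an empty list, a sender that is joined to a room still never appears in the result, which matches the behaviour described in the comment above.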

@DMRobertson
Contributor

This one got a bit confused when I was trying to get this done last night. I took it forward on #11053. Sorry for muddling things here.

Labels
A-Application-Service Related to AS support T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Regression Something broke which worked on a previous release X-Release-Blocker Must be resolved before making a release
Development

Successfully merging this pull request may close these issues.

user_directory.notify_new_event throws exception whenever interacting with application service sender
3 participants