Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC2437: Store tagged events in Room Account Data #2437
base: old_master
Are you sure you want to change the base?
MSC2437: Store tagged events in Room Account Data #2437
Changes from 2 commits
11446d1
a52600f
f5d2479
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth being clear here that the words in the list are user-defined and not something the spec even intends to go into (I don't think it makes sense to spec some "pre-defined keywords").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for using the same tagging scheme as other tags :D
Reusability ftw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make more sense to attach
m.hidden
to the event itself, so that it is only fetched if the event is? There is no sense transferring a large number of hidden tags on every update.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tags and all tag types are retrieved at once. For large numbers of tags it may make sense to load only specific types of tags or load them incrementally. For example if I have been using Matrix for a decade I don't want to repeatedly transfer 1000 "remembered" messages or similar. This could become a real issue with custom tags as the user may be using them to categorize incoming messages and the number of tags may grow very quickly.
I think that it is probably best to use a different storage system that can optimize common use cases.
O(1)
.O(1)
.n
tags of a given type should beO(n)
O(n)
.O(added+removed)
.Currently all of these are
O(total number of tags)
which seems unacceptable. The exception is syncing when no tags have been changed in which case it isO(1)
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more ideas to address the race, somewhat applicable to account data in general:
Encode the event id in the account data type like
m.tagged_events.<event id>
. We don't currently have the habit of encoding ids in the type though and likely need to escape.
in the event id. We'd also need to extend theGET /account_data/
endpoint to accept a match by prefix rather than exact match for the type so you can listm.tagged_events.*
.Another way would be to do something akin to state events, add some
key
property which you can then specify when requestingGET /account_data/<type>/<key>
. Form.tagged_events
the key would be the event id. Form.direct
it could be the user id of the DM target.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A leftover comment from the previous review - there is an important difference between
m.direct
and this kind of data:m.direct
is defined in global account data, which greatly increases the race scope.m.direct
can be updated by running clients with no user interaction - e.g. when a DM is joined from another side, a client may setm.direct
at exactly the same time the inviter updatedm.direct
for another room - which make the race condition uncontrollable by the user.In contrast, the tagged events here are defined on the room level; and all changes to them originate from the user interaction (at least that's my impression from the MSC). This means that one user has to change the list on two different clients at once, which is certainly possible but not very practical; and the mitigation for that boils down to "just don't edit your tagged events on two phones at the same second".
This is why I think that the existing generic
account_data
API should usually be not too problematic for the purpose, and the robust list data management API is only a nice to have addition.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A guiding principle here is whether or not the schema is complicated enough to need an API on top of it. Something like pinned events doesn't get its own API because it's ~3 lines of code in most languages to do the operation, and the risk of overwriting other people's changes is largely a social problem and not a technical one (tell your friend to stop doing things so you can do it instead).
For this particular API, it's a complicated operation to add/remove things to the account data so a set of endpoints could be useful. The overlapping writes will still be a problem, but at least we can make it easier on client authors.
I'd suggest something like the following:
There would be an open question of whether or not we need an API to get all events tagged as a certain key, but I think the
/tagged_events
API proposed here would be fine enough for clients to locally filter.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RESTful approach would bring something like:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think deleting by index is generally a good idea, exactly because it's very prone to races. Some kind of a locally-unique identifier would be much better here (and then, again pretty much in line with REST principles it could be
DELETE /.../account_data/<type>/<element_id>
.I personally like this approach more than devising API endpoints for particular account data types. In particular, it's very well extensible to custom account data (and it's not too much of a hassle to manage where these generic endpoints can/cannot be used in
m.
namespace). I believe that it would be a great subject for a separate MSC though.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - can I have an unstable prefix for this?