MSC2346: Bridge information state event #2346

219 changes: 219 additions & 0 deletions proposals/2346-bridge-info-state-event.md
@@ -0,0 +1,219 @@
# MSC 2346: Bridge information state event

Many rooms across the Matrix network are currently bridged into third party networks, using bridges.
However the spec does not contain a cross-federated method to determine which networks are
bridged into a given room.

There exists a way to do this in a local setting, by using the [/thirdparty/location](https://matrix.org/docs/spec/application_service/r0.1.2#get-matrix-app-v1-thirdparty-protocol-protocol) API,
but this creates a split-brain view across the federation, which is an unacceptable situation.

Many users have taken to peeking at the list of aliases for a giveaway alias like `#freenode_` or
looking for bridge bots or users with a `@_discord_` prefix. This is an unacceptable situation,
as it assumes prior knowledge of these networks and an understanding of how bridges operate.

## Proposal

This proposal attempts to address this problem by providing a single state event for each bridge in a room
to announce which channels have been bridged into a room.

It should be noted that this MSC is intended to provide the baseline needed to display information about
a bridge, and nothing more. See the "Future MSCs" section for more information.

This proposal is heavily based upon my previous attempt [#1410](https://github.com/matrix-org/matrix-doc/issues/1410)
albeit with a notably reduced set of features. The aim of this proposal is to offer information about the
bridged network and nothing more.

### `m.bridge`

```js
{
    "state_key": "org.matrix.appservice-irc://{protocol.id}/{network.id}/{channel.id}",
Member

What should be the format of the state key if network ID/channel ID are omitted? e.g. foo.bar.appservice-skype://skype//someid or foo.bar.appservice-skype://skype/someid? (do we just drop the component and keep all the slashes, or do we collapse slashes?)

Contributor

It probably makes sense for the format of what comes after :// to be dependent on the value before ://. Here {protocol.id}/{network.id}/{channel.id} would be the format for org.matrix.appservice-irc://; but for foo.bar.appservice-skype:// it would always be {protocol.id}/{chat.id}

    "type": "m.bridge",
    "content": {
        "creator": "@alice:matrix.org", // Optional
Member

why do we need this? Widgets already suffer from this being unreliable and unhelpful, to the point of us ignoring it.

Contributor Author

> to the point of us ignoring it.

image

How's that going for you?

This is actually the same use case that widgets are using it for right now, it's just sugar to point at whoever added the bridge, if it was added by a user.

Contributor Author

Oh, you may have been referring to it duplicating the sender field. This is intentional, for the plumbing use case. I don't see any reason why this would be unreliable though.

Member

It's not really trusted information, though I guess if somehow the bridge is limited to being able to send it then it can be trusted. The reason we don't use creatorUserId for widgets, even if someone else edits the widget, is because it is displayed so prominently and can cause lies to be shown to the user.

Member

The major concern here is essentially a room admin/mod hopping into the room and changing the state event to say that @bob:example.org is responsible for the bridge, who may or may not even be a real user ID. One of the downsides for Bob in that case would be spam about bridging to somewhere controversial. In a general scenario though, as an informational thing it's probably okay, but it would be nice to acknowledge the privacy/social risks in the security section.

        "protocol": {
            "id": "irc",
            "displayname": "IRC", // Optional
            "avatar": "mxc://foo/bar", // Optional
            "external_url": "https://example.com" // Optional
        },
        "network": { // Optional
            "id": "freenode",
            "displayname": "Freenode", // Optional
            "avatar": "mxc://foo/bar", // Optional
            "external_url": "irc://chat.freenode.net" // Optional
        },
        "channel": {
Contributor

Nit: Why are we calling this a channel? In Matrix we call these rooms so doesn't it make sense to use the local terminology. Basically "the thing in the source protocol which maps to a room in Matrix". For example if the remote network has groups and Matrix has rooms why are we calling the field channel?

            "id": "#friends",
            "displayname": "Friends", // Optional
            "avatar": "mxc://foo/bar", // Optional
            "external_url": "irc://chat.freenode.net/#friends" // Optional
        },
        // Custom vendor-specific keys
        "org.matrix.appservice-irc.room_mode": "+sn"
    },
    "sender": "@appservice-irc:matrix.org"
}
```

The `state_key` must be composed of the bridge's prefix, followed by `://`, then the `protocol.id`, the `network.id`
and the `channel.id`, separated by `/`.
Member

line wrap.

Also, what is the bridge's prefix? What is enforcing this?

Contributor Author (@Half-Shot, Feb 7, 2020)

> The bridge prefix can be anything, but should uniquely identify the bridge software. E.g. The matrix.org IRC bridge matrix-org/matrix-appservice-irc becomes org.matrix.appservice-irc. This is to help distinguish two bridges on different softwares which may conflict.

Is a pretty good description of what it is ?

What's enforcing it? Nothing. It's here to uniquely identify one piece of bridge software to ensure it doesn't conflict with another.

Any `/`s must be escaped into `%2F`. The bridge prefix can be anything, but should uniquely identify the bridge software
Member

Can we just link to an RFC about URL encoding?

Contributor Author

This isn't a URL. It's just a string separated by /s and the only character that requires escaping is the /.

that consumes the event. For example, the matrix.org IRC bridge `matrix-org/matrix-appservice-irc` becomes `org.matrix.appservice-irc`.
This helps distinguish two bridges built on different software which may otherwise conflict.
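
As a rough illustration of the rules above, the state key could be assembled as follows. This is a minimal sketch, not part of the proposal: `buildBridgeStateKey` and its parameter names are hypothetical, and it assumes all three components are present, because the format for an omitted `network.id` is not defined here.

```js
// Hypothetical helper, not defined by this proposal.
// Builds "{prefix}://{protocol.id}/{network.id}/{channel.id}".
function buildBridgeStateKey(bridgePrefix, protocolId, networkId, channelId) {
    const parts = [protocolId, networkId, channelId];
    for (const part of parts) {
        // Assumption: every component is required. The proposal does not say
        // what the key looks like when the network is omitted.
        if (typeof part !== "string" || part.length === 0) {
            throw new TypeError("protocol, network and channel IDs must be non-empty strings");
        }
    }
    // "/" is the only character the proposal asks to be escaped.
    const escaped = parts.map((part) => part.replace(/\//g, "%2F"));
    return `${bridgePrefix}://${escaped.join("/")}`;
}

const exampleKey = buildBridgeStateKey(
    "uk.half-shot.matrix-github",
    "github",
    "matrix-org/matrix-doc",
    "2346"
);
console.log(exampleKey);
```

The `/` inside the network ID is escaped, so the key still splits into exactly three components after the `://`.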

The `sender` should be the MXID of the bridge bot.
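
When a bridge bot publishes the event through the client-server API, the state key ends up in the request path, so it needs percent-encoding on top of the `%2F` escaping. The sketch below only builds the path. `bridgeStatePath` is a hypothetical helper, and the `/_matrix/client/v3` prefix is an assumption about which client-server API version the homeserver exposes.

```js
// Hypothetical helper: builds the path for
// PUT /rooms/{roomId}/state/{eventType}/{stateKey} with eventType "m.bridge".
function bridgeStatePath(roomId, stateKey) {
    // Both values are path segments, so they are encoded in full.
    // An already-escaped "%2F" in the state key becomes "%252F" here.
    const room = encodeURIComponent(roomId);
    const key = encodeURIComponent(stateKey);
    return `/_matrix/client/v3/rooms/${room}/state/m.bridge/${key}`;
}

const examplePath = bridgeStatePath(
    "!room:example.org",
    "uk.half-shot.matrix-github://github/matrix-org%2Fmatrix-doc/2346"
);
console.log(examplePath);
```

Decoding the last path segment once on the server side yields the original state key, with its `%2F` escape intact.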
Contributor

I think it would be better to have a dedicated field for the identity of the bridge bot rather than it having to be the sender. My main motivation for this is potential future uses of this type of event to request provisioning of bridges (i.e. I send this event into the room, invite the bridge bot user, the bridge reads the state event and sets up the bridge).

While I appreciate that this isn't really the scope of this MSC (which is fine), standardising on the only entity to send these state events is the bridge bot seems very limiting for future iterations and extensions.

Contributor Author

This might also help solve #2346 (comment) as we don't assume the sender is the bridge.


The `creator` field is the MXID of the *user* who provisioned the bridge. For alias-based bridges, where the
creator is not known, it may be omitted.

The `protocol` field describes the protocol that is being bridged. For example, it may be "IRC", "Slack", or "Discord". This
field does not describe the low level protocol the bridge is using to access the network, but a common user recognisable
name.

The `network` field should be information about the specific network the bridge is connected to.
It's important to make the distinction here that this does *NOT* describe the protocol name, but the specific network
the user is on. For protocols that do not have the concept of a network, this field may be omitted.

The `channel` field should be information about the specific channel the room is connected to.

The `id` field is case-insensitive and should be lowercase.

The `network`, `channel` and `protocol` fields can contain `displayname` and `avatar` keys. The `displayname` is meant to be a human readable identifier for the item in question, whereas the `id` should be a unique identifier relevant to the protocol. If no `displayname` is given, the `id` should be used in its place. The `avatar` key is an MXC URI which refers to an image file, similar to a user or room avatar.

The `external_url` key is an optional link to a connected channel, network or protocol that works in much the same way as
`external_url` works for bridged messages in the AS spec.

In terms of hierarchy, the protocol can contain many networks, which can contain many channels.
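
A client might turn the event content into display labels along these lines. This is only a sketch under stated assumptions: `labelFor` and `describeBridge` are hypothetical names, and treating an empty `displayname` the same as a missing one is a choice made here, not something the proposal specifies.

```js
// Hypothetical helper: prefer the displayname, fall back to the id.
function labelFor(item) {
    if (!item) {
        return undefined; // e.g. the optional "network" field was omitted
    }
    const hasName = typeof item.displayname === "string" && item.displayname.length > 0;
    return hasName ? item.displayname : item.id;
}

// Hypothetical helper: one label per level of the
// protocol -> network -> channel hierarchy.
function describeBridge(content) {
    return {
        protocol: labelFor(content.protocol),
        network: labelFor(content.network),
        channel: labelFor(content.channel),
    };
}

const exampleLabels = describeBridge({
    protocol: { id: "github", displayname: "GitHub" },
    network: { id: "matrix-org/matrix-doc" },
    channel: { id: "2346", displayname: "MSC2346: Bridge information state event" },
});
console.log(exampleLabels);
```

Keeping the three levels as separate labels, rather than joining them into one string, leaves the layout decision to the client.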

Member

Can we have examples of the protocol.id/network.id/channel.id part of the state key? e.g. for XMPP, would it be XMPP/xsf@muc.xmpp.org or XMPP//xsf@muc.xmpp.org (i.e. do we drop a / if there's no network field? What if there are more levels (e.g. protocol, network, community, room)? Is the network supposed to be human-readable like "Freenode" or something more like "freenode.org"? Maybe it would be better to just say that it's a path representing the hierarchy starting from the protocol and ending in the room? And then instead of hard-coding protocol, network, and channel as keys in the content, make it an array where the first element is the protocol and the last element is the channel?

Also, if a protocol/network name has a / in it, does it get escaped?

Contributor Author

I have added several examples of the event body to the description that might help make this more understandable.

Contributor Author

> And then instead of hard-coding protocol, network, and channel as keys in the content, make it an array where the first element is the protocol and the last element is the channel.

Potentially, though I am worried this might make it harder for clients to show a sensible UI. How would riot render:

["Github", "matrix-org", "matrix-doc", "2346"]
vs
{"protocol": "GitHub", "network": "matrix-org/matrix-doc", "channel": "2346"}

(Simplified for readability, and using a deliberately complex example).

In the first example, there are 4 keys and it's hard for a client to decide how to format this in a settings page. Joining them with a delimiter is too ugly (to me). There are probably examples which are restricted by the 3 component limit, but I am struggling to come up with any?

Member

Does the client actually need to care about the state key? The information embedded into the state key is already available in the body, and clients would want to use a more friendly name anyways. I think we can just use an empty state key and let clients figure it out.

If you're running multiple bridges off the same bridge bot, don't.

Contributor Author

> If you're running multiple bridges off the same bridge bot, don't.

Or if you are running several bridges off different bridge bots, you will still need different state events and therefore different keys.

Why can't we use user_ids? We could, but that does forever tie your bridge to using one user_id for life when the actual thing the bridge is "keyed" off is the protocol, network and channel.

Contributor Author

The client doesn't need to give a heck about the state key, other than it being more readable for those who want to work on it. Given there isn't really a downside to having a schema for the state key, and it gives more readability, I don't see why not.

Contributor

I would much rather that we don't prescribe the state key. Just say that it needs to start with the bridge ID (unique prefix) and the rest is implementation defined. Any meaningful values should be delegated to the context of the state object.

The event may contain information specific to the bridge in question, such as the mode for the room in IRC. These keys
should be prefixed by the bridge's name. Clients may be capable of displaying this extra information and are free to do so.
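
A client that wants to show this extra information could collect the vendor-specific keys for one bridge as sketched below. `vendorKeys` is a hypothetical helper, and matching on the bridge prefix followed by a `.` is an assumption based on the examples in this proposal.

```js
// Hypothetical helper: returns the vendor-specific keys for one bridge,
// with the "{bridgePrefix}." part stripped from each key name.
function vendorKeys(content, bridgePrefix) {
    const wanted = `${bridgePrefix}.`;
    const result = {};
    for (const [key, value] of Object.entries(content)) {
        if (key.startsWith(wanted)) {
            result[key.slice(wanted.length)] = value;
        }
    }
    return result;
}

const exampleExtras = vendorKeys(
    {
        protocol: { id: "irc" },
        channel: { id: "#friends" },
        "org.matrix.appservice-irc.room_mode": "+sn",
    },
    "org.matrix.appservice-irc"
);
console.log(exampleExtras);
```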

### Example Content

#### XMPP

An example of a straightforward messaging bridge, such as the XMPP (bifrost) bridge:

```js
{
    "state_key": "org.matrix.matrix-bifrost://xmpp/muc.xmpp.org/xsf@muc.xmpp.org",
    "type": "m.bridge",
    "content": {
        "creator": "@alice:matrix.org",
        "protocol": {
            "id": "xmpp",
            "displayname": "XMPP"
        },
        "network": {
            "id": "muc.xmpp.org",
            "displayname": "XSF",
            "external_url": "xmpp:muc.xmpp.org"
        },
        "channel": {
            "id": "xsf@muc.xmpp.org",
            "displayname": "XSF Discussion",
            "external_url": "xmpp:xsf@muc.xmpp.org"
        }
    },
    "sender": "@xmpp:matrix.org"
}
```

#### GitHub

An example of a non-messaging bridge, such as the GitHub bridge:

```js
{
    "state_key": "uk.half-shot.matrix-github://github/matrix-org%2Fmatrix-doc/2346",
    "type": "m.bridge",
    "content": {
        "creator": "@alice:matrix.org",
        "protocol": {
            "id": "github",
            "displayname": "GitHub"
        },
        "network": {
            "id": "matrix-org/matrix-doc",
            "external_url": "https://github.com/matrix-org/matrix-doc"
        },
        "channel": {
            "id": "2346",
            "displayname": "MSC2346: Bridge information state event",
            "external_url": "https://github.com/matrix-org/matrix-doc/pull/2346"
        },
        "uk.half-shot.matrix-github.merged": false,
        "uk.half-shot.matrix-github.opened_by": "Half-Shot"
    },
    "sender": "@github:matrix.org"
}
```

#### Mastodon feed

An example of a feed-oriented bridge:

```js
{
    "state_key": "org.matrix-org.matrix-mastodon://mastodon/mastodon.matrix.org/@matrix",
    "type": "m.bridge",
    "content": {
        "creator": "@alice:matrix.org",
        "protocol": {
            "id": "mastodon",
            "displayname": "Mastodon"
        },
        "network": {
            "id": "mastodon.matrix.org",
            "external_url": "https://mastodon.matrix.org"
        },
        "channel": {
            "id": "@matrix",
            "displayname": "Matrix.org",
            "external_url": "https://mastodon.matrix.org/@matrix"
        },
        "org.matrix-org.matrix-mastodon.bio": "An open standard for decentralised persistent communication. Toots by @matthew, @Amandine & co.",
        "org.matrix-org.matrix-mastodon.joined": "May 2017"
    },
    "sender": "@mastodon:matrix.org"
}
```

Note the `@` in this case helps distinguish the type of channel. Here the protocol used is "Mastodon" rather than "ActivityPub".
While the underlying protocol might indeed be ActivityPub, the choice of name should be recognisable to users.

## `/_matrix/app/v1/thirdparty/location`

This proposal does NOT seek to deprecate the `location` API, even though this spec effectively supersedes it in most respects.
A future MSC may choose to remove it, however.

## Potential issues
Member

Missing: what if the bridge doesn't have permission to send state events? (a completely valid thing to do)

Contributor Author

Such is life. I don't think we can engineer around users not giving their bridges permission to add state. It's not great, but it's better than adding lots of sketchy outside-of-the-room data (though matrix.org bridges in both portal and plumbed rooms have PL50 by default, so this is relatively unlikely).

Member

I'm a lot more concerned with the bridges that aren't on matrix.org, given the popularity of bridges in the last year alone (and the year prior to this thread). Most bridges don't ask for permissions, and a quick poll shows that most non-matrix.org bridges appear to be running without appropriate permissions in those rooms.

This can be considered a room configuration error, but it's still a valid issue that this MSC needs to acknowledge. There's a point where we can't just write off issues as "users should be better".

Member

Missing: what if the bridge doesn't have a bridge bot? (puppet bridges, transparent bridges, etc)

Contributor Author

Again, this is more of a "well, meh". There may be a way to solve this in a future MSC, but let's leave it for then.

Member

not sure that's the best option: can we figure out a way to publish this info?

Contributor Author

Sure. Either a user the bridge has ownership over publishes the information, or a user publishes the information on behalf of the bridge. I don't think it's worth speccing this though, as it's down to implementation how they want to insert the event.

Member

Doesn't have to be a state event. Could start publishing this over EDUs or some other DAG

Contributor Author

But having it in the room dag and a state event allows us to reuse pretty much all logic across the homeservers/clients/bridges. This is important, because we can not only reuse the same API routes when synchronizing the information across clients and bridges, but also re-use the same access control semantics as with other information in the room like names and topics (using PL events).

I would like to hear from others if they also think supporting the use case of a bridgebot-less bridge is important and requires us to invent our own non-room dag or EDU structure. Personally, it feels like a lot of faff for little gain.

Member

A note in the MSC about how the non-bridge-bot user can publish this would be great.


The proposal intentionally sidesteps the 'bridge type' problem: codifying what portal, plumbed and gatewayed bridges look
like in Matrix. For the time being, the event will not contain information about the type of bridge in a room, but merely
information about what it is connected to.

## Alternatives

Some thought has been given to using the third party bridge routes in the AS API to get bridge info,
by calling a specialised endpoint. There are many issues with this, such as the routes not presently working
over federation, as well as requiring the bridge to be online. Using a state event ensures the data is scoped
per room, and can be synchronised and updated over federation.
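
Because the data lives in room state, a client can work out which bridges are present from the state it already has. The following is a sketch under stated assumptions: `currentBridges` is a hypothetical helper, the input is assumed to be an array of state events shaped like the examples in this proposal, and skipping events without `protocol` or `channel` (for instance, events with empty content) is a choice made here rather than a rule from the proposal.

```js
// Hypothetical helper: summarises the m.bridge events found in room state.
function currentBridges(stateEvents) {
    return stateEvents
        .filter((ev) => ev.type === "m.bridge")
        // Assumption: events lacking the required fields are ignored.
        .filter((ev) => ev.content && ev.content.protocol && ev.content.channel)
        .map((ev) => ({
            stateKey: ev.state_key,
            sender: ev.sender,
            protocolId: ev.content.protocol.id,
            channelId: ev.content.channel.id,
        }));
}

const exampleState = [
    { type: "m.room.name", state_key: "", sender: "@alice:matrix.org", content: { name: "Friends" } },
    {
        type: "m.bridge",
        state_key: "org.matrix.appservice-irc://irc/freenode/#friends",
        sender: "@appservice-irc:matrix.org",
        content: { protocol: { id: "irc" }, network: { id: "freenode" }, channel: { id: "#friends" } },
    },
];
console.log(currentBridges(exampleState));
```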

## Future MSCs

(This section is for the benefit of readers, to help them understand why this MSC doesn't contain X feature.)

This proposal forms the basis for bridges to become more interactive with clients as first class citizens rather
than relying upon users having prior knowledge about which users are bridged users, or where a room is bridged to.

Future MSCs could expand the /publicRooms response format to show what network a room is bridged to before the
user attempts to join it. Another potential MSC could allow users to see which bridges they are connected to
via an accounts settings page, rather than relying on PMs to the bridge bot.

## Security considerations

Anybody with the correct PLs to post state events will be able to manipulate a room by sending a bridge
event into a room, even if the bridge is not present or does not exist. It goes without saying that if
you let people modify your room state, you need to trust them not to mess around. A future MSC may allow
users to "trust" some mxids as bridges, rather than relying on just PLs to convey trustworthiness.