Tag Feature RFC #4

Merged
merged 23 commits into from
Jul 21, 2024

Neshura87
Contributor

No description provided.

Collaborator

phiresky commented Nov 6, 2023

Rendered Version

Collaborator

phiresky commented Nov 6, 2023

I don't have time to read this in too much detail right now, but from a quick glance:

  • Nice, I like it.

  • I don't understand exactly what the URL in the tag ID means; it seems there are three separate concerns:

    1. Identifying which scope the tag applies to: Usually this might either be the community the post is in, or the instance the post is viewed on
    2. Identifying the tag definition: If the tag is supposed to result in standard behaviour federation-wide (e.g. hiding NSFW), then the tag should have a meaning defined either globally or by some (remote) instance.
    3. Identifying who's responsible for moderating a tag

    So basically I'd expect there to be two separate URLs, one to identify the tag and one to identify the scope. e.g.

    { "tag": "lemmy-global-tags:nsfl", "scope": "https://some-instance.org/c/post-community" } for an NSFL tag that is defined for all of Lemmy and

    { "tag": "https://some-instance.org/t/season-5-spoiler", "scope": "https://some-instance.org/" } for a spoiler tag that's defined by some-instance and only valid on some-instance.

    In general, this is connected to how tags should be federated, which I don't think is fully described in the RFC?

  • I think there needs to be a more exact description of what tags are federated. It seems like tags with scope community should be federated everywhere, and with scope instance shouldn't. But what about remote instances adding tags? Seems almost like you should be able to subscribe to tags just like you can subscribe to communities if you want to show remote tags.

  • I'm not sure if this should be out of scope, but I feel like tags should follow a key=value model. So a tag should not just be nsfw, spoiler, fake-news but optionally also have values, like nsfw=gore, spoiler=Game of Thrones Season 5, fake-news=https://source-to-prove-it.
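To make the two suggestions above concrete, here is a minimal sketch of a tag that carries a separate scope and an optional value. The `ScopedTag` class and `parse_tag` helper are hypothetical names chosen for illustration; nothing here is part of the RFC or of Lemmy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedTag:
    """Hypothetical model: tag identity, where it applies, optional value."""
    tag: str                      # identifies the tag definition
    scope: str                    # community or instance URL the tag applies to
    value: Optional[str] = None   # present only for key=value tags


def parse_tag(raw: str, scope: str) -> ScopedTag:
    """Split 'key=value' on the first '=' only, so values may contain '='."""
    key, sep, value = raw.partition("=")
    return ScopedTag(tag=key, scope=scope, value=value if sep else None)


scope = "https://some-instance.org/c/post-community"
plain = parse_tag("nsfw", scope)
valued = parse_tag("fake-news=https://source-to-prove-it?a=b", scope)
```

Splitting on the first `=` only is a design choice of this sketch: it lets a URL with a query string be used as a value without escaping.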

# Reference-level explanation

### Protocol:
According to https://www.w3.org/TR/activitystreams-vocabulary/#dfn-tags a general tag object exists in the ActivityPub protocol.
Collaborator

This is not exactly accurate, reading the spec it says that a tag property, containing an Object list is part of all Objects. This means that we need to define our own Tag Object type(s). So the tags would look more like this:

{
  "type": "Post",
  "tags": [
    {
      "@type": "https://json-ld.lemmy.ml/CommunityTag",
      "id": "https://lemmy.world/t/FooTag",
      "name": "FooTag"
    }
  ],
  ...
}

Collaborator

Meaning: the exact properties the tag object should have should probably be part of this RFC. It should also specify whether there are multiple different tag types or whether they (community + instance tags) share the same interface.

Collaborator

Just for reference, here's how mastodon uses hashtags:
curl -H 'Accept: application/activity+json' https://mastodon.social/@nixCraft/111361899987606218

  "tag": [
    {
      "type": "as:Hashtag",
      "href": "https://mastodon.social/tags/linux",
      "name": "#linux"
    }
  ],

Using the non-standardized ActivityStreams:HashTag type (w3c/activitypub#235)

Member

The tag system for Lemmy seems fundamentally different from hashtags in Mastodon, so I think it's reasonable to federate them in a way that's intentionally incompatible.

Collaborator

Agree - I just put it as a comparison. We should create our own object type.

@Nutomic What is the fundamental difference between tags described in this proposal and hashtags used by microblogging services like Mastodon? I've read the proposal and there are many similarities.


btw, the URL in a highlighted section is not correct. It should be https://www.w3.org/TR/activitystreams-vocabulary/#dfn-tag (not #dfn-tags)

wont-work commented Nov 22, 2023

If Lemmy federates a separate custom-built Tag object instead of Hashtags it has the ability to edit or delete tags without having to send a few billion Updates to individually edit all posts that have said tags attached to them. (Instead it can just Update the individual tag)

This would also allow for a protocol for cross-instance tag moderation to be built, e.g. using Add/Tag Undo/Add/Tag (Accept/Add/Tag and Reject/Add/Tag for tag suggestions?) which AFAIK is impossible with the current design of hashtags (and wouldn't federate to existing implementations anyway)

Contributor Author

What is the fundamental difference between tags described in this proposal and hashtags used by microblogging services like Mastodon? I've read the proposal and there are many similarities.

Mostly Lemmy tags being static as opposed to dynamically user created. The Lemmy tags exist in a database table and can be mass edited if needed since they will be referenced by id. Plus the general use case for tags on Lemmy seems less a "just randomly tagging this so people can find it elsewhere" and more a "this community has broad interests so I'll specify what's roughly inside this post"

Contributor Author

This would also allow for a protocol for cross-instance tag moderation to be built, e.g. using Add/Tag Undo/Add/Tag (Accept/Add/Tag and Reject/Add/Tag for tag suggestions?) which AFAIK is impossible with the current design of hashtags (and wouldn't federate to existing implementations anyway)

Should be possible, changes to tags would still need to be federated but the posts themselves should be able to stay untouched. Simply changing the values in the tag table should "edit" all existing posts automatically.
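As a rough illustration of why referencing tags by id keeps posts untouched on rename, consider the sketch below. It uses an in-memory SQLite database purely as a stand-in; the table and column names (`tag`, `post`, `post_tag`) are assumptions made for this example and are not Lemmy's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Posts point at tags by id only; the tag name lives in one place.
    CREATE TABLE post_tag (
        post_id INTEGER NOT NULL REFERENCES post(id),
        tag_id  INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (post_id, tag_id)
    );
    INSERT INTO tag VALUES (1, 'Meme');
    INSERT INTO post VALUES (10, 'First post'), (11, 'Second post');
    INSERT INTO post_tag VALUES (10, 1), (11, 1);
""")


def tag_names(post_id: int) -> list:
    """Look up the current tag names of a post through the link table."""
    rows = conn.execute(
        "SELECT tag.name FROM post_tag JOIN tag ON tag.id = post_tag.tag_id "
        "WHERE post_tag.post_id = ? ORDER BY tag.name",
        (post_id,),
    )
    return [name for (name,) in rows]


# Renaming touches exactly one row; the post_tag links are left alone.
conn.execute("UPDATE tag SET name = 'Memes' WHERE id = 1")
```

After the single `UPDATE`, every post that links to tag 1 reports the new name, which is the "edit all existing posts automatically" effect described above. Federating that one change to other instances is a separate step that this sketch does not cover.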

Contributor Author

This is not exactly accurate, reading the spec it says that a tag property, containing an Object list is part of all Objects. This means that we need to define our own Tag Object type(s). So the tags would look more like this:

this seems to be correct, I've added the relevant part to the RFC with b6afe07. I think the initial tag object can be kept simple, it's easier to add things later on than remove them imo.

Member

Nutomic commented Nov 6, 2023

In general it's best to implement new Lemmy features from the bottom up, meaning start with the database schema changes, then make the necessary changes to the API structs and the API itself, and finally move on to the frontend. Federation can be considered once these other parts are clear.

Collaborator

phiresky commented Nov 6, 2023

Federation can be considered once these other parts are clear.

I disagree - how tags are federated is a core part of how they work and changes the DB design as well - because it basically describes who the audience of the tag is and influences properties it needs. If you want to reduce the scope to community-post tags and say that the audience of the tag is the same as the audience of the community, then that answers the question of federation. But it does need to be answered IMO. Otherwise it's not clear what a tag is even supposed to be


Tags grant you more fine grained control over the content you see on Lemmy. Certain users do not want to see certain content and tags allow you to do exactly that.

For example, a lot of communities offer mixed content such as Memes, News and Discussions. Some people would like to only see News or Discussions. In such a case they could blacklist (filter out) the "Meme" tag; going forward, they would not be presented with any Memes in that community.
Member

You'd need to clarify, then, who is creating and applying these tags. Because if it's the posters / users themselves, rather than the community moderators, then it won't help you with filtering.

If the tags are completely unmoderated, and anyone can tag something with any tag they want, it would be far worse for that than subscribing or blocking as it exists already.

A potential "curator" moderation level whose entire purpose is to adjust tags could be introduced, though you could definitely make the argument that this is overkill for most kinds of communities.

Contributor Author

The intended curation to start with is that Admins/Moderators create these tags, mostly because Lemmy's current permission system doesn't allow for much of anything else without significant work. Clarified the language in the section a bit (see d894b35).


wont-work commented Nov 22, 2023

re phiresky:

  • I'm not sure if this should be out of scope, but I feel like tags should follow a key=value model. So a tag should not just be nsfw, spoiler, fake-news but optionally also have values, like nsfw=gore, spoiler=Game of Thrones Season 5, fake-news=https://source-to-prove-it.

If I may offer a suggestion: Make tags hierarchical

This doesn't need to be too complex. Just make all tags internally end with a dot (or some other separator), and turn all tag matches into prefix matches. Suddenly you have the ability to check for nsfw, which would be a prefix search for "nsfw." that ends up matching nsfw.gore, nsfw.porn.real and nsfw.porn.artwork.2d. Likewise, a tag like spoiler.game of thrones would match all spoiler.game of thrones.seasonX tags (which in turn would match all spoiler.game of thrones.seasonX.episodeY tags).

Whether you want to show or hide results is up to what you want to do ;)
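The trailing-separator trick can be sketched in a few lines. The function names below (`normalize`, `matches`, `visible`) are hypothetical and only illustrate the idea; a real implementation would presumably do the prefix match in the database.

```python
def normalize(tag: str) -> str:
    """Store every tag with a trailing separator, e.g. 'nsfw.gore.'."""
    return tag if tag.endswith(".") else tag + "."


def matches(filter_tag: str, post_tag: str) -> bool:
    """True if post_tag equals filter_tag or sits below it in the hierarchy."""
    return normalize(post_tag).startswith(normalize(filter_tag))


def visible(post_tags: list, hidden: list) -> bool:
    """A post is hidden as soon as one of its tags falls under a hidden tag."""
    return not any(matches(h, t) for h in hidden for t in post_tags)
```

The trailing dot is what keeps a filter on nsfw from accidentally matching an unrelated tag such as nsfwish: the comparison is between "nsfwish." and "nsfw.", which is not a prefix match.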


Now if you do want to make it complex, add the ability to attach multiple aliases to one "logical" tag. This not only clears up the ugliness that may result from the hierarchical tags above, but also enables extra functionality such as "tag implications": creating a tag with a long hierarchical name and aliasing it to a shorter name would make it imply all the parent tags in the hierarchy, so it gets included in filters related to those tags.

It would also help with federation, as if multiple instances or communities have different names for the same tag, they could all be aliased to each other (whichever one ends up being "the canonical one" will likely be a per-instance/community decision).
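A small sketch of aliases and tag implications, under the assumption that each alias maps to exactly one canonical hierarchical name. The `ALIASES` table and both helper functions are made up for illustration.

```python
# Hypothetical alias table: short display name -> canonical hierarchical tag.
ALIASES = {
    "got-s5": "spoiler.game of thrones.season5",
    "gore": "nsfw.gore",
}


def canonical(tag: str) -> str:
    """Resolve an alias to its canonical name; unknown names pass through."""
    return ALIASES.get(tag, tag)


def implied_tags(tag: str) -> list:
    """Return the canonical tag plus every parent it implies, nearest first."""
    parts = canonical(tag).split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
```

With this shape, a post tagged with the short alias gore still shows up under a filter on nsfw, because the alias resolves to nsfw.gore and that name implies its parent.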

You may have to maintain a separate search index table in order to pull this off. I have some prototype Postgres/Python code that implements such a system for an intended-to-be-federated thing that ended up going nowhere due to concerns relating to federated tag moderation.

A flexible tagging system of this nature could help with all sorts of problems, in particular: Instead of having tiny communities for extremely niche stuff, communities could be much broader in topic, and individual posts can be filtered down via tags.

This would also make Lemmy (perhaps with a purpose-built UI) a viable base to build a federated alternative to art focused platforms such as Pixiv or various "*boorus"

- Instance Tags: `https://example.org/t/tag`
- Community Tags: `https://example.org/c/instance/t/tag`

The tag URL can then be utilized as a unique identifier for the tag, populating the `id` field of the tag object. Using the object ID, optional tag federation can also be achieved, allowing communities across multiple instances to share content via tags (example: News tag shared across instances). This would also solve the issue of splintered communities across instances while not forcing it on the communities in question. For now tags will only be applicable to posts; however, the general design allows for them to be attached to any kind of object later on, be that instance, community or user. The limited initial scope allows for easier modifications should any rough edges or missing features be discovered.
wont-work commented Nov 22, 2023

The tag URL can then be utilized as a unique identifier for the tag

...and you just gave up the ability to rename tags without having to individually un-tag and re-tag all posts. The ID of a tag should be totally opaque, either some kind of UUID or an autoincrement number like current post/comment IDs.

Contributor Author

I mixed up the lingo in the RFC; if you check further down, I mention tag IDs. Changed the language anyway, because what is written in your excerpt clearly contradicts what I wanted to write and what I wrote a few paragraphs below it.

- Generic Tag

Theoretically NSFW could be implemented using a preset "Content Warning" tag, but separating out this tag allows instances to better filter it out for moderation purposes (for example if no admin/moderator is willing to moderate NSFW content).
Both "NSFW" and "Content Warning" tags should blur the post body by default. Additionally, the `sensitive` post flag should be set to `true` should either of these types be present in the post tags, to ensure correct handling on other fediverse platforms.
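
The rule in the excerpt above can be sketched as a small predicate. The `TagKind` enum and its member names are assumptions for this example; the excerpt only names "NSFW", "Content Warning" and "Generic Tag" as types.

```python
from enum import Enum


class TagKind(Enum):
    """Hypothetical tag types, named after the ones listed in the excerpt."""
    NSFW = "nsfw"
    CONTENT_WARNING = "content_warning"
    GENERIC = "generic"


# Tag types that blur the post body and mark the post as sensitive.
BLURRING_KINDS = {TagKind.NSFW, TagKind.CONTENT_WARNING}


def is_sensitive(kinds: list) -> bool:
    """True if at least one tag on the post is NSFW or a content warning."""
    return any(kind in BLURRING_KINDS for kind in kinds)
```

Deriving the flag from the tags, rather than storing it separately, is one way to keep the two from drifting apart; whether Lemmy would do it this way is not stated in the thread.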
wont-work commented Nov 22, 2023

This could be made slightly more flexible by having "tag categories", which could themselves then be marked as sensitive or "should be behind a CW". You could also move styling and sorting data into the category object to have tags that are colored consistently and sorted in a predictable way.

Of course, not sure how necessary this separation would be, just throwing an idea out there.


# Unresolved questions

- How would tags federate/display on other Fediverse services such as Mastodon?

While it'd definitely be useful to federate tags as hashtags and such, I don't think Lemmy should restrict itself to the limitations of hashtags. Instead, consider creating an AP extension (or adopting an already existing extension if one already exists) that:

  • Allows editing of the tags themselves (renaming or otherwise)
  • Has the ability to adjust tags across instance boundaries (potentially via Accept/ Reject/ so implementations can create some sort of "tag suggestion" UI?)
  • That does not have anything to do with communities (remember: not all fediverse software need that concept!)
    • The concept of "instance-level tags" and "community-level tags" should be a concept enforced by the Lemmy backend and not ActivityPub itself

This would make it so that any other fediverse projects that want a similar level of flexibility in tags can just adopt Lemmy's extension and get interoperability "for free", even if they're built for different audiences.

Perhaps some tags could be marked as "hashtaggable", but I don't think all tags should federate as hashtags. A tag such as Spoiler does not need a hashtag (in fact, a CW would be more appropriate!), not to mention that the more hashtags you add, the more Update activities you need to broadcast per tag change.

Member

It may be worth writing an FEP about this.

I saw several discussions around content labeling, and the Web Annotation Protocol is often brought up. It might be a good foundation for such an FEP. No need to adopt the protocol as is, but I think its vocabulary can be useful.


mattaylor commented Feb 9, 2024

+1 for hierarchical tags, with user-friendly aliases.
Hierarchical tags can be efficiently implemented in postgres using ltree types. This also opens the opportunity to moderate both the creation of child tags and how the tags may be applied, according to a policy associated with the parent tag, with two boolean flags:

  1. restricted => Can the tag be used by anyone, or only by the tag moderators?
  2. extensible => Can child tags be created by anyone, or just by the tag moderators?

Root level tags could then be limited to
A) community hashtags, which would be managed by community moderators (or left open to all)
(e.g. group.politics.usa)
B) instance content moderation flags managed by instance owners (e.g. flag.nsfw, flag.spoiler, flag.disputed, flag.offensive).
C) emojis or reaction tags that can be federated across instances (e.g. emoji.wow, emoji.love, emoji.sad, etc.).
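
One possible reading of the two flags, sketched in plain Python rather than ltree. Everything here is an assumption for illustration: the `POLICIES` table, the rule that a tag inherits the policy of its nearest ancestor, and the defaults used when no policy is found.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TagPolicy:
    restricted: bool   # True: only tag moderators may apply the tag
    extensible: bool   # True: anyone may create child tags


# Hypothetical policies attached to a few parent tags.
POLICIES = {
    "flag": TagPolicy(restricted=True, extensible=False),
    "group.politics": TagPolicy(restricted=False, extensible=True),
}


def nearest_policy(path: str) -> Optional[TagPolicy]:
    """Walk from the tag up to its root and return the first policy found."""
    parts = path.split(".")
    for i in range(len(parts), 0, -1):
        policy = POLICIES.get(".".join(parts[:i]))
        if policy is not None:
            return policy
    return None


def can_apply(path: str, is_moderator: bool) -> bool:
    """Assumed default: tags without any policy may be applied by anyone."""
    policy = nearest_policy(path)
    return is_moderator or policy is None or not policy.restricted


def can_create_child(parent: str, is_moderator: bool) -> bool:
    """Assumed default: without a policy, only moderators may add children."""
    policy = nearest_policy(parent)
    return is_moderator or (policy is not None and policy.extensible)
```

Under these assumptions a regular user can apply group.politics.usa and create further children below group.politics, but can neither apply flag.nsfw nor add new tags under flag.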


wont-work commented Feb 10, 2024

Hierarchical tags can be efficiently implemented in postgres using ltree types.

well that would've been good to know months ago :p

C) emojis or reaction tags that can be federated across instances (e.g. emoji.wow, emoji.love, emoji.sad, etc.).

-1 to this. if emoji reactions are wanted by upstream (i like them personally) they should be implemented the standard way the rest of the fedi (Pleroma, Akkoma, Misskey, Chuckya and soon-to-be Glitch) do, via the EmojiReact activity (with incoming Like-with-content support to handle Misskey reactions as well)

this isn't relevant to tags but let's use existing conventions here. otherwise we get things like lemmy's custom emoji handling which i'm considering completely broken and unusable (inline markdown, seriously?)

Collaborator

phiresky left a comment

Looking at this again, there's a lot of unclarity still and a few things that aren't possible. Here's a set of changes that clarify those things and align with how AP works.

Neshura87 and others added 11 commits July 15, 2024 15:50
Thanks to phiresky for providing the updated language

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
…ag Object structure


Wording by phiresky

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
…cture


Wording by phiresky

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
Wording by phiresky

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
Wording by phiresky

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
Changes proposed by phiresky

Co-authored-by: phiresky <phireskyde+gh@gmail.com>
Neshura87 and others added 2 commits July 16, 2024 13:19
Member

dessalines left a comment

Language looks fine, and I like these being on a per-community basis, therefore created by community mods and specific to communities.

Two things:

  • Check your # headers, sometimes you're skipping one.
  • Install prettier on your system, and run prettier -w THIS_FILE , to format it. If you don't know how to do this, lmk and I can do it for you.

Collaborator

phiresky left a comment

LGTM!

Member

dessalines left a comment

Looks good, thx!

dessalines merged commit 539ec87 into LemmyNet:main Jul 21, 2024
@ThisIsMissEm

Hi hi, I know this is already merged and work is underway, I just wanted to mention this AP issue which might overlap, after @silverpill raised my attention to this effort:

w3c/activitystreams#583

@phiresky
Collaborator

fyi, just because this was merged does not mean anyone is working on it, it only means if someone works on it this is what they should look at. Afaik no one is working on it currently.

Contributor Author

Neshura87 commented Jul 24, 2024

fyi, just because this was merged does not mean anyone is working on it, it only means if someone works on it this is what they should look at. Afaik no one is working on it currently.

I'll probably take up the work once my free time permits it. I'm certainly not the best programmer but with a written down spec I'm willing to bash my head against it until it passes review. On the backend at least.

@phiresky
Collaborator

Neat, put updates somewhere publicly (early draft PR etc) so there's no double work with someone else

Collaborator

phiresky commented Aug 5, 2024

hello @Neshura87 , please let me know whether you have started anything in this regard, since I might start implementing this in the next few days

@Neshura87
Contributor Author

Hello @phiresky, no, I haven't started on this yet, and given my current outlook it's going to be ~2 weeks before I have enough room in my schedule to start work on this. Do feel free to implement it.
