Proposing updates based on community feedback #91

krgovind · 2022-07-27T16:03:27Z

No description provided.

Proposing changes to First-Party Sets based on community feedback.

dmarti

Positive changes here, but I'm concerned about a possibly subjective and time-consuming review process for participants in public reviews.

dmarti · 2022-07-27T20:55:22Z

README.md

+    </tr>
+    <tr>
+      <td>service</td>
+      <td>Reserved for utility or sandbox domains.<br>


This is likely to get tested by people setting up a domain to look like a service domain, and then switching it to do something different. Are there any other limitations on a "utility or sandbox" domain that would keep it functional for those purposes, but would create less temptation to switch it to something else? (For example, a service domain could be barred from loading third-party resources from a domain that is not a member of its set.)

Enforcement is going to be more of a challenge for this one than for the first two, since the first two are much easier to enforce automatically.

Is a utility or sandbox domain always a domain that is not intended to appear in search results? Another possible rule here would be to require that service domains block general-interest web search engines in their robots.txt.

Yes, I think that this is a great approach to service/utility/sandbox domains, where we can make a few strong assumptions along the lines of "this is not a self-sufficient site that users would discover and interact with, except when integrating with other sites in the set", then build hard technical constraints that match these, such as requiring that crawling be blocked.

This is something we should work with developers to figure out, to understand potential use cases of "service" domains and which constraints would break valid use cases. I think this is partially tracked in #95 and we might have time to discuss it on the WICG call.
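For illustration, here is a minimal sketch of how the robots.txt rule suggested above might be checked automatically. It uses Python's standard `urllib.robotparser`. The function name `blocks_search_crawlers` and the list of crawler tokens are assumptions made for this example, not anything the proposal defines, and the check only looks at the site root.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler user-agent tokens (an assumption for this sketch);
# the actual list would have to be agreed on as part of the policy.
SEARCH_CRAWLERS = ("Googlebot", "bingbot", "DuckDuckBot")


def blocks_search_crawlers(robots_txt: str, crawlers=SEARCH_CRAWLERS) -> bool:
    """Return True if robots.txt disallows the site root for every listed crawler.

    Only the root path "/" is tested, so a file that blocks "/" but
    explicitly allows deeper paths would still pass this check.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() returns True when the crawler is allowed to fetch the path.
    return not any(parser.can_fetch(agent, "/") for agent in crawlers)
```

A check like this could run when a set is submitted and again periodically, which would speak to the concern about a domain being switched to a different purpose after approval.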

dmarti · 2022-07-27T21:09:36Z

README.md

+    <tr>
+      <td>associated</td>
+      <td>Reserved for domains whose affiliation with the set primary is clearly presented to users (e.g., an About page, header or footer, shared branding or logo, or similar forms).</td>
+      <td>Limit of 3* domains. If greater than 3, auto-reject access.<br>


An additional rule might make this practical: a pull request that adds an "associated" set must include user research to substantiate that the set is clearly presented to users.

- Research should cover a representative sample of the audience for the sites that are members of the proposed set.
- Research should document the understanding on the part of the users of the relationship among site members.
- Results should be statistically significant.
- Researchers should be available to answer questions on the PR.

Thanks for the suggestion, Don! A few personal thoughts from my side:

I agree with your concern about long discussions/pile-ons with subjective opinions of what constitutes "clearly presented" and we should try to avoid that.

However, I feel like asking for dedicated user research adds significant cost to submission. It's a barrier to smaller sites and sites from non-English speaking countries, and even mid-sized organizations may struggle to coordinate this kind of project.

Conversely, if an attacker falsely claims two domains have a clear affiliation, faking or manipulating user research in order to pass doesn't seem unthinkable. So we're setting up the enforcement entity/community for the challenge of disproving the presented results, which they would always strive to do if the association is not subjectively clear to them. As such, "clear presentation" seems like it would end up being the main criterion either way?

In the end, I'm not sure that user research would actually sway most people's subjective opinion about whether two domains are clearly affiliated.

Given that, it seems a bit excessive to ask of developers by default, though I think it's interesting to consider as an optional step.

A problem with having the public reviewers directly review the set is that they're not necessarily going to get the same experience as the real users who are using the set member sites. A set could make some pages "clearly presented" (home page, privacy policy) for public review, and then promote other pages that are not clearly presented or easily findable. (For example, a clinic site and an online pharmacy could make their home pages clearly common set members, then use social media ads to promote "independent doctor recommendations" and "product info" pages that are designed to appear independent.)

FPS is an exchange of value for value -- the set members get additional information from users, and the users get additional checks on the practices and reputations of the set members. Set members are asking for something from users, so should offer something in exchange -- research that reflects people's real experience with the set members. And the research does not have to be a separate project -- it can be an expansion of the user studies that set members are already doing.

There is a risk that a user researcher could present a completely bogus study, but each researcher would only be able to get away with that once. New and unknown researchers would receive additional attention.

I think #101 is somewhat picking up on this topic, so maybe we can continue to discuss over there.

dmarti · 2022-07-27T21:13:12Z

README.md

+
+Additionally, there are other enforcement strategies we could consider to further mitigate abuse. If there is a report regarding a domain specified under the "service" subset, potential reactive enforcement measures could be taken to validate that the domain in question is indeed a "service" subset.
+
+For some subsets, like the "associated" subset, objective enforcement may be much more difficult and complex. In these situations, the browser's handling policy, such as a limit of three domains, should limit the scope of potential abuse. Additionally, we think that site authors will be beholden to the subset definition and avoid intentional miscategorization as their submissions would be entirely public and constitute an assertion


One likely risk here is that the repository gets filled with long discussions of people's subjective evaluation of whether a common context or branding is "clearly presented." Limiting the number of domains would reduce the payoff for winning one argument about whether a set is clear to the user, but could increase the number of such arguments.

Graphic design conventions, and of course text about a site's set membership, can also be different across languages and cultures. A rule based on posting user research to the repository (see my comment on line 252) could help to limit the number and length of arguments about "clear" sets.

As mentioned, if the primary idea for user research is to discourage discussion, maybe it makes more sense specifically for organizations that expect such discussion to occur to prepare such research, and not for smaller sites or sites that are confident that their use case is clear enough that it won't come under public scrutiny.

dmarti · 2022-07-27T21:18:41Z

README.md

+
+### Abuse mitigation measures
+
+We consider using a public submission process (like a GitHub repository) to be a valuable approach because it facilitates our goal to keep all set submissions public and submitters accountable to users, civil society, and the broader web ecosystem. For example, a mechanism to report potentially invalid sets may be provisioned. We expect public accountability to be a significant deterrent for intentionally miscategorized subsets.


Public accountability will be valuable as long as the number of sets requiring review is manageable and the review work is seen as meaningful. It would be useful to limit the number of proposed sets requiring human attention, especially for the "associated" type, by requiring associated research and through setting sensible waiting periods for re-submitting the same domain in a new set after a rejection of a previous one.

It is always important to consider the incentives for participating in public accountability. People will be less likely to participate if their efforts are seen as duplicating a task that could have been automated, or as joining an open-ended argument about whether a set is valid. There is likely to be more and higher-quality participation if it is presented as evaluating the existing user research about a set than if it is just an opportunity to express opinions.

There will also need to be appropriately designed and enforced anti-abuse and anti-harassment measures for public review participants. (Nobody wants to get SWATted because their comment kept some scammer's bogus set from getting in.)

I think this is an important point. Besides your point about user research evaluation, we should think about how to ensure that all humans involved in the public process are appropriately protected and feel like their time is well spent. I can file an issue to that effect!

dmarti · 2022-07-27T21:25:36Z

README.md

+      <td>Reserved for domains whose affiliation with the set primary is clearly presented to users (e.g., an About page, header or footer, shared branding or logo, or similar forms).</td>
+      <td>Limit of 3* domains. If greater than 3, auto-reject access.<br>
+<br>
+<em>*[^1]exact number TBD</em></a></td>


Possibly more effective than a hard limit would be setting the delay for re-submitting a domain from a rejected set into a new set based on the number of members of the rejected set. (Forming a larger set would mean taking a greater risk of having the member domains kept out of future sets for longer, which would create more incentive to do research that is likely to hold up.)
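To make the suggestion concrete, here is a minimal sketch of a re-submission delay that grows with the size of the rejected set. The 30-day base, the linear scaling, and the function names are all assumptions for illustration. The comment above does not specify a formula.

```python
from datetime import date, timedelta

# Hypothetical base delay per member of the rejected set (an assumption).
BASE_DELAY_DAYS = 30


def resubmission_delay(member_count: int, base_days: int = BASE_DELAY_DAYS) -> timedelta:
    """Delay before a domain from a rejected set may join a new submission.

    Scales linearly with the number of members in the rejected set,
    so larger rejected sets keep their domains out for longer.
    """
    if member_count < 1:
        raise ValueError("a set must have at least one member")
    return timedelta(days=base_days * member_count)


def earliest_resubmission(rejected_on: date, member_count: int) -> date:
    """First date on which a domain from the rejected set could be re-submitted."""
    return rejected_on + resubmission_delay(member_count)
```

Under these assumed numbers, a rejected three-member set would keep its domains out of new submissions for 90 days, while a single-member rejection would cost 30 days.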

Sorry, catching up to this a bit late. I think this suggestion adds unnecessarily high stakes to the review process. Just because of the set structure (which is a technical detail), a rejection/removal could have catastrophic consequences for "legitimate" set owners. It seems very arbitrary and not like a great submission experience.

Thanks for the review, Don! The reasoning for the hard limit is based on previous discussions that suggest it is harder to ensure user understanding of associated sites for large sets, and that it is difficult to run technical checks or other enforcement at scale to ensure user understanding. So the numeric limit is aimed at reducing the scope of abuse. Enforcing a delay for re-submitting a domain from a rejected set into a new set doesn't address the user understanding issue.

I'm planning to merge this PR, but we can continue the discussion on #93.

Thank you--I also opened #101 to cover an example of a possible method of abuse that could be done with a small set.
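As a rough illustration of the hard limit on the "associated" subset discussed in this thread, the check could look something like the sketch below. The limit of 3 comes from the README excerpt, which marks the exact number as TBD. The function name is hypothetical, and treating an over-limit subset as rejected in full (rather than only the excess domains) is an assumption, since the excerpt does not say which is intended.

```python
from typing import Sequence

# From the README excerpt; the exact number is marked TBD there.
ASSOCIATED_LIMIT = 3


def associated_access_allowed(
    associated_domains: Sequence[str], limit: int = ASSOCIATED_LIMIT
) -> bool:
    """Return False (auto-reject) when the associated subset exceeds the limit.

    Domains are compared case-insensitively and duplicates count once.
    Assumption: exceeding the limit rejects the whole subset.
    """
    unique = {domain.strip().lower() for domain in associated_domains}
    return len(unique) <= limit
```

Because the check is purely numeric, it can be enforced automatically, unlike the "clearly presented" criterion.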

johannhof

Given that we're now mostly discussing the how of the new proposal and related details, I think it's reasonable to merge this PR to reflect the most recent state of FPS in the README.

We can update the README later as we refine things further.

krgovind added 3 commits July 27, 2022 11:51

Updating proposal with significant changes

6b66f14

Proposing changes to First-Party Sets based on community feedback.

Update TOC

3f673e0

Archive UA Policy Proposal

01825b8

krgovind mentioned this pull request Jul 27, 2022

Proposing changes to First-Party Sets based on community feedback #92

Closed

dmarti reviewed Jul 27, 2022


johannhof approved these changes Aug 22, 2022


johannhof mentioned this pull request Aug 22, 2022

Empower and protect people that are involved in the set submission / review process #105

Open

krgovind merged commit b9bef14 into WICG:main Aug 22, 2022


