Convert "First Party Sets" to "GDPR Validated Sets" #86
- aligns to GDPR - introduce user consent - introduce two flexible methods to validate ownership - provide evidence to sanction bad actors should harm occur
We wish to allow user agent data sharing to span multiple domains and origins only when the following conditions are met;

- where the legal ownership of the registerable domain can be established by the user;
Ownership is difficult for users and browser maintainers to check, especially when the laws and/or language of the site owner are different from those of the user or browser maintainer. Previous discussion at:
- Replace owner/controller language with simpler language on controller (and define it) #56 (comment)
- https://github.com/privacycg/meetings/blob/main/2021/telcons/08-12-21-FPS-adhoc-minutes.md
among other places.
Above comment also applies to later places in this PR where ownership verification is assumed to be part of the process. In the investigative journalism and forensic accounting fields, verification of ownership is generally considered a hard problem (example: https://www.transparency.org/en/publications/beneficial-ownership-how-to-find-the-real-owners-of-secret-companies )
Indeed. This amendment uses notaries and EV SSL certificates to establish proof. Notaries have not to the best of my knowledge been considered by prior work. As better solutions come forward these can be added to. However we should not constrain innovation by seeking to find the perfect solution to this problem.
Can you add material to the UA Policy Proposal page to explain how the IEE would handle this? It's not clear to me how this proof would work.
The current version of UA Policy Proposal includes the TLS Certificate approach under "Alternatives Considered and Discarded." It would be helpful to get a better sense of the capabilities required of the IEE in this version.
The UA Policy Proposal document has not yet been updated to align to the revised GDPR Validated Sets README. The latest commit renames these additional documents, adding to the opening title in each a note explaining that they are yet to be revised. If it would be clearer to delete them temporarily, let me know.
There is no IEE in this proposal, as an IEE would be a central point of authority and the proposal would likely suffer the same issues as IAB EU / TCF2.
  - `mirror.co.uk`, `thesun.co.uk`, and `telegraph.co.uk`
- Participants in a scheme or network.
  - `visa.com`, `visa.co.uk`, `barclaycard.co.uk`, and `nationwide.co.uk`
- Decentralized networks implemented by multiple legal entities.
This is a very general description that could apply to the entire WWW. Possible to clarify?
That is a benefit of web primitives that provide a lot of utility.
Of course SWAN.community is a specific example I have in mind. Any other network that requires information sharing among joint data controllers or processors that is lawful would be equally valid.
If all lawful data sharing is in scope, evaluating set validity could be arbitrarily time-consuming for the IEE. Needs a better definition here.
Why wouldn't all lawful data sharing be in scope of a technical specification produced under forums of the W3C?
I think IEE means Independent Enforcement Entity. Correct? If so then there is no IEE required or needed in this proposal. Existing justice mechanisms for fraudulent claims of ownership are supported. As noted in the proposal this improves the current situation but does not remove the possibility of there ever being bad actors. Whilst there is free will in the world this is an unrealistic goal and not within the scope of the Privacy CG.
- browsers to understand the relationships between data controllers and domains such that they can effectively present that information to the user.
- browsers to understand the consent choices of users to better capture and respect their preferences.
- existing justice systems to be used to identify and sanction bad actors under laws.
Need to understand cross-jurisdiction issues here -- what happens when the user is not in the same jurisdiction as any set member?
That would be covered in the specific text of the use policy. As such it's not within the scope of the proposal. However, as people will be asked to consent in a neutral fashion to a use policy, well-written use policies are likely to become popular and be endorsed by authorities trusted by users. Enabling users to find out more about a particular authority's position on a use policy would be a sensible enhancement, i.e. user X wants to know what consumer rights organisation Y thinks of use policy Z. The point is the user agent and W3C standards are enabling this, not restricting it.
…the out of scope section to make it clear the proposal does not attempt to address the problem of proof of ownership completely, but does improve the current situation, and also explain how the proposal would over time incorporate other legal frameworks.
… been updated as part of the proposal at this stage. Once the changes to the README have been reviewed then these documents can be similarly amended.
@jwrosewell Has the CMA stated that First Party Sets is not acceptable to them? Have they notified Google in a public document about First Party Sets specifically? Has the CMA addressed First Party Sets in any of their comments? Has the CMA stated that they believe that First Party Sets will fragment the web and that they will not allow Google to implement it? I am very concerned with this claim that there are specific changes that need to occur because the CMA will require specific behavior or changes when I do not see that claim from the CMA.
So judging by your changes and this statement are you proposing that this technology only be usable in states with GDPR?
This would seem to entirely negate any theoretical privacy promise that First Party Sets could preserve? It's pretty core to the First Party Sets proposal as I understand it that sets be limited so that any given domain is only in one set. Without that guarantee we would essentially create a situation where any site could add another site to their Set and unlock whatever privileges that browser theoretically would allow them. This would almost certainly have downstream effects of ad tech providers and other tracking mechanisms exerting pressure on sites to be added to their Set. The biggest companies could exert the largest amount of pressure in that case and therefore be the most likely to be adopted into a Set. Wouldn't that make this proposed change... more anti-competitive then? I only mention this question because you are specifically citing the CMA in making this change, but your analysis states a primary concern is:
and a proposal which allows a party to be in multiple Sets would surely do that, as Google would come in with the largest leverage to have their domains included in Sets, right?
Re: CMA. MOW have provided a summary of the commitments and how they relate to each Google PS proposal at a high level. I do not believe that First Party Sets (FPS) as presented will meet those commitments. However, it is for the CMA to provide a decision. Rather than waste people's time discussing FPS, I'm suggesting that Google ask the CMA explicitly before advancing the proposal. That will be quick and simple for them to do. I make the same request of Google in all Privacy Sandbox (PS) proposals. To try and advance the concepts in the meantime I have proposed GDPR Validated Sets (GVS), which achieves the same benefits for sole data controllers (FPS owners) AND joint data controllers without requiring a central Independent Enforcement Entity (IEE). Should GVS advance, the need for many other PS proposals and associated complexity would likely reduce, providing certainty to the digital market and people.

Re: GDPR. Once there is a solution that Google, the CMA and others agree would meet GDPR requirements, that solution could be adapted for other rules. Google have committed to achieving GDPR level privacy globally, so at least three entities see GDPR as the right starting point.

Re: Domains can appear in multiple Sets. The information is held within the Set. A domain that was in multiple Sets could read and write from those multiple Sets. The use purpose under which they do so will be defined for each Set. If they break that purpose and process people's data in a way that they have not been authorized for, then they can be sanctioned under existing laws.

Re: "exerting pressure on sites to be added to their Set" & "Google would come in with the largest leverage to have their domains included in Sets, right?". That is a market issue that could result from the choices GVS makes available to participants on the web, should Google (or any other dominant party) adopt such a strategy. It is not our role to address such market issues or such bad acts and actors.

We do have to be mindful of them and ensure we do not create unintended market issues. I'm glad you raise the impact on competition as it is a subject we need to openly debate at the W3C. Your example is exactly what is happening with the myriad of changes and proposals being advanced by user agent vendors that mandate others "do as they say" without any option to take a different route. It is exactly what is happening when data asymmetries exist in a market. I look to the CMA and others to intervene in such matters and not the W3C, IAB TL, IETF, or other standards setting bodies. We just need to be mindful to avoid hosting anti-competitive proposals or "tilting the scales". Ultimately the GVS amendment provides much more information to people, their agents, and society than the status quo without limiting competition.
@jwrosewell You mention that your proposed changes to First Party Sets would not require an Independent Enforcement Entity (IEE), but it is still not clear from this PR how the IEE's responsibilities would be handled. Changing FPS to assert that the IEE is not needed without explaining who would take on the tasks that are handled by the IEE in the original version makes this PR incomplete. Is any web user supposed to be able to check a notarized document from any site? It seems like a lot of knowledge would be required to browse a site outside your own jurisdiction. (Personally, I just got a document notarized in California, but I would not know how to check for the validity of a notary seal from Nevada or Oregon.)
I would normally agree with you and I wouldn't have brought it up here but considering the metric by which you are identifying FPS as a proposal applicable for censure by the CMA and the CMA's goal is to deal with Google potentially being anti-competitive shouldn't your proposal deal with the ways it might make Google uniquely and specifically anti-competitive? That's the whole reason we're having this discussion right? I'm not a lawyer so I am unclear on how the CMA applies its logic though you seem to be stating that you are so I would be interested in the alterations for this proposal being more specific in stating reasons why the proposal (which is being made at least somewhat on the basis that the CMA would find FPS anti-competitive right?) specifically is less anti-competitive? Because if you are claiming:
And the CMA has made no specific statements about this specific proposal (right?), then the only basis on which you are steering this proposal is specifically to make it less anti-competitive? I mean, if--as it looks to me--your changes would make it more anti-competitive then I would assume it would not accomplish your stated goal of "steer[ing] the proposal and discussion towards a solution that the CMA will find acceptable".
I generally agree to this in some respects (I think making standards that are not easily exploitable by bad actors is in our role) and so I'm not so interested in a general policy of speaking about how proposals are in regard to market competition, I think that's a different conversation entirely for a different place. But since you have stated that this specific issue is what you are seeking specifically to address, I think your PR needs additional text, either here or in the body of the proposal, explaining how it specifically addresses that, which is unclear to me at this time. Sorry for this longwinded response but wanted to make my position very clear here, but the TL;DR is: If your PR is intended to accomplish this:
And the basis on which they theoretically wouldn't find it acceptable is that it is anti-competitive (right?); then I do not think the PR has sufficiently met its self-stated goals.
I don't disagree with that but then why remove the notice process? Surely a notice process is in line with GDPR and without it we are limiting the applicability of this proposal because we are acknowledging that
Those are big issues and limiting our ability to deal with them to delegating those processes on to GDPR would mean that the solution is unfit to be rolled out anywhere else, right?
This seems unlikely to work as a solution. As we've already seen with numerous reported cases of consent fraud, parties regularly do not comply with their contractual process for user data, and the nature of the blackbox system in which these actions occur makes it very, very difficult to catch them in the act.
The proposed changes would provide people or their agents information about ownership and agreed use purposes if they wished to inspect them in a consistent manner. The role of the IEE in identifying and eliminating bad acts and actors would be fulfilled by the community of users and their agents. The proposal is therefore decentralized and has a low cost to operate and police.
FPS favours large sole data controllers that can operate many services themselves. CMA want a level playing field based on GDPR. Google agreed to this. GDPR includes the concept of joint data controllers. GVS expands FPS to support joint data controllers so that smaller organisations that must band together to compete with larger organisations can achieve the same benefits. GVS also provides people more control by enabling them to consent to use purposes once rather than individual B2B organisations. In the next iteration of the document I’ll make this clearer in the body as @AramZS suggests. Thanks.
No. First Party Sets are covered. See para 32 here.
This proposal enables consent to a use purpose (a contract that can be adhered to by multiple parties) which forms the notice under GDPR. It does not remove the notice. It is true that the proposal takes a different approach to notice and consent and in doing so addresses the criticisms of current models for consent. Just like Open Source licences industry will likely gravitate towards a small number of use purpose notices that will become best practice for specific purposes.
Whilst there is free will in the world there will always be bad actors and bad acts. GVS adopts an approach to addressing the potential bad acts by improving information to enable existing law enforcement mechanisms to sanction bad actors when they occur. It does so without requiring a central IEE and all the problems that such a central operation entails.
@jwrosewell Thank you for the answer. So I'm not clear where "validated" comes in in the proposal title. Sites can make claims about compliance with a particular law. It doesn't seem like the browser would need to behave differently just because a site, or group of sites, had made some unverified claim. (The original First-Party Sets is a trade: the user gives sites permission to use additional browser functionality in exchange for the enforcement services of the IEE. It can be a good deal for the user if the impact of the extra browser functionality is balanced with the value of the protection that the IEE provides.)
@dmarti Validated refers to the ownership of the domains, established either via a) the Extended Validation SSL certificate process; or b) notaries. Once there is a method of establishing ownership proof AND consented use purpose agreements, a validated set can be formed without requiring a centralized enforcement authority (aka Independent Enforcement Entity). Thus browsers would enable data sharing among the domains that formed the set and be able to present proof to users concerning that data sharing. The major difference in this regard is the removal of the single IEE to create a proposal that is truly decentralized and utilizes existing established processes for validating ownership and transparency concerning data use.
@jwrosewell Who is actually validating, though? I understand that this proposal allows for domains to make claims about privacy policies and other issues, but from the user point of view they might be claims about business relationships I don't understand in a language I can't read. (I can recognize a few kinds of notarized legal documents in the jurisdiction I live in, but if a document comes from some places I couldn't tell if it's a GDPR policy, the web developer's high school diploma, or a Chuck E. Cheese gift certificate.) Unless every user becomes an expert in every kind of notarized document everywhere, someone has to validate them.
EV certificate metadata may not be sufficient to distinguish similarly-named-but-different entities. In order to take advantage of this proposal, EV certificates would likely be required to include additional data. See https://arstechnica.com/information-technology/2017/12/nope-this-isnt-the-https-validated-stripe-website-you-think-it-is/ for more information on this threat model.
@eligrey Thank you. I linked to the criticisms of EV in the PR as well as the process for EV. As a minimum it should help the group understand how complex solving data processor / controller identity is. If the EV process didn't get it 100% right after all this time, then it is probably unrealistic to expect that a new IEE is going to get it 100% right, and in any case it would duplicate a lot of work that has already been done and is in place. My intention is to enable a path forward which recognizes that nothing is 100% perfect but is a lot better than what we have today. In parallel, improvements to EV could be made.

@dmarti As we have no validation of domain ownership today beyond branding and the SSL certificate, providing notarized proof is an improvement. It is not perfect, just like EV, or indeed an IEE. Like many aspects of the web (fake news, phishing scams, etc.) we can work as a community to identify and eliminate bad actors. By using only documents produced by registered notaries the user can be certain a qualified professional verified the ownership. Identifying registered notaries is a simpler problem than identifying legal owners of all web sites. If cryptographic proof of the notary's involvement is desirable then that might be a modification to add in due course. However, the first thing is to establish that all solutions are imperfect and determine the concepts that can be taken forward. I require decentralization (no single IEE) in any solution, and don't want to "reinvent the wheel" when there are at least two solutions that are acceptable in other fields.
I agree with @eligrey on this point. Is the proposal that a single EV certificate would be used across all sites in a GDPR Validated Set? If so, this seems to widen the threat model rather than narrow it. From my experience in operating one of the largest PKIs (the one for cable modems), one of the greatest challenges in operating any PKI is the governance of it: who decides the process for validating the participants (deciding who gets to participate and who doesn't) and who determines when certificates need to be revoked (e.g., compromised certificates or certificate authorities, or entities falling out of compliance with the processes and procedures). Without proper governance, certificates might as well just be self-signed. With cable modems the scope was much more clearly defined, which made it easier, but it was still quite a challenge. In the context of the web it is much, much more challenging.
The name has been changed from First Party Sets to GDPR Validated Sets.
In summary this PR revises the proposal to align to GDPR, introduce user consent, introduce two flexible methods to validate ownership, and provide evidence to sanction bad actors should harm occur.
This PR expands the Site Groups concepts first proposed by Paul Bannister in issue #22 and this proposer in #23.
Changes in this PR include.
Reviewers Note
Google employees will already be familiar with the 4th February 2022 agreement with the CMA, the training requirements in Annex 3, and the CMA note paragraph 4.119, “Google has committed to instruct its staff and agents not to make claims to other market players that contradict the commitments, and to provide training to its relevant staff and agents to ensure that they are aware of the requirements of the Final Commitments”.
Other reviewers are not bound by these commitments but might wish to familiarise themselves with the CMA’s role in approving changes to Google’s web browser. This PR and proposal are intended to steer the proposal and discussion towards a solution that the CMA will find acceptable whilst retaining all the features of the original First Party Sets proposal. Paragraph 21 of the commitments states “During the standstill period, the CMA may notify Google that competition law concerns remain such that the Purpose of the Commitments will not be achieved. Google will work with the CMA without delay to seek to resolve concerns raised and address comments made by the CMA with a view to achieving the Purpose of the Commitments. Google will inform the CMA of how it has responded to those comments”. First Party Sets are explicitly covered by the commitments. Spending time discussing or developing the original First Party Sets proposal will either a) fragment the web as Google will not be allowed by the CMA to implement the proposal; or b) merely waste effort and time delaying the realisation of the benefits envisaged.
As the proposer is not a party to the agreement between Google and the CMA, and the CMA are not represented in the Privacy CG, Google may wish to validate the direction of this proposal with the CMA and ICO prior to commencing review within Privacy CG.
For those wishing to better understand the commitments an explanation has been produced by Movement for an Open Web (MOW).
For those that have followed the discussion in PATCG concerning the role of consent in proposals MOW has similarly produced an explanation of consent in light of the commitments.
Explanatory Note
Google and the UK CMA agreed on 4th February 2022 to align Alphabet-originated proposals and changes, worldwide, exclusively to GDPR. It is probable other W3C and Privacy CG members support this direction.
GDPR has no concept of first and third party or domain names. This PR commences the process of aligning this proposal to GDPR by establishing an upgraded privacy boundary for data sharing within the user agent based on GDPR. Proposers of First Party Sets already understand the need for such a change to the privacy boundary and that is not repeated in this PR note.
Legal Ownership
A prerequisite for such a solution is the need to establish the legal ownership of domains. The proposal now incorporates SSL EV certificates and the use of notaries to verify ownership.
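As a rough illustration only, the ownership prerequisite above could be checked per member as sketched below. The `SetMember` type, the `proof_kind` identifiers, and the `.example` domains are hypothetical: the proposal names EV certificates and notaries as the two methods but does not define a data format for them, and this sketch does not verify the proofs themselves.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the two ownership methods named in the
# proposal (EV certificates and notaries). These names are assumptions.
ACCEPTED_PROOF_KINDS = frozenset({"ev_certificate", "notary"})


@dataclass(frozen=True)
class SetMember:
    domain: str       # domain that wants to join the set
    legal_owner: str  # owner name asserted by the proof document
    proof_kind: str   # how ownership was established


def members_missing_proof(members):
    """Return the domains whose proof is not one of the accepted kinds."""
    return [m.domain for m in members if m.proof_kind not in ACCEPTED_PROOF_KINDS]


# Placeholder members; a joint-controller set may have different owners.
members = [
    SetMember("shop.example", "Example Owner A", "ev_certificate"),
    SetMember("news.example", "Example Owner B", "notary"),
    SetMember("unverified.example", "Unknown", "self_asserted"),
]
missing = members_missing_proof(members)  # ['unverified.example']
```

Under these assumptions a set would only be treated as validated when `missing` is empty. Checking that an EV certificate or notarized document is itself genuine is the hard part discussed in the review comments and is not modelled here.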
Expeditious
Accepting these changes takes nothing away from the original First Party Sets proposal whilst expanding the participants to align with GDPR and thus addressing competition concerns associated with the original proposal.
Should the concepts be adopted then the complete retirement of unrestricted third-party cookies might be expedited.
Decentralized
These changes further avoid the need for a central authority to determine whether a Set is or is not valid, and remove ambiguity associated with reliance on brand trust or prior recognition. By identifying a common use policy for data shared within the Set, users can either be prompted to accept the common use policy once for all validated members or provide approval for each member. There is no option to reject the common use policy, in line with established practice for services that require sign in, i.e. one cannot access the service without registering and signing in.
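The two consent options described above (accept the common use policy once for all validated members, or approve each member individually) can be sketched as a small lookup structure. This is a minimal sketch under stated assumptions: the `ConsentStore` class, the `ALL_MEMBERS` marker, and the policy URL are hypothetical and are not defined by the proposal.

```python
from dataclasses import dataclass, field

# Hypothetical marker meaning "every validated member of the set".
ALL_MEMBERS = "*"


@dataclass
class ConsentStore:
    # Maps a common use policy URL to the domains approved under it.
    approvals: dict = field(default_factory=dict)

    def accept_for_all(self, policy_url: str) -> None:
        """Record one acceptance that covers all validated members."""
        self.approvals.setdefault(policy_url, set()).add(ALL_MEMBERS)

    def accept_for_member(self, policy_url: str, domain: str) -> None:
        """Record approval for a single member domain."""
        self.approvals.setdefault(policy_url, set()).add(domain)

    def may_share(self, policy_url: str, domain: str) -> bool:
        """True if the user approved this domain under this policy."""
        approved = self.approvals.get(policy_url, set())
        return ALL_MEMBERS in approved or domain in approved


# Placeholder policy URL; the real endpoint location is not specified.
POLICY = "https://policy.example/.well-known/use-policy"

per_member = ConsentStore()
per_member.accept_for_member(POLICY, "news.example")

once_for_all = ConsentStore()
once_for_all.accept_for_all(POLICY)
```

With per-member approval only `news.example` may share data under the policy, while the once-for-all store permits any validated member. Whether a domain counts as a validated member is a separate check that this sketch leaves out.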
Consent Fatigue
Users will now have the option to avoid repeatedly being asked to make choices for each website they interact with. The net result is an opportunity to improve the web experience for people without any centralized authority.
Innovation
By enabling controlled data sharing the broadest number of web participants can drive innovation and improvements to the web. Such innovation will not be restricted to web browser implementors.
Policy Wording
These changes do not provide guidance concerning the construction of common use policies. A common use policy is represented by a well-known end point. GDPR guidance is used.
Privacy Issues
The original version of the proposal contained privacy issues documented in the repository concerning the validity of a set and the potential for abuse by bad actors. These changes improve the privacy of the proposal by surfacing the common use policy associated with the Set. These changes do not attempt to fully resolve all privacy issues raised against the original proposal exclusively via engineering. The changed proposal achieves improved privacy for users by identifying the common use policies used by all parties and enabling existing law enforcement and sanctions to be applied. As such, the proposer of this change requests that reviewers consider the role of methods other than engineering alone in mitigating risk of harm on the web, and how the proposed changes align to GDPR.