Why are multi-party computation solutions the only ones that should be considered? #4
Comments
I don't think there was any indication anywhere that that is the case. We're at a stage of weighing different approaches, of which MPC is one.
Martin Thomson's presentation specifically steered the group to the use case of attribution and multi-party computation. This issue was raised to find out why.
Well, that presentation is just Martin's opinion (though I happen to agree with it), and much of the presentation was concerned with his arguments for why, so I'm not really sure what the question is. My expectation is that towards the end of tomorrow we'll discuss which use cases to start with first, so that will be the time to discuss that. I doubt we're ready to decide whether MPC is the right angle yet.
To be very clear, what I thought I said (and what I should have said if I didn't) is that systems based on multi-party computation seem the most likely to produce the privacy outcomes that I consider acceptable. They also exist (too many of them, sadly, but the problem of choice is better than the alternatives). I have not seen any alternative proposal that produces acceptable properties. That doesn't mean that this is not possible, just that systems like PCM (Apple) or event-level reporting (Google) reveal unacceptable amounts of information about the browsing activities of individuals. Specifically, they allow details of the interactions of a user with one site to be made available to a different site. This is, of course, a property that is shared by many - if not all - current attribution systems, especially those that use cookies or link decoration.
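To make the distinction concrete, here is a toy sketch of what crosses the site boundary in each model. The record shapes below are invented purely for illustration and do not match PCM, Google's event-level reports, or any real API.

```python
# Toy illustration only: invented record shapes, not any vendor's actual format.

# Event-level style: each report links one user's ad interaction on the
# publisher site to that same user's conversion on the advertiser site,
# so the recipient can join per-user activity across the two sites.
event_level_reports = [
    {"publisher": "news.example", "advertiser": "shop.example",
     "ad_id": 1234, "converted": True},   # one row per user/event
]

# Aggregate style: only a statistic summed over many users leaves the
# system, so no row can be traced back to one person's cross-site activity.
aggregate_report = {
    "publisher": "news.example", "advertiser": "shop.example",
    "ad_id": 1234, "total_conversions": 8412,
}
```

The question is then how to compute the aggregate without any single party ever holding the per-user rows, which is where MPC comes in.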
Martin, could you pull apart your thoughts about MPC, which is an internal implementation detail, from your thoughts about the information revealed, which is an essential part of the API shape? That is, could you be happy with a non-MPC-based system that only revealed aggregate outcomes?
+1 to Michael about teasing apart reasons why we like MPC. It would be a good outcome of this group to come up with a trust model classification of various architectures / implementations of an aggregate system. In particular, I would be interested in the group's views on non-malicious secure MPC systems, which we didn't really have a lot of time to discuss today. cc @marianapr
What do you mean by "non-malicious secure MPC systems"?
There's "only revealed" and "is never available to any single system". I think the latter is the requirement |
I meant something like an "honest but curious" / "semi-honest" security model, which as far as I know is the current security model of IPA.
There needs to be protection against malicious clients for sure. At least malicious behavior should be detectable.
@kimlaine I think @csharrison is commenting on the server threat model, which currently assumes HbC rather than malicious security, but which Ben said they're working to improve. (Sorry if that's pedantic!)
@ekr By "is never available to any single system" I think you mean "We want to design an aggregation system in which no single [malicious|compromised] party can get non-aggregated data". Is that right?
I suggest we move this discussion to a better-scoped issue. The headline here is plainly inaccurate. Continuing to discuss under it normalizes poor behavior.
I acknowledge that a specific technology (in this case, MPC) has been proposed to support an advertising measurement use case. I don't think it is productive to question why this solution has been proposed and try to pick apart issues with it. Rather, we should focus on the other discussion around privacy principles, ensure we've defined what problems we're trying to solve, what the threat model is, and then propose a suite of tech/solutions that could help us move forward. My understanding is that MPC would definitely be one of those.
I think @ssanjay-saran has stated the core of this issue very well: it will be more useful to document what MPC intends to address, why, and why it might be a better approach than other potential approaches. As @alextcone notes, it is not the formal position of this group (since we have authored no papers at this time) that MPC is the only solution.
FYI I changed titles to better reflect the content as it developed. Will close after meeting later unless Chairs advise otherwise.
Changing the name makes the conversation no longer make sense. I would like to raise tonight that the topic of an issue thread should not be changed by the author. If you haven't collected your thoughts enough to title an issue correctly out of the gate, that may be a sign to sit with your thoughts and reflect a bit longer.
I will add that one of the nice features of MPC is that the IETF is working on the Privacy Preserving Measurement (PPM) spec, which essentially standardizes a subset of MPC solutions (Verifiable Distributed Aggregation Functions (VDAFs)). In my opinion, leveraging other standards like these will both help this group build consensus and help build confidence in the security and privacy properties of solutions developed by this group. I agree with those above that PPM/VDAF/MPC aren't the only paths available to us, but they are useful work we can build upon.
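As a rough illustration of the core idea those protocols build on, here is a minimal additive secret-sharing sketch. It is not the PPM protocol or a VDAF (in particular, it omits all the verification machinery that protects against malicious clients); it only shows the property that no single helper ever sees an individual's raw value, yet the helpers' partial sums combine into the true aggregate.

```python
import secrets

MODULUS = 2**61 - 1  # field size chosen arbitrarily for this sketch

def share(value: int) -> tuple[int, int]:
    """Split `value` into two additive shares; each share alone looks uniformly random."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS

# Each client secret-shares its contribution between two non-colluding helpers.
client_values = [1, 0, 1, 1, 0]            # e.g. "did this impression convert?"
shares = [share(v) for v in client_values]

# Each helper sums only the shares it received; it never sees a raw value.
helper_a_total = sum(a for a, _ in shares) % MODULUS
helper_b_total = sum(b for _, b in shares) % MODULUS

# Only the combination of both partial sums reveals the aggregate.
aggregate = (helper_a_total + helper_b_total) % MODULUS
assert aggregate == sum(client_values)
```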
FWIW @alextcone I was very happy with the question as raised in the meeting but was advised to post offline. I did. I changed the title after reading your prior post to better reflect the content.
@alextcone, I just changed the name back (I agree that the discussion stopped making sense under James' new title). To @michaelkleber's question, the logic is simple:
Like @eriktaubeneck, this isn't an absolute position; it's a prediction, or even a guess, about what is most likely to work. It's not saying that alternatives don't exist, but that they seem less likely to be able to address the requirement.
For kicks, let me try the following perspective. Say I am from a smaller country somewhere that is not North America. I am told that data is being sent from my browser but not to worry -- my privacy is protected by MPC magic. From this angle, a TEE (or even plain old hardware) run by people I trust and understand beats an MPC run somewhere far away by people I have reason to be suspicious of. TL;DR: perhaps we should not get too hung up on the TEE vs MPC vs something else distinction -- the context matters.
@palenica I think privacy concerns regarding collusion, and which parties users trust, are totally relevant, and maybe that could be added to the web advertising privacy principles doc (which I believe @darobin volunteered to edit). Not sure we have a separate issue or repo for that yet, but that would be a good discussion to continue there.
I've been trying to figure out how to think about MPC and non-MPC systems on an equal footing, and it seems to me that it's not as binary as our in-person discussion depicted it. @ekr took the position that for a Trusted Execution Environment approach like Amazon Nitro, there is not robust protection against an attacker with physical or side-channel access, so "you need to trust Amazon", i.e. we need to pessimistically act as if Amazon can observe all the data the TEE processes, can steal the crypto keys the TEE uses, etc. To make a reasonable comparison, then, where do we expect the MPC helpers to be embodied in the physical world? In particular: if a system's privacy requires two non-colluding helpers, must they be running on two different cloud providers, and must those cloud providers be trusted to be non-colluding as well?
At minimum, two or more unconnected entities, in different legal jurisdictions where there are suitable and mutually recognised data protection and privacy laws in force.
@michael-oneill That is a plausible interpretation of our discussion so far, but I'm not ready to believe it is an economically and structurally reasonable expectation for the API.
Not for the API, but the browser providers implementing it could block reports unless that constraint is met.
@michaelkleber You are referring to a different level of economic expectations than the "how many zeros" point you mentioned yesterday, but I am curious how much more we want to pay for privacy (with both TEEs and MPC). It is clear that, even without any heavy cryptography or computation, we will double or triple the price simply by storing the same data across multiple MPC parties. Is it clear what the upper bound is when considering the scaling effort for MPC? I don't have experience with TEEs myself, but I am aware of their difficulties with small enclaves and scalability from various groups in industry that required significant effort working with Intel engineers, etc. These are costs to me. Do you have experiments with TEEs for any of the use cases discussed over the last two days? If so, I would be very interested to see them. I am also curious how much more we are willing to pay for TEEs.
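To put some (entirely made-up) numbers behind the storage part of that question, here is a back-of-envelope sketch of how storage scales with the number of independent parties holding copies or shares of the same reports. Every figure below is an assumed placeholder, not a measurement from any deployment, and it ignores compute, networking, and TEE-specific costs entirely.

```python
# Back-of-envelope only: every number below is an assumed placeholder.
reports_per_day = 10_000_000_000    # assumed report volume
record_bytes = 64                   # assumed size of one encrypted/shared record
retention_days = 30                 # assumed retention window
price_per_gb_month = 0.02           # assumed object-storage price, USD

def monthly_storage_cost(copies: int) -> float:
    """Storage cost if each report is held by `copies` independent parties."""
    total_gb = reports_per_day * retention_days * record_bytes * copies / 1e9
    return total_gb * price_per_gb_month

single_aggregator = monthly_storage_cost(copies=1)  # one trusted/TEE-hosted server
two_party_mpc = monthly_storage_cost(copies=2)      # each helper stores its own shares
print(f"1 copy:   ${single_aggregator:,.0f}/month")
print(f"2 copies: ${two_party_mpc:,.0f}/month ({two_party_mpc / single_aggregator:.1f}x)")
```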
There have been several bits and pieces of discussions that should get collected in the principles doc; might we impose on the chairs to set up the repo, since it's an accepted deliverable anyway? (cc @AramZS)

To add one more dimension to the MPC solution space, it might be too difficult or too costly to obtain some properties through purely technical means (collusion-resistance, adversaries tougher than HbC). Conversely, a pure governance-based model might offer too few guarantees (even if trusted, having one big DB of all browsing activity is never an acceptable level of risk). But a hybrid model could use a governance model to provide non-collusion and honesty in support of an MPC system.

@michael-oneill I wonder if a single entity in an EDPB-adequate jurisdiction could be enough to provide guarantees (assuming the entity were required to legally resist). But that's a thorny aspect that would require some pretty in-depth legal analysis.
I'd like to propose breaking this apart a bit further, and suggest there is a privacy requirement, which calls upon a data security requirement. Here is the proposed template for the privacy standard:

Data can only be processed off-device if the mechanisms doing so have (1) sufficient security guarantees to ensure that any query or access to the data can only result in outputs that are (2) sufficiently privacy-preserving.

I believe most people in this group would agree with something similar to the above for some definition of (1) and (2), and that most of the debate we have right now is about what constitutes a sufficient bar for (1) and (2). I propose that we try to get consensus on the above template before we try to proceed with definitions of (1) and (2).
Thanks, Alex & Ben.
Sorry, signed in with the wrong account earlier!
This is a really interesting discussion! I'm also wondering about how things might change over time to iteratively improve the guarantees that we can give. (For example, maybe we rely on governance for some things to start with that we can later include technical solutions for?) I suppose this brings us back to @rmirisola's point about working out what's sufficient to begin with.
Yes, we've considered governance solutions before. The proposal on the table for that is known as Garuda. It was designed for a partly different use case, so don't worry too much about the details. The important parts are that:
I don't want to oversell it and I'm pretty convinced that if we head in that direction, whatever we build will be different from that first draft. But after a fair amount of research and having presented the system to a number of people who study this type of arrangement, I think it can work. There's more precedent for this kind of commons-based infrastructure management than people realise, too, though not necessarily in a transnational setting processing the data of 4bn people for a half-trillion-dollar industry.
@darobin while that might (subject to suitable review) be sufficient for people who understand how various risks are being mitigated by the design, I think there's an aspect of @palenica's point which is more about people who won't... who will they consider trusted (whether operating a TEE or relying exclusively on protocol guarantees built into the use of MPC)? Although eventually this reaches beyond the responsibility of this CG or even W3C, our architecture might be usefully influenced if there's any research (including if we could provoke someone to do new studies) on what "trusted" could look like. We don't want to design and deploy something that in some parts of the world spawns a successful grassroots campaign for everyone to opt out -- if a different approach could achieve a more accepted outcome. (Also, the annoying pedant in me wants to point out that adequacy isn't permanent, and that it is part of an increasing number of non-European laws, which means it can't be guaranteed to be either reflexive or transient. I'll try to keep the pedant under control by pointing out that most of these countries are very likely to confer adequacy on the EU.)