Is output domain files mandatory in Aggregation Service? #606

Closed
k-o-ta opened this issue Nov 9, 2022 · 6 comments

Comments

@k-o-ta
Contributor

k-o-ta commented Nov 9, 2022

In my use case, I want the summary report to include every key (bucket) that was input to the Aggregation Service.
Is there a way to skip the output domain files, or a wildcard syntax that accepts all keys (buckets)?

By the way, what is the purpose of output domain files: reducing load on the Aggregation Service, or protecting user privacy?

@csharrison
Collaborator

Hello @k-o-ta , this is currently necessary for protecting the user's privacy. There is some discussion of an alternative privacy mechanism in #583 to support the "key discovery" use-case you describe.

Would you mind sharing the use-case where specifying the output domain is challenging to you?

@k-o-ta
Contributor Author

k-o-ta commented Nov 9, 2022

Thanks for the quick reply!
As a DSP, I am considering using summary reports as ad performance reports for all customers (advertisers). In this case, all keys (buckets) are needed.
When reports are grouped by CampaignID, I can think of two options.

  1. Prepare output domain files that include all CampaignIDs. When a new campaign is created, add its CampaignID to the output domain files.
    • This means we must maintain a list of CampaignIDs separately from the campaign database.
  2. Update the output domain file whenever an attribution trigger is received.
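For option (1), one possibility is to regenerate the output domain from the campaign database at batch time rather than maintaining a separate list. The following is only a sketch under stated assumptions: `bucket_for_campaign` and `build_output_domain` are hypothetical helpers, the key layout (campaign ID in the low 64 bits of a 128-bit bucket) is invented for illustration, and writing the result into the file format the Aggregation Service expects is left out.

```python
def bucket_for_campaign(campaign_id: int) -> int:
    """Hypothetical key layout: the campaign ID occupies the low 64 bits."""
    if not 0 <= campaign_id < 2**64:
        raise ValueError("campaign_id must fit in 64 bits")
    return campaign_id


def build_output_domain(campaign_ids):
    """Return sorted, de-duplicated bucket keys as 16-byte big-endian strings."""
    buckets = sorted({bucket_for_campaign(cid) for cid in campaign_ids})
    return [bucket.to_bytes(16, "big") for bucket in buckets]


# Campaign IDs as they might come back from a query against the campaign database
# (assumed values, for illustration only).
campaign_ids = [42, 7, 42, 1001]
domain = build_output_domain(campaign_ids)
```

Because the domain is derived from the database each time, a newly created campaign is picked up on the next batch without a separate bookkeeping step.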

If I could skip filtering summary reports by key (bucket), I could avoid the steps above and reduce the maintenance cost.

I created this issue because I didn't know that output domain files are necessary for protecting user privacy. Is there a document that explains how output domain files protect the user's privacy?

@csharrison
Collaborator

Thanks for explaining, @k-o-ta . I understand how (1) works, but for (2) how do you know the campaign at trigger time? Isn't that a function of the ad served?

With respect to the privacy, I don't know if we have a great document explaining this, but at a high level the reason we ask you to specify the output domain is to protect against an attack where the presence of a key in the output reveals something about a single user / event. For example, if you only showed a campaign to one user, receiving a key in the output (even with noise) reveals that that user later converted. By specifying this domain beforehand, we can be sure that it does not reveal anything about the user contributions.
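To make the attack concrete, here is a toy sketch comparing a release over a pre-declared domain with a naive release over whatever keys appear in the input. It is not the Aggregation Service implementation: the function names, the Laplace noise, and the scale of 1.0 are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed with this scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def summarize_with_domain(contributions, output_domain, scale, rng):
    """Release a noisy sum for every pre-declared key, and only those keys."""
    totals = {key: 0.0 for key in output_domain}
    for key, value in contributions:
        if key in totals:  # keys outside the declared domain are dropped
            totals[key] += value
    return {key: total + laplace_noise(scale, rng) for key, total in totals.items()}


def summarize_without_domain(contributions, scale, rng):
    """Naive release: noisy sums only for keys that actually appear in the input."""
    totals = {}
    for key, value in contributions:
        totals[key] = totals.get(key, 0.0) + value
    return {key: total + laplace_noise(scale, rng) for key, total in totals.items()}


rng = random.Random(0)
domain = ["campaign_a", "campaign_b"]
no_conversion = [("campaign_a", 5)]
one_conversion = [("campaign_a", 5), ("campaign_b", 1)]  # campaign_b shown to one user

# With a declared domain, the set of output keys is the same for both inputs.
with_a = summarize_with_domain(no_conversion, domain, 1.0, rng)
with_b = summarize_with_domain(one_conversion, domain, 1.0, rng)

# Without one, "campaign_b" appears only if that single user converted.
naive_a = summarize_without_domain(no_conversion, 1.0, rng)
naive_b = summarize_without_domain(one_conversion, 1.0, rng)
```

In the naive version the noise on the value does not help: the mere presence of `campaign_b` in the output already reveals the conversion.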

Typically, key discovery mechanisms which do not require specifying an output domain (like those that are being discussed in #583) involve some form of thresholding to protect against that attack. You can see this document for an example algorithm:
https://github.com/google/differential-privacy/blob/main/common_docs/Delta_For_Thresholding.pdf
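As a rough illustration of the thresholding idea (not the algorithm from the linked document, and not anything the Aggregation Service provides), a key can be released only when its noisy sum clears a threshold. The function name, noise scale, and threshold below are assumptions for illustration; a real mechanism would derive them from its privacy parameters.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed with this scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def summarize_with_threshold(contributions, scale, threshold, rng):
    """Add noise to each observed key's sum, then drop keys below the threshold."""
    totals = {}
    for key, value in contributions:
        totals[key] = totals.get(key, 0.0) + value
    noisy = {key: total + laplace_noise(scale, rng) for key, total in totals.items()}
    return {key: value for key, value in noisy.items() if value >= threshold}


rng = random.Random(0)
contributions = [("big_campaign", 1)] * 1000 + [("tiny_campaign", 1)]
released = summarize_with_threshold(contributions, scale=1.0, threshold=50.0, rng=rng)
```

Here the key backed by a single contribution is suppressed, so its presence can no longer reveal that one user converted, while the key with many contributions is still discovered without being declared in advance.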

@k-o-ta
Contributor Author

k-o-ta commented Nov 10, 2022

Thank you for explaining!

As you say, I can't know the CampaignID at trigger time (without third-party cookies)! I had misunderstood.

Let me clarify my understanding. Is specifying domains beforehand for maintaining differential privacy? (I referred to this doc.)

@csharrison
Collaborator

Let me clarify my understanding. Is specifying domains beforehand for maintaining differential privacy?

Yes, this is one purpose of specifying the domain, because differential privacy is concerned with preventing the attack I mentioned above. There are techniques to achieve DP without specifying the domain, but they typically involve some thresholding step.

@csharrison
Collaborator

csharrison commented Nov 10, 2022

Hey @k-o-ta I am going to close this issue in favor of the existing #333 which I think is touching on the exact issue you're hitting. Let's continue any other conversation there.
