Agenda Request - what is the right scope for functionality that should be supported in the first iteration of the joint measurement effort? #56

marianapr · 2022-05-20T02:13:45Z

In the sessions dedicated to the measurement use-case deep dive and requirements, Charlie presented several applications and their importance. We started a discussion about which subset of these applications should be supported in the first design version of the measurement effort. Most of this discussion focused on whether optimization should be a requirement. I also want to add the question of whether we should require support for attribution outside the user's device; this would enable cross-device attribution but comes with complexities related to the joining. Is it worth starting with a solution based on the current instantiations of the PPM standardization effort in the IETF? One advantage is that such solutions have been deployed and are still running in practice for other aggregation applications, such as the Exposure Notifications Private Aggregation (https://github.com/google/exposure-notifications-android/blob/master/doc/enexpress-analytics-faq.md).

It would be great to continue this discussion here and hear from more people about the minimum functionality that would make the outcome of the first iteration of the measurement design useful for them.

csharrison · 2022-05-31T19:43:25Z

Thanks for filing this @marianapr. I am generally OK with scoping to an as-simple-as-possible MVP, but I would prefer we pick an overall architecture which allows for the flexibility we think we need to support future use-cases (e.g. ML training, etc.).

All else equal, I do think it would be beneficial if we could take advantage of all the work on PPM / DAP / VDAFs happening in the IETF, but if it conflicts with the longer-term goals, I think it's OK to take a step back and re-evaluate the needs of the system rather than constraining ourselves to that work.

jalbertoroman · 2022-06-01T07:42:56Z

If you are interested in exploring Federated Learning (Horizontal and Vertical), multi-task learning, or split learning, we can help define the architecture for on-device training and inference.

ekr · 2022-06-02T22:38:11Z

> Thanks for filing this @marianapr. I am generally OK with scoping to an as-simple-as-possible MVP, but I would prefer we pick an overall architecture which allows for the flexibility we think we need to support future use-cases (e.g. ML training, etc.).

By "overall architecture" do you mean API? Or do you mean something like server architecture?

Regardless, it seems to me that we've heard two substantive questions:

- Should we expand the scope to include optimization?
- Should we contract the scope to not include cross-device conversions?

I try to come at both of these from the perspective of what we know how to do. I.e., we should target as MVP something we mostly know how to build. Based on the discussion last time, I think that rules out optimization, which seems like an open problem in a pretty substantial way. If there are things we know will rule it out that don't otherwise make things better, let's not do those, but otherwise I think we should defer it.

WRT cross-device conversion, my sense is that it's somewhere between a nice-to-have and very important, depending on who you talk to. This brings us to whether it significantly complicates things to implement it. To which the answer is... maybe?

csharrison · 2022-06-03T14:37:45Z

> By "overall architecture" do you mean API? Or do you mean something like server architecture?

I mean the high-level architecture of the API, e.g. whether attribution happens on device or not, or whether API choices fix a particular query pattern or allow dynamic queries.

> Based on the discussion last time, I think that rules out optimization

I think that is a little bit reductive. I think there are good results from the Criteo AdKDD challenge that show that a fairly simple aggregate system can perform logistic regression. The techniques needed to achieve that result seem generally applicable to other reporting use-cases too (e.g. multi-query scenarios). I agree general purpose ML might be a bit out of scope though.

marianapr · 2022-06-20T21:51:33Z

@AramZS I saw that you scheduled the discussion on this topic for tonight. I know that today is a holiday for many of the companies in the US. I think this discussion only makes sense if we have a quorum of representatives.

csharrison · 2022-06-20T22:26:49Z

I know it's pretty late notice, but given that we only have two technical topics on the agenda right now, it might make sense to move them both to one day to have a better chance of getting quorum.

marianapr · 2022-06-21T02:07:29Z

+1 to what Charlie said. Can we merge the two topics into one day? I am not sure the East Coast people can make two midnight meetings in the same week.

martinthomson · 2022-06-21T02:43:10Z

Others of us managed to attend when the timing was bad for us personally. We all have to contend with the occasional bad hour or public holiday conflict.

I'm personally not very happy with the late setting of the agenda for this particular meeting. Receiving confirmations the day of the meeting has meant that I'm a little behind. I might be ready in time for the second session, but it will be a push to get it done with less than 7 hours notice for the first.

I understand if @marianapr has similar challenges; we've not had a lot of notice. But if the problem is that there has been a lack of preparation, I don't want that being used to marginalize people who are disadvantaged by the timing of other meetings. If the goal is to share the burden of meeting at awkward hours, cancellation works directly against that goal.

Ideally, we would have an agenda one week before a meeting; the requests were submitted well in advance of that. Then lack of preparation time would not be an excuse to shorten those sessions that happen to be inconvenient for certain geographies.

marianapr · 2022-06-21T02:51:47Z

I agree that advance notice for preparation would be appreciated. Today is a holiday for me and I did not have much time to prepare, so I would prefer to put the two topics, which overlap quite a bit, on Wednesday.

AramZS · 2022-06-21T05:06:11Z

My apologies. I was also coming back from time off and had an unexpected lack of internet access. This falls on me and shouldn't happen again. I do think we can move this to the 2nd session, especially since there seems to be a feeling that the two strongly overlap. That said, I think we can spend some time in this first session to set up the discussion more effectively, especially since we intend to discuss scope.

benjaminsavage · 2022-06-22T01:31:04Z

Hi all,

At the end of our day-1 discussion on this topic, I promised to file two issues to help structure the conversation in day-2. I have now done so.

- Strawman privacy constraints for a private measurement API: "Strawman: Target privacy constraints" (private-measurement#17)
- Strawman target MVP functionality for a private measurement API: "Strawman: Target functionality for MVP" (private-measurement#16)

I'm looking forward to knocking down these strawmen together on day-2! I'm posting here to give everyone at least 24 hours to read and reflect on these before our next discussion.

csharrison · 2022-06-23T02:39:27Z

Here's a (somewhat ad-hoc) list of use-cases we can prioritize, and possibly reduce down if we can immediately reject some use cases as infeasible.

Basic functionality (probably table stakes?)
- Support click-through conversions
- Support view-throughs / opportunities (e.g. for lift measurement)
- Aggregate conversion counts
More complicated queries
- Aggregate value sums
- Dynamic queries (as opposed to fixed queries / buckets embedded in the browser reports like in ARA / PCM)
- Multiple breakdowns
- Flexible sensitivity management (i.e. exploit advanced DP composition)
- Adaptive queries
Attribution scope
- App <-> web
- Cross device
- Cross publisher / cross channel
- Online-to-offline conversions
Attribution function
- Last-touch attribution
- “simple” multi-touch (rules-based)
- “complex” multi-touch (DDA, Shapley, etc)
Third party reporting (probably need to break this down into constituent use-cases...)
ML training / optimization (breakdown from @jpfeiffe )
- Simple optimizations like computing averages seem to be supported in most proposals.
- SGD for small, dense networks. Privacy here is much easier as you can clip the norms and add noise in each dimension.
- SGD for large, sparse models, where the sparsity makes this challenging
- Decision trees (no proposals to date; these would likely need a TEE or something similar)
Near-real time reporting

I think many of these are likely achievable in an MVP, and hopefully more in any extensions we want to add.
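To make the "basic functionality" items in the list above concrete, here is a minimal, illustrative sketch of last-touch attribution followed by noisy aggregate conversion counts and value sums. It is not taken from any proposal discussed in this thread. The function names, the record shapes, the fixed public list of campaigns, the even split of epsilon between the two queries, and the assumption that each user contributes at most one conversion are all hypothetical choices made for the example.

```python
import math
import random


def last_touch(touches, conversions):
    """Attribute each conversion to the latest earlier touch by the same user.

    touches: list of (user, time, campaign)
    conversions: list of (user, time, value)
    Returns one (campaign, value) pair per attributed conversion.
    """
    attributed = []
    for user, conv_time, value in conversions:
        earlier = [(t, c) for u, t, c in touches if u == user and t <= conv_time]
        if earlier:
            _, campaign = max(earlier, key=lambda tc: tc[0])
            attributed.append((campaign, value))
    return attributed


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_aggregate(attributed, campaigns, max_value, epsilon, rng):
    """Return {campaign: (noisy_count, noisy_value_sum)}.

    Assumes (hypothetically) that each user contributes at most one
    conversion, so the count has sensitivity 1 and the clipped sum has
    sensitivity max_value. Epsilon is split evenly between the two queries.
    """
    counts = {c: 0.0 for c in campaigns}
    sums = {c: 0.0 for c in campaigns}
    for campaign, value in attributed:
        if campaign in counts:
            counts[campaign] += 1
            sums[campaign] += min(max(value, 0.0), max_value)  # clip the value
    count_scale = 2 * 1.0 / epsilon
    sum_scale = 2 * max_value / epsilon
    return {
        c: (counts[c] + laplace(count_scale, rng), sums[c] + laplace(sum_scale, rng))
        for c in campaigns
    }


touches = [("u1", 1, "A"), ("u1", 5, "B"), ("u2", 2, "A")]
conversions = [("u1", 6, 10.0), ("u2", 3, 500.0), ("u3", 4, 7.0)]
report = noisy_aggregate(
    last_touch(touches, conversions),
    campaigns=["A", "B", "C"],
    max_value=100.0,
    epsilon=1.0,
    rng=random.Random(0),
)
```

Outputting every campaign in a fixed public list, including those with no conversions, avoids revealing which keys were present in the data. A real design would also need per-user contribution bounding and a privacy budget across queries, which this sketch leaves out.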

jpfeiffe · 2022-06-23T02:57:21Z

> Full fledged ML training / optimization (e.g. agg service runs a DNN)

I think breaking this down a bit might be nice. E.g.,

- Simple optimizations like computing averages seem to be supported in most proposals.
- SGD for small, dense networks. Privacy here is much easier as you can clip the norms and add noise in each dimension.
- SGD for large, sparse models, where the sparsity makes this challenging
- Decision trees (no proposals to date; these would likely need a TEE or something similar)
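As a rough illustration of the "clip the norms and add noise in each dimension" step for small, dense models, the sketch below clips each per-example gradient to a maximum L2 norm, sums the clipped gradients, and adds independent Gaussian noise to every coordinate before averaging. The function names and the `noise_multiplier` parameter are assumptions for this example; it does not do any privacy accounting and is not tied to a specific proposal.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def noisy_mean_gradient(grads, max_norm, noise_multiplier, rng):
    """Average clipped per-example gradients with per-dimension Gaussian noise.

    Clipping bounds each example's influence on the sum to max_norm, so the
    noise standard deviation is set relative to that bound.
    """
    dim = len(grads[0])
    total = [0.0] * dim
    for grad in grads:
        for i, g in enumerate(clip(grad, max_norm)):
            total[i] += g
    sigma = noise_multiplier * max_norm
    return [(t + rng.gauss(0.0, sigma)) / len(grads) for t in total]


per_example_grads = [[3.0, 4.0], [0.0, 0.5]]
step = noisy_mean_gradient(
    per_example_grads, max_norm=1.0, noise_multiplier=1.1, rng=random.Random(0)
)
```

In a dense model every coordinate is updated by most examples, so adding noise to all dimensions is cheap relative to the signal. With a large sparse model the same per-dimension noise is added to many coordinates that few examples touch.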

csharrison · 2022-06-23T03:08:46Z

Let me edit my list, @jpfeiffe. I completely agree, especially given that some of the simple optimizations should be compatible with a wide range of proposals.

martinthomson · 2022-06-23T07:53:48Z

Presumably you want "offline <-> online" in both directions. And maybe add multi-valued (vector) outputs to the complicated queries.

benjaminsavage · 2022-06-23T08:07:38Z

I vote to NOT shoot for the following things in the MVP:

- online-to-offline conversions
- “complex” multi-touch (DDA, shapley, etc)
- ML training / optimization (breakdown from @jpfeiffe )

Unless it is much better specified so that we can engage with specific use-cases, I'd also suggest we jettison:

- Third party reporting (probably need to break this down into constituent use-cases...)

AramZS · 2022-07-21T14:48:01Z

Participants: did we want to expand this conversation in the upcoming meeting?

marianapr · 2022-07-21T19:54:56Z

I think it would be great to hear from more participants what a useful MVP is from their point of view. I think the applications and functionalities mentioned in the talks from the previous meeting and the discussions on the issues can be a good starting point. We can also follow what Charlie was suggesting and split the applications into categories: absolutely required for an MVP to be useful, nice to have, and advanced capabilities.

AramZS · 2022-07-25T15:36:20Z

@marianapr I think that's reasonable. I'll add this to the Agenda at the top of day 2.

AramZS · 2022-07-25T15:54:30Z

I'm interested in hearing from folks who would like to lead this discussion, and potentially from leaders for each suggested section, to help guide the discussion:

- Overview
- Absolutely Required
- Nice to have
- Advanced Capabilities

benjaminsavage · 2022-07-25T15:58:02Z

I doubt we will be able to get a representative selection of API users for this next meeting. Rather than trying to host this particular discussion live, some kind of survey would probably give us better data about which features are most critical.

AramZS · 2022-07-25T17:03:33Z

@benjaminsavage I agree, and I don't think this needs to be comprehensive, but we can continue the conversation and maybe talk through what we want in such a survey?

marianapr added the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) May 20, 2022

marianapr assigned AramZS and seanturner May 20, 2022

marianapr mentioned this issue Jun 6, 2022: Agenda Request - Cross-Device Attribution #58 (Closed)

AramZS added this to June Agenda Jun 20, 2022

AramZS moved this to Day 2 in June Agenda Jun 20, 2022

AramZS moved this from Day 2 to Day 1 in June Agenda Jun 20, 2022

npdoty mentioned this issue Jun 23, 2022: Want to add to the upcoming meeting agenda? Here's how #11 (Open)

AramZS removed the agenda+ label Jul 19, 2022

AramZS added the agenda+ label Jul 25, 2022

AramZS removed the agenda+ label Aug 29, 2022

AramZS closed this as completed Apr 11, 2023
