Skip to content

Latest commit

 

History

History
158 lines (133 loc) · 19.1 KB

2020-10-13-TPAC2020-minutes.md

File metadata and controls

158 lines (133 loc) · 19.1 KB

TPAC 2020: Conversion Measurement API

Agenda

  • Introduction, web.dev article, and Origin Trial discussion
  • Event-level data vs. aggregate data for conversions
    • Which use-cases require event-level data?
    • Possible discussion of further privacy for event-level data
  • Other use cases that require aggregate measurement besides conversions
    • E.g reach measurement or other use cases that fit in the Aggregation Service view of the world
  • Discussion of measurement + fenced frames (incrementality, etc)
  • Discussion of 3p measurers, i.e. multiple third parties that want to observe a given user journey. Issue
  • Resolving open issues on the issue tracker
    • E.g. #40 / #69 around authentication
  • Interest in recurring meetings through WICG
  • Any other business

Attendees — please sign yourself in!

  1. Charlie Harrison (Google)
  2. Kris Chapman (Salesforce)
  3. Michael Kleber (Google Chrome)
  4. Lionel Basdevant (Criteo)
  5. Arnaud Blanchard (Criteo)
  6. Joshua Koran (Zeta Global)
  7. Marçal Serrate (Hybrid Theory)
  8. Paul Bannister (CafeMedia)
  9. Shigeki Ohtsu (Yahoo JAPAN)
  10. Basile Leparmentier (Criteo)
  11. Mike Pisula (Xaxis)
  12. Przemyslaw Iwanczak (RTB House)
  13. Jon Burns (Shopify)
  14. Wendy Seltzer (W3C)
  15. Raj Belur (Amazon)
  16. Viraj Awati (Amazon)
  17. Chris Needham (BBC)
  18. Allen Fung (ShareThis Inc.)
  19. Lukasz Wlodarczyk (RTB House)
  20. Valentino Volonghi (NextRoll)
  21. Kenneth Christiansen (Intel Corporation, TAG)
  22. Jonasz Pamuła (RTB House)
  23. Andrew Pascoe (NextRoll)
  24. Daniel Smedley (Booking.com)
  25. Ryan Arnold (Procter & Gamble)
  26. Brian May (dstilery)
  27. John Delaney (Google)
  28. Paul Marcilhacy (Criteo)
  29. Jukka Ranta (Comscore)
  30. Diarmuid Gill (Criteo)
  31. Garrett Johnson ( Boston University)
  32. Pedro Alvarado (Resonate)
  33. Erik Anderson (Microsoft)
  34. Jeddy Chen (Magnite)

Notes

  • Charlie Harrison, Chrome:
  • See agenda above!
    • Introduction to Conversion Measurement API, and future directions
  • Brief background and summary:
    • Basic use case: Be able to measure conversions, without 3rd-party cookies or any other cross-site-tracking identifiers
    • Done via on-device attribution: Your ad engagements (eg clicks) are registered with the browser, your conversions (eg purchases) are also registered with browser, browser has the job of linking them up
      • See ad on shoes.example: ad click gets a click ID that the browser remembers
      • Buy shoes on shoes.exmaple: browser asked to record the purchase
      • Join internally the ad event and the conversion event
      • Now we have controls to make this more private
    • API right now: offering event-level information about the ad click (the ad click ID), joined up with coarse information about the conversion (conversion data is an enum, limited to 8 possible values — coarsely bucket type of conversion, e.g. low-value purchase, high-value purchase, email sign-up)
    • Packaged up into a report, then sent to the ad platform after a delay
    • Similar idea to Safari Private Click Measurement, with some key diffs:
      • Allow 3rd-party delegation of this job
      • Allow fine-grained information about which click it was
      • For conversion data, applying some noise to further enhance privacy — with some probability, the browser will falsify the metadata about conversion. So you won't be completely sure about any one user's actions on the advertiser's site, but you can learn things in aggregate, knowing how much noise you expect to see in your data
    • Reveals a lot less data about user's browsing history than 3p cookies, which allow joining of arbitrary information of user on news.example with user on shoes.example. With 3p cookies, can join up identifiers across the two sites; with this API you can't
  • Arnaud Blanchard: can you elaborate more on the level of noise you expect to add? WIll it be uniform across players, and affect smaller players more as a result?
    • Charlie:
      • Noise level not decided, just an initial prototype
      • If noise isn't conducive to some use cases, can reevaluate
      • In prototype in Chrome now:
        • conversion data is an enum of 8 possible values
        • With 5% chance, browser will report a random enum value, instead of the one specified by the API call
        • Intent is to allow more precision once you aggregate more reports
        • Script in the GitHub repo that shows how to remove noise, and give un-biased estimators of the true values based on the noised data
    • Arnaud: Conversions are rare events, don't see that many, so even a small amount of noise can make marketers' lives hard. Worried that it could prevent them from using the framework, find work-arounds instead. Noise level choice could be critical. Most valuable for marketers to play with the early version, see how the noised data changes their ability to understand, give feedback
    • Charlie: Yes, please give feedback. We considered prototyping with noise about whether a conversion happened or not, but this prototype doesn't include it — noise is only about which enum value the conversion goes with
  • Basile Leparnentier:
    • You say we have a lot of leeway in putting 64 bits into the click ID. How can we populate that? In TURTLEDOVE we're blind about lots of stuff. If information is only available when there is a click, how can we get information about what happens when there is no click? So much is hidden, that it's hard to know enough to really use.
    • Charlie:
      • Primarily this was developed before TURTLEDOVE, and it's primarily targeting contextual ad requests, but we can work on figuring out how it could work better for TURTLEDOVE. In Contextual targeting, we have all the information about the signals, so everything can be joined with the 64-bit click ID.
      • Better for TURTLEDOVE is probably the aggregate API instead — offers better trade-offs on what information is embedded in the click and what comes from the conversion. TURTLEDOVE is a good fit here, since a lot of the information embedded in the creative choice is aggregate also
      • We definitely want to support the right reporting for TURTLEDOVE ads, as discussed in SPARROW, and we can work on something that is a better fit.
    • Basile: Why would be move to this API, when Google Analytics is so much better? Analytics can show a lot of things, "conversions" is very wide, and this API is a regression in terms of accuracy or quality of the data.
    • Charlie: Goal is to enable this use case, or at least a part of this use case, without the need for 3rd-party cookies or other tracking identifiers. We want to make this possible in a privacy-preserving way. Goal of the Privacy Sandbox is to tamp down on all the ways to do cross-site tracking and replace them with purpose-built APIs. As ways of cross-site tracking become harder to do on Chrome and other platforms, this API will be there for you.
  • Valentino Volonghi:
    • It's a little ironic that this is meant more for contextual (which measures more multi-touch) and less for TURTLEDOVE, whose interest-based campaigns are bigger sources of conversions. This spec offers no possible path for attribution to views rather than clicks, which are a big deal for some ad formats (e.g. page take-overs).
    • Charlie:
      • We do expect ways to do interest-based ads in a contextual auction; see FLoC, where you see some broad interest-based ad targeting with full joinability and use of this API
      • View-through case that you're mentioning is definitely one that we consider important in Chrome, thanks for the feedback that it's important for you too. Would be good to have an iteration on this API to support the view use case. Primary reason they're not supported right now are tricky privacy problems that might be hard to solve with an event-level API; they can be solved with an aggregate API. Big Q: Is aggregate data for views enough, or do we need this event-level data? Could give something like this but with more privacy retstrictions, and apply it for view-through measurement, but without the clear user action (click), there are lots of privacy problems we'd need to address. (e.g. log fake views to sensitive.com, have that site always fire a conversion, now the sites you visit learn which sensitive sites you've visited) So we need to add privacy to alleviate these attacks
      • Valentino: In the attribution use case, granularity is not relevant; goal is to see the impact of advertising, want to know data about groups of people who did/didn't see ad. Of course great to get individual user journeys, but from the privacy point of view, what you really need is aggregated event streams. Questions: how that would work when evern streams are so different? Don't need identifier as long as the groupings are good and noise isn't too large.
      • For the optimization use case, Impossible to create negative label set in TURTLEDOVE use case (people that we chose not to show an ad to), so hard to show differential impact. Need some form of access to the negative set.
      • Charlie: I think you should be able to get some access to negative set — should be able to get denominator for ads that were shown, and should get that along with ads that converted, these kinds actually converted. Good enough?
      • Valentino: No — Supposed to be generating a bid for combination of contextual and IG. If I want a more elaborate approach then I need the same level of detail in reporting as in keys I generate. Need more granularity in reporting to be able to use it to bid.
      • Charlie: Answer lies in divorcing the ad-render step from the auction step. In a TURTLEDOVE world we're reporting the results of auctions, but we could do something more clever than that — could get the ad creative from the auction, and craft a report in a context where we know everything we need. This could allow very simple auctions but still allow for rich reporting. Should be robust enough to specify different information in report than what's used in the bid — as long as report goes through privacy-safe aggregators
      • Basile: I think reporting and bidding are tied together — need to be able to optimize the same way as you bid, the two are very linked. Cannot do the bidding without the reporting
      • Charlie: A report can be more rich than the function that is used to generate a bid. True in SPARROW and in TURTLEDOVE.
  • Kris Chapman:
    • I understand why you're starting with clicks, but wanted to give feedback again that clicks are just a very small portion — at Salesforce we look at Customer Journeys quite a bit, and click data can only be 2% of the story, 98% is very different, esp. cross-device or cross-browser. People that click on ads are one type of person; see lots of people who decide not to click the ad, but instead see the ad and then type the product into the browser bar, to avoid tracking. Focusing just on clicks is bad incentive — we need to support other types of insight, so that we can get better understanding of performance
    • Charlie:
      • Intention of clicks-first is that it's the easiest thing to ship in a way that is privacy-safe-enough to see how it works; definitely not a final state. Moving attribution to be done on device is a really big change, which we need to validate. We understand that it won't recover your conversion measurement, and even that it is initially biased. But it should give you an opportunity to look at your click-based conversions now with cookies and with this new API and detect where it's buggy, where it isn't good enough, while we're tuning the privacy knobs.
      • Cross-device: FB has brought this up in the past too. We don't have any proposals currently for how to implement; FB has a cross-device conversion proposal in web-adv BG a while ago, which we could look at as an enhancement to recover more of the cookie functionality

NOTE: WE ARE LAUNCHING THE API IN ORIGIN TRIAL in Chrome 86!

If you want to try getting real-worl data, it's ready for you to try — you can experiment on real users with a small percent of your traffic — sign-up link for Origin Trial

  • Joshua: my microphone says it is blocked.In the last meeting Michael said that if we cannot provide marketers a solution that is comparable to their other options then it would not be viable. Related to Valentino’s question, how do marketers learn which engagement tactics (e.g., advertising on which context/publishers or cohorts or frequency of exposure) drives more value to them (e.g., more visits or customers) to adjust their bid submissions or bid adjustments?"
    • Charlie: Goal of this API is to let you adjust bids based on data, which we know is an important use case, heard esp in response to Safari PCM — there's so much signal embedded in data from bid time, so huge feature vectors are useful and it's why we get full fidelity on model inputs. Output of an ML model can be used for training even if it's much coarser.
    • Joshua: (vox this time yay!) How can marketer understand which combination of tactics produce better values without some system aggregating across different browsers / different clients? If I spend $100 on pub1 and $100 on pub2, how can I decide which to buy on in the future? (Either contextual or TURTLEDOVE, both ways) We need to enable marketers to understand the impact of their media spend.
    • Charlie: For this event-level API, good to scope to non-TURTLEDOVE because it's simpler to discuss. In that setting, what you can get is a unique identifier for every ad that was clicked, associated with some valuable event/conversion that happened on the advertiser's page. That unique ID should allow you to join up with all the information that went into ad selection and the strategies for bidding. Along with that, we get a low-grain signal for how it did — at the lowest level, just "was there a conversion", but some little additional signal with the enum. That should let you join up with all the information you have about bidding strategies, and measure how well they did according to conversion rate or some other coarse signals.
    • Please file an issue and follow up outside the meeting
  • Arnaud Blanchard: Post-click attribution is kind of a step backwards; many advertisers focusing on more meaningful signals to understand the impact of the ad on the user. I fear that this reporting API combined with the lower reporting capabilities of TURLTEDOVE could allow malevolent marketers to game these reporting. How can we better measure the incremental effects on the user?
    • Charlie: Incrementality and lift studies are definitely something we're interested in supporting. Difficult to support with all of our privacy goals, but here's a sketch of what might work:
      • This API doesn't satisfy incrementality use cases, it's just the simplest thing to start
      • For incrementality, we need more pieces to establish causal relationships and measure conversion lift.
      • Need view-through conversions, and need counterfactual ad campaign measurement. Definitely will support it in aggregate.
      • Need user diversions into A/B experiment groups. Can do this with 1p cookie if you only care about diversions on a single site, but in Display, to consistently A/B-divert users across multiple sites, it's a privacy challenge — the diversion itself is a fingerprinting vector. (With 100 A/B ad campaigns, divert on each of them = 100-bit user identifier.)
      • CAn support these using a Fenced Frame for rendering, like TURTLEDOVE uses, plus an A/B diversion API, plus aggregated reporting.
      • Inject into the Fenced Frame two creatives. You don't know which one got actually shown, so parent page can't learn diversion. But inside the fenced frame, can do consistent diversion, do reporting using the same aggregate techniques for views/conversions like in TURTLEDOVE.
      • Rough around the edges, but they are the same primitives that we're building out. Will publish an explainer with more details, this is only a 10,000-foot view.
    • Arnauld: This could be a real change for advertising — no consistent way to do A/B diversion today, it would be great if we could put A/B incrementality on a clear and consistent footing, could be a big help
    • Charlie: George London has a PR in the web-adv BG with a lot of parallels here
    • George London: The idea is based on what CHarlie just mentioned — A/B testing API to control which peopl see real ad and which see fake/blank. For Upwave (formerly Survata), we show surveys, so that we can find out how ads they see affect survey responses later. THinking about doing that now by letting ad impression set interest group, then showing surveys using TURTLEDOVE-like bidding mechanism and using Fenced Frame for showing surveys, aggregate reporting to get survey results back, do the statistics.

CHarlie: Please use GitHub for issues, there is a lot of great interest here. Gauge interest: Do people want recurring meetings like this, under WICG incubation auspices?

+1's in chat and by voice.

Q: How will we know that you've set up something like that?

Charlie: Will make a GitHub issue to announce scheduling of a meeting, then publish notes to the GitHub repo itself. (Could come up with a schedule, or could do one-offs)

  • Jukka Ranta: I'm worried about lack of participation from browsers besides Chrome. If balance moves too much into the "functionality" direction and not privacy, if people get the feeling that there is too much tracking, why wouldn't they just choose another browser? How do we get everybody on board? Do you expect publishers to block access from browsers that don't implement APIs?
    • Charlie: We intend to come up with APIs that would be acceptable to other browser vendors, under some privacy level. Privacy CG face-to-face had some discussion of this. Other proposals, like Aggregate Measurement, could let all browsers have their privacy requirements satisfied. Browsers should also offer ways to turn things off, for browsers who are more privacy-conscious and don't want any of these, so you don't need to switch browsers. But we don't want publishers to discriminate against people based on their browsers
    • Jukka: SHould strive for Differential Privacy
    • Charlie: Event-level Conversion Measurement is not, but Aggregate Conversion Measurement is.