Integrate with DLP #182

raserva · 2022-03-02T23:33:51Z

Add integration with Cloud Data-Loss Prevention (DLP). This will allow us to redact sensitive information from requests & responses in our audit logs. POC here: #176

sethvargo · 2022-03-03T18:12:12Z

I looked through the POC. A few thoughts:

Having DLP in-the-loop for each incoming audit entry feels expensive. Are we sure this is the right architecture?
How much of this will be user-configurable?
I'm curious about the decision to scrub instead of reject. If I can convince DLP that my email address is PII (hint: it's pretty easy), I could get away with bad things now.

raserva · 2022-03-03T18:20:43Z

Are we sure this is the right architecture?

Nope. That was a POC, not intended to be a final solution. I wanted to see:
A) how easy it was to integrate
B) that it could be done in real time

Solutions we've discussed up to this point have generally been real-time. An asynchronous, non-real-time solution would be cheaper; however, we thought the additional complexity was unnecessary given our low TPS.

How much of this will be user-configurable?

Undecided. Part of this work would likely be to create a design and decide whether we want a standard configuration for redaction or to allow users to create their own redaction configs. Other solutions in this area (Dryad) use a standard configuration for all.

I'm curious about the decision to scrub instead of reject

It only scrubs the request and response values, NOT the principal or other fields in the audit log. So your email would still be associated with whatever data you access; it's just the accessed data that would be redacted. You can see this in the example provided in the POC:

{
    ...
      },
      "request": {
        "message": "[EMAIL_ADDRESS] [DATE]",
        "target": "3c04c892-c532-4f70-a27d-52d6b5cc3ec5",
      },
      "method_name": "abcxyz.test.Talker/Hello",
      "authentication_info": {
        "principal_email": "rsrv@tycho.joonix.net"
      },
    },
    ...
},

Originally:

{
    ...
      },
      "request": {
        "message": "me@example.com 3/4/2020",
        "target": "3c04c892-c532-4f70-a27d-52d6b5cc3ec5",
      },
      "method_name": "abcxyz.test.Talker/Hello",
      "authentication_info": {
        "principal_email": "rsrv@tycho.joonix.net"
      },
    },
    ...
},
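
A minimal sketch of the behavior shown in the two entries above: only the values under `request` (and `response`, if present) are scrubbed, while `authentication_info` is left untouched. The regex-based `redact_text` is a hypothetical stand-in for the DLP call, not the Cloud DLP API, and the flat entry layout is an assumption, since the POC output above is elided.

```python
import copy
import re

# Hypothetical stand-ins for DLP detectors; a real integration would call
# Cloud DLP instead of using regular expressions.
PATTERNS = [
    ("EMAIL_ADDRESS", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")),
]


def redact_text(text: str) -> str:
    """Replace each match with its detector name in brackets."""
    for info_type, pattern in PATTERNS:
        text = pattern.sub(f"[{info_type}]", text)
    return text


def redact_value(value):
    """Recursively redact strings inside dicts and lists."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {k: redact_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_value(v) for v in value]
    return value


def scrub_entry(entry: dict) -> dict:
    """Return a copy of the entry with only request/response scrubbed."""
    scrubbed = copy.deepcopy(entry)
    for field in ("request", "response"):
        if field in scrubbed:
            scrubbed[field] = redact_value(scrubbed[field])
    return scrubbed


# Assumed, simplified layout based on the fields visible in the example.
entry = {
    "request": {
        "message": "me@example.com 3/4/2020",
        "target": "3c04c892-c532-4f70-a27d-52d6b5cc3ec5",
    },
    "method_name": "abcxyz.test.Talker/Hello",
    "authentication_info": {"principal_email": "rsrv@tycho.joonix.net"},
}
result = scrub_entry(entry)
print(result["request"]["message"])
print(result["authentication_info"]["principal_email"])
```

The principal email matches the same email pattern but is never passed to `redact_text`, which is the point raised above: the caller stays attributable even when the payload is redacted.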

sethvargo · 2022-03-03T18:31:23Z

Thanks for the reply. That all makes sense. I think the "DLP in the path of the request" is still a concern we should discuss more. It impacts our maximum availability and we'd need to look into quota and QPS bits.
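
To make the availability concern concrete, here is a rough worked example. It assumes failures are independent and that every audit entry must pass through DLP before it is accepted. The numbers are purely illustrative and are not actual SLOs for lumberjack or DLP.

```python
def serial_availability(*components: float) -> float:
    """Availability of a request path that needs every component to be up."""
    total = 1.0
    for availability in components:
        total *= availability
    return total


# Illustrative values only (assumptions, not measured or published figures).
lumberjack_alone = 0.999
dlp = 0.995

combined = serial_availability(lumberjack_alone, dlp)
print(f"lumberjack alone: {lumberjack_alone:.3%}")
print(f"with DLP in path: {combined:.3%}")
```

Under these assumptions the combined figure can never exceed the weaker of the two services, which is why a hard dependency in the request path caps the maximum availability.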

yolocs · 2022-03-05T00:15:36Z

The meta point: I meant DLP integration to be a "quick" feature that adds potential value to lumberjack. There isn't any clear requirement for it. So if it's going to complicate the lumberjack architecture by a lot (which is likely the case with async DLP processing), we should table the idea.

Having DLP in-the-loop for each incoming audit entry feels expensive. Are we sure this is the right architecture?

Considering that the volume of audit logs won't be on the same scale as debug logs, I think the higher cost should be acceptable. Plus, this is meant to be an optional feature. E.g. an org could require product teams to not log req/resp (we have a knob for that) if sensitive data is expected there.

How much of this will be user-configurable?

We will minimally have:

toggle on/off DLP integration
choose a default DLP config to use
best-effort DLP (toggle), meaning if the DLP check fails, ignore the error and continue the audit logging

I'm curious about the decision to scrub instead of reject. If I can convince DLP that my email address is PII (hint: it's pretty easy), I could get away with bad things now.

scrub vs. reject - could be a global config (in addition to the ones above)
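
One way the knobs listed above could fit together, as a sketch only. All names here (`DLPConfig`, `Mode`, `process`, the `inspect` hook, the `"default"` template name) are hypothetical and not taken from lumberjack or the POC. It also assumes that "continue the audit logging" in best-effort mode means writing the entry unredacted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Mode(Enum):
    SCRUB = "scrub"    # redact findings and keep the entry
    REJECT = "reject"  # refuse entries that contain findings


class RejectedError(Exception):
    """Raised in REJECT mode when the inspector found sensitive data."""


@dataclass
class DLPConfig:
    enabled: bool = False      # toggle DLP integration on/off
    template: str = "default"  # which DLP config to use (hypothetical name)
    best_effort: bool = False  # ignore DLP failures and keep logging
    mode: Mode = Mode.SCRUB    # global scrub vs. reject


def process(message: str, cfg: DLPConfig,
            inspect: Callable[[str, str], str]) -> str:
    """Apply the knobs to one payload string.

    `inspect(message, template)` is a hypothetical hook that returns the
    redacted text and may raise if the DLP call fails.
    """
    if not cfg.enabled:
        return message
    try:
        redacted = inspect(message, cfg.template)
    except Exception:
        if cfg.best_effort:
            return message  # DLP failed; continue audit logging unredacted
        raise
    if redacted != message and cfg.mode is Mode.REJECT:
        raise RejectedError("payload contains sensitive data")
    return redacted


# Fake inspectors for demonstration; neither calls a real DLP service.
def fake_inspect(message: str, template: str) -> str:
    return message.replace("me@example.com", "[EMAIL_ADDRESS]")


def failing_inspect(message: str, template: str) -> str:
    raise TimeoutError("DLP unavailable")


print(process("hi me@example.com", DLPConfig(enabled=True), fake_inspect))
print(process("hi me@example.com",
              DLPConfig(enabled=True, best_effort=True), failing_inspect))
```

Note the trade-off in this sketch: with `best_effort` on, a DLP outage no longer blocks audit logging, but sensitive data may be written unredacted during the outage.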

raserva added the enhancement and proposal labels Mar 2, 2022

yolocs added the iceboxed label Mar 10, 2022
