Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gorouter: HTTP query parameters cannot be extracted from access logs #173

Closed
plowin opened this issue Jul 10, 2020 · 14 comments
Closed

Gorouter: HTTP query parameters cannot be extracted from access logs #173

plowin opened this issue Jul 10, 2020 · 14 comments
Labels
help wanted We think the change requested makes sense and would welcome a PR.

Comments

@plowin
Copy link
Contributor

plowin commented Jul 10, 2020

Issue

In our environment, some stakeholder cannot prevent their customers from sending "secret" data in the GET parameters of their HTTP requests. A configuration to suppress query parameters in access logs would be helpful.

Context

We are consuming cf-deployment and would be happy if such a feature could reach one of the next gorouters/routing-releases. The issue has been discussed on Slack with @ameowlia. Even though sending secrets in GET parameters is listed as a CWE, the use-case exists and a workaround was also supported in the CF UAA

Steps to Reproduce

  • Send an HTTP request including a GET query parameter to the gorouter
  • Find the parameter on the gorouter at /var/vcap/sys/log/gorouter/access.log

Expected result

  • Send an HTTP request including a GET query parameter to the gorouter
  • Do not find the parameter on the gorouter at /var/vcap/sys/log/gorouter/access.log

Possible Fix

On slack it was suggested to dig for the respective code around the function makeRecord

Additional Context

I am not very familiar with the source code of the gorouter but would also try to work on a contribution for this issue.
Any hints or links on how to get started are highly appreciated.

@mike1808
Copy link
Contributor

Do you want to suppress all query strings or just ones matching some regular expression?

@plowin
Copy link
Contributor Author

plowin commented Jul 13, 2020

Hi Mike,
this is rather for all query strings as there is and will be no mechanism for platform consumers to make configuration changes to the gorouter. And we cannot provide such a regex for all use-cases.

Is the solution in the UAA based on regex?

@domdom82
Copy link
Contributor

my 2 cents: instead of just dropping the entire query part we could hash it and write the hash in the log. this would still allow detection whether the same query was sent or a different one.

e.g.:

GET /myEndpoint?secret=shouldNotUseGetForThis

becomes

GET /myEndpoint?redacted=830e49141480973f26cc4ad9355ea21d0653701f

instead of

GET /myEndpoint

@mcwumbly
Copy link
Contributor

Is the solution in the UAA based on regex?

I'm not sure, but I appreciate that question. What other established patterns are there for doing something like this?

Here's a related issue in envoy, for example: envoyproxy/envoy#7583 From a quick search, it doesn't look like anything has been merged yet for this (but I could be wrong).

@KauzClay
Copy link
Contributor

Hey @plowin ,

It looks like the way UAA handled this was by specifically redacting the values for query parameters with names password or client_secret

Relevant UAA commits:

I am not very familiar with the source code of the gorouter but would also try to work on a contribution for this issue.
Any hints or links on how to get started are highly appreciated.

We would gladly accept a PR. However, lets agree on the design of the feature we want here first. We are happy to follow UAA's lead by redacting query parameters like password and client_secret. These could simply be hardcoded or exposed as a bosh property with those two values as defaults. So then the property may be something like router.redact_query_parameters which would take a list of strings.

If that sounds reasonable, we can help you with getting started. But if you have a different idea in mind, let's sort that out first.

@KauzClay KauzClay added the help wanted We think the change requested makes sense and would welcome a PR. label Jul 29, 2020
@domdom82
Copy link
Contributor

I think the main difference to UAA is that we are UAA, we own it. So we know there are query parms like client_secret or password to hide. With gorouter we only forward whatever parms the (unknown) user app needs. So we can't make assumptions on which params are sensitive and which aren't.

My thinking goes along these lines: Since we can't make assumptions on which params to hide, therefore we have to hide all of them. However this means two GET requests to the same path will look the same, even though they aren't. (one might take a lot longer b/c it had a page size of 500 while another one only had 10) So to retain at least the information if two requests are different, we could hash the query params and log the hash along with the request.

I could think of a three-way flag like

router.redact_query_parameters = <none|all|hash>

where none means "log all params like today"
where all means "log no params of the query part of the URL"
where hash means "log a hash of all query params instead"

@mcwumbly
Copy link
Contributor

@domdom82 that makes sense. I think your design makes sense if meets your requirements. If someone wants to add a denylist option in the future, this feature could be extended further. I

Do you anticipate that you would prefer to use the hash option yourselves?

@domdom82
Copy link
Contributor

domdom82 commented Jul 30, 2020

Do you anticipate that you would prefer to use the hash option yourselves?

Sure. We get constant pings by customers who want help with their apps. It would be useful for us to be able to at least tell two of their requests apart :-)

@mcwumbly
Copy link
Contributor

mcwumbly commented Aug 6, 2020

@domdom82 and @plowin - are you still interested in working on a PR for this feature?

If so, let us know if you need any help along the way...

@domdom82
Copy link
Contributor

domdom82 commented Aug 7, 2020

I am on parental leave till mid September, I can provide a PR by then. @plowin if you or someone from the routing team wants to jump onto this earlier, feel free :-)
The crucial part lies within access_log_record here: (call to RequestURI) https://github.com/cloudfoundry/gorouter/blob/main/accesslog/schema/access_log_record.go#L155

We will have to introduce a config flag here: https://github.com/cloudfoundry/gorouter/blob/main/config/config.go#L120
Probably similar to the ones that are already there.

The new flag needs to be exposed via BOSH spec probably somewhere around here: https://github.com/cloudfoundry/routing-release/blob/develop/jobs/gorouter/spec#L178

And don't forget the tests :-D

@domdom82
Copy link
Contributor

Hi @mcwumbly @KauzClay I have opened #183 and cloudfoundry/gorouter#274 to support the feature as discussed. Feel free to comment / complain :)

@ameowlia
Copy link
Member

Hi @domdom82

🙏 thank you so much for submitting these PRs. I will add it to our backlog to review.

@jrussett jrussett added scheduled We agree this change makes sense and plan to work on it ourselves at some point. and removed scheduled We agree this change makes sense and plan to work on it ourselves at some point. labels Oct 1, 2020
@ameowlia
Copy link
Member

This has been completed in this PR by @domdom82! Thanks so much! This will be included in the next release.

@ameowlia
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted We think the change requested makes sense and would welcome a PR.
Projects
None yet
Development

No branches or pull requests

7 participants