Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter_sensitive_data and secret in URL when URI is used for matching #221

Closed
maelle opened this issue Jan 29, 2021 · 24 comments
Closed

filter_sensitive_data and secret in URL when URI is used for matching #221

maelle opened this issue Jan 29, 2021 · 24 comments
Labels
Milestone

Comments

@maelle
Copy link
Member

maelle commented Jan 29, 2021

No description provided.

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

cc @bretsw

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

Problem: an API where the API key is passed as URL parameter.
It seems to me that vcr doesn't transform the URI before trying to match it URI in cassettes.

So

  • in the vcr cassette the API key is replaced with "my sensitive data"
  • but the URI provided in a request isn't. Consequences: no match is found in the cassette... and the URL is printed in the vcr error message, URL that contains the secret API key. Therefore on CI the secret API key ends up in the artefacts.

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

this doesn't explain why the tests pass on my machine though?!

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

Example of a CI build whose artefacts contain a key (sharing now that the key has been made invalid)
https://github.com/bretsw/tidytags/actions/runs/518185227

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

In the artefact testthat.Rout.fail

An HTTP request has been made that vcr does not know how to handle:
GET http://api.opencagedata.com/geocode/v1/json/?q=New%20York%2C%20NY&limit=1&no_annotations=1&no_dedupe=0&no_record=0&abbrv=0&add_request=1&key=%2208c2339f937d4c8e9e390853e7fb50e1%22
vcr is currently using the following cassette:

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

@sckott note that the problem does not happen when there is no environment variable set by GitHub Action https://github.com/bretsw/tidytags/actions/runs/521323642 (this is a fork)

@sckott
Copy link
Collaborator

sckott commented Jan 29, 2021

The first thing that looks odd to me is that they exposed key has quotes around it, Why would it be in quotes?

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

yeah weird

@sckott
Copy link
Collaborator

sckott commented Jan 29, 2021

@bretsw To deal with the quotes issue, did you store your opencage key with quotes around it in the Github secrets settings?

@maelle
Copy link
Member Author

maelle commented Jan 29, 2021

in the log there was no quote

image

@bretsw
Copy link

bretsw commented Feb 4, 2021

@bretsw To deal with the quotes issue, did you store your opencage key with quotes around it in the Github secrets settings?

I think this was likely the issue with the OpenCage key. I cleared out my GH secrets and the build seemed to work (at least for OpenCage). Still perhaps not great that a failed build exposed a secret, but helpful to narrow in on the problem

@sckott
Copy link
Collaborator

sckott commented Feb 5, 2021

Sorry for delay on this, have this on the to do list for tomorrow

@bretsw
Copy link

bretsw commented Feb 5, 2021

No problem! Thanks for tackling

@sckott
Copy link
Collaborator

sckott commented Feb 5, 2021

It seems to me that vcr doesn't transform the URI before trying to match it URI in cassettes

I don't think this is a bug in vcr's request matching given that this only happened in one test - i'll keep looking of course, but I don't think it's a problem in vcr's matching

@sckott
Copy link
Collaborator

sckott commented Feb 5, 2021

I've been trying to replicate this problem with this dummy R package https://github.com/sckott/fizball - but have not been able to make a key be hidden in the logs but not hidden in the artifacts when there's a test failure

any ideas for what to try?

@maelle
Copy link
Member Author

maelle commented Feb 10, 2021

@sckott the tests/.Rout.fail contains the key https://github.com/maelle/fizball/actions/runs/554705151 🎉

I saved it with quotes see below

image

@maelle
Copy link
Member Author

maelle commented Feb 10, 2021

The whole recipe for failure is

  • filter_sensitive_data with a secret that has been fed with quotes to GHA (it is an easy mistake to make)
  • secret in URL

the failures are

  • vcr can't recognize the URL for some reason
  • the secret ends up in the artefact (with quotes but could still be scraped maybe)

@sckott
Copy link
Collaborator

sckott commented Feb 10, 2021

Thanks. I guess one approach is to strip any single or double quotes around a secret before using it - BUT it's possible that a user could have quotes in a secret; seems unlikely a secret would have quotes at beginning and end of the string though.

@maelle
Copy link
Member Author

maelle commented Feb 11, 2021

one thing I don't understand is why vcr can't recognize the URL i.e. why it can't match when there are quotes? something with the string manipulation doesn't work in that case?

Your approach sounds good (removing when there are quotes at the beginning and at the end), but it'd be good to have a warning about the quotes being stripped, with some option to turn this warning off (to be used in the very unlikely case that such quotes are needed).

I was also thinking that folks might fix their secret on their own once they try doing real requests (turn vcr off) in a workflow, because their real requests will fail.

Also, I need to add screenshots to the book so oprened an issue there: ropensci-books/http-testing#82

@sckott
Copy link
Collaborator

sckott commented Feb 15, 2021

why it can't match when there are quotes?

It's identical() internally I think, so if if url doesn't match exactly, one has quotes and one does not, then there's no match.

Agree if we strip leading and trailing single/double quotes, then throw a warning about it

@maelle
Copy link
Member Author

maelle commented Feb 16, 2021

yes sounds good

@sckott sckott added this to the v0.7.0 milestone Feb 17, 2021
@sckott sckott added the filters label Mar 3, 2021
sckott added a commit that referenced this issue Mar 4, 2021
@sckott
Copy link
Collaborator

sckott commented Mar 4, 2021

okay, @bretsw @maelle just pushed a fix. Can you try again if you get a chance?

@maelle
Copy link
Member Author

maelle commented Mar 8, 2021

It seems to have worked https://github.com/maelle/fizball/actions/runs/631706778

@sckott
Copy link
Collaborator

sckott commented Mar 8, 2021

thanks for testing

@sckott sckott closed this as completed Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants