-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set up integration testing and plug in some statistics #31
Conversation
Interesting -- I wonder if this integration testing belongs in mozanalysis, tbh. |
It's to check if tables/views get created by pensieve as expected and to see if metrics and statistics are calculated. Ideally, integration testing would be done in both projects. |
3ac5d4c
to
60ee5da
Compare
command: | | ||
export GOOGLE_APPLICATION_CREDENTIALS="/tmp/gcp.json" | ||
echo "$GCLOUD_SERVICE_KEY" > "$GOOGLE_APPLICATION_CREDENTIALS" | ||
venv/bin/pytest pensieve/tests/integration/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried using a pytest marker for this, however for some reason it gets ignored. As a workaround, I moved all test that would require GCP authentication to a separate directory. This way it is easier for folks to ignore running these tests locally if they are not logged in to GCP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you mean that it gets ignored? Markers don't change what executes unless you specifically include or exclude them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it should work by adding an option -m integration
or -m "not integration"
, but even if I set these options, all test still get executed.
pensieve/analysis.py
Outdated
|
||
table_id = f"{self.project}.{self.dataset}.statistics_{result_table}" | ||
|
||
job = self.bigquery.client.load_table_from_dataframe(df_statistics_results, table_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, uploading STRUCT / RECORD fields from load_table_from_dataframe is only supported with pyarrow 0.17.0
which is not published on pip, yet (but should be soon).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should put together a proposal for what these tables should look like.
This sets up end-to-end testing for pensieve. I created a test project on GCP for this so that some data can be written to BigQuery. This is also useful for running metrics and statistics related tests with some actual (mock) data.