Deploy Datasette to fly.io instead of Cloud Run #3018
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff            @@
##             dev    #3018     +/-  ##
=======================================
  Coverage   88.7%    88.7%
=======================================
  Files         91       90       -1
  Lines      11010    10988      -22
=======================================
- Hits        9768     9752      -16
+ Misses      1242     1236       -6
```

☔ View full report in Codecov by Sentry.
Looks good! Mostly clarifying questions.
docker/gcp_pudl_etl.sh (outdated)

```diff
@@ -97,7 +97,7 @@ if [[ ${PIPESTATUS[0]} == 0 ]]; then
         # Deploy the updated data to datasette
         if [ $GITHUB_REF = "dev" ]; then
             gcloud config set run/region us-central1
-            source ~/devtools/datasette/publish.sh
+            python ~/devtools/datasette/publish.py
```
How can we be notified if this script fails so we don't run into the scenario where we're unaware of many datasette deployment failures?
Ah, since the `notify_slack` happens before we even try to publish - yeah, that's a good point, though I'm not sure if we had this functionality in the old world.

Do you think the following diff would work?
```diff
diff --git a/docker/gcp_pudl_etl.sh b/docker/gcp_pudl_etl.sh
index 5e0929b33..0fe5c26fe 100644
--- a/docker/gcp_pudl_etl.sh
+++ b/docker/gcp_pudl_etl.sh
@@ -85,10 +85,8 @@ function notify_slack() {
 # 2>&1 redirects stderr to stdout.
 run_pudl_etl 2>&1 | tee $LOGFILE

-# Notify slack if the etl succeeded.
+# If the pipeline is successful, distribute + publish datasette.
 if [[ ${PIPESTATUS[0]} == 0 ]]; then
-    notify_slack "success"
-
     # Dump outputs to s3 bucket if branch is dev or build was triggered by a tag
     if [ $GITHUB_ACTION_TRIGGER = "push" ] || [ $GITHUB_REF = "dev" ]; then
         copy_outputs_to_distribution_bucket
@@ -99,8 +97,13 @@ if [[ ${PIPESTATUS[0]} == 0 ]]; then
         gcloud config set run/region us-central1
         python ~/devtools/datasette/publish.py
     fi
-else
-    notify_slack "failure"
 fi
+
+# Notify slack about the entire pipeline's success or failure;
+# PIPESTATUS[0] refers either to the failed ETL run or to the last
+# distribution task that was run above.
+if [[ ${PIPESTATUS[0]} == 0 ]]; then notify_slack "success"; else notify_slack "failure"; fi
+
 shutdown_vm
```
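For what it's worth, since bash overwrites `PIPESTATUS` after every subsequent command, another way to structure this would be to capture the exit statuses explicitly right after each step. This is only a sketch, not the diff above; `ETL_SUCCESS` and `PUBLISH_SUCCESS` are made-up variable names:

```bash
# Sketch: save exit statuses explicitly so the final notification
# can't be confused by whatever command happened to run last.
run_pudl_etl 2>&1 | tee $LOGFILE
ETL_SUCCESS=${PIPESTATUS[0]}   # save before anything else clobbers PIPESTATUS
PUBLISH_SUCCESS=0

if [[ $ETL_SUCCESS == 0 ]]; then
    # ... distribution steps from the diff above would go here ...
    python ~/devtools/datasette/publish.py
    PUBLISH_SUCCESS=$?   # plain exit status; no pipeline involved here
fi

if [[ $ETL_SUCCESS == 0 && $PUBLISH_SUCCESS == 0 ]]; then
    notify_slack "success"
else
    notify_slack "failure"
fi
```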
Annoyingly, the most straightforward way I can imagine testing it is "include `assert False` at the end of `publish.py` and wait for the nightly build to fail there" - any ideas?
To test it out on this branch, you could run the fast ETL, skip the tests, and include `assert False` at the end of `publish.py`.

That logic looks correct! Any suggestions for how to capture the logs for the distribute + publish datasette steps? With your current proposal, if the ETL succeeds and the datasette deployment fails, we'll get a notification that something failed but won't find any failures in the logs.
I captured the logs with `tee -a $LOGFILE`, so that the publish logs just end up at the end of `pudl-etl.log`. This worked for debugging a failed deploy just now, so I think that's good enough :)
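In other words, roughly this (assuming the same `$LOGFILE` variable and script path as in the diff above):

```bash
# Append (-a) the publish step's stdout and stderr to the existing ETL log
# instead of overwriting it.
python ~/devtools/datasette/publish.py 2>&1 | tee -a $LOGFILE
```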
Love it, thank you!

Just a reminder to open a PR for the
f735653 to c49bfcb (Compare)
This got a little more complicated than I had hoped, because I ran into image size limitations with `datasette publish fly`. There are a few workarounds, documented in the `publish.py` script. However, this deploy script now works from a `pudl-etl` docker container running in GCP! So I have hope that it should work in the nightly build.

Right now, the entire "inspect databases, compress databases, build docker image, deploy" process takes about an hour.
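For context, a bare-bones deploy with the datasette-publish-fly plugin looks roughly like this. This is a sketch; the database file and app name are placeholders, not what `publish.py` actually does:

```bash
# Requires flyctl installed and a Fly API token in the environment.
pip install datasette-publish-fly
datasette publish fly pudl.sqlite --app pudl-datasette
```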
This might go faster on a bigger box: the `daz-dev` VM is an `e2-standard-4` and the nightly builds run on an `e2-highmem-8`.
Before we merge this, we need to add a `FLY_ACCESS_TOKEN` to GitHub secrets, or pull the secret from Google Secret Manager within the container (see the sketch below). Open to either approach.

After we merge this, we need to:
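For the Secret Manager option mentioned above, a minimal sketch; the secret name here is an assumption, and the VM's service account would need accessor permissions on it:

```bash
# Fetch the Fly token from Google Secret Manager at runtime instead of
# storing it as a GitHub secret.
export FLY_ACCESS_TOKEN=$(gcloud secrets versions access latest \
    --secret="fly-access-token")
```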