Disable flaky filecoin test for now #419

hannahhoward · 2024-08-30T01:07:28Z

Goals

currently, seed.run PRs fail 90%+ of the time because of the Filecoin test.

until we have a more reliable way to test the filecoin flow, the PR disables it.

the inserted comment suggests a few solutions -- either spinning up custom infra for w3filecoin pipeline on each pr to get more reliable results, or mocking out the pipeline

either way, this is the temporary fix -- I don't want to put in extra retries or anything -- we just need to find a better way to test, or investigate if there are in fact underlying issues, but in the meantime, other PRs need to get merged.

seed-deploy · 2024-08-30T01:26:41Z

View stack outputs

pr419-w3infra-BillingDbStack

Name Value

customerTableName pr419-w3infra-customer

spaceDiffTableName pr419-w3infra-space-diff

spaceSnapshotTableName pr419-w3infra-space-snapshot

usageTable pr419-w3infra-usage

pr419-w3infra-BillingStack

Name	Value
ApiEndpoint	https://a4l2dy0108.execute-api.us-east-2.amazonaws.com
billingCronHandlerURL	https://fv2uhnr3nht6vltfghout4n5xe0gdnew.lambda-url.us-east-2.on.aws/
CustomDomain	https://pr419.billing.web3.storage

pr419-w3infra-CarparkStack

Name Value

BucketName carpark-pr419-0

Region us-east-2
pr419-w3infra-RoundaboutStack

Name Value

ApiEndpoint https://a4xifqw4b7.execute-api.us-east-2.amazonaws.com

CustomDomain https://pr419.roundabout.web3.storage
pr419-w3infra-UcanInvocationStack

Name Value

invocationBucketName invocation-store-pr419-0

taskBucketName task-store-pr419-0

workflowBucketName workflow-store-pr419-0
pr419-w3infra-UploadApiStack

Name Value

ApiEndpoint https://t0j8siog0f.execute-api.us-east-2.amazonaws.com

CustomDomain https://pr419.up.web3.storage
pr419-w3infra-BusStack
pr419-w3infra-FilecoinStack
pr419-w3infra-IndexerStack
pr419-w3infra-ReplicatorStack
pr419-w3infra-UcanFirehoseStack
pr419-w3infra-UploadDbStack

currently, seed.run fails 90%+ of the time because of the Filecoin test. until we have a more reliable way to test the filecoin flow, the PR disables it. the inserted comment suggests a few solutions -- either spinning up custom infra for w3filecoin pipeline on each pr to get more reliable results, or mocking out the pipeline

alanshaw

It loathes me to do this. The staging environment is the closest thing we have to production and the last place we can test end to end interactions across the whole system.

The flakiness needs to be fixed, not ignored. Oftentimes it is the test code that is to blame but sometimes it's an actual bug.

(FWIW I did not write these tests and it frustrates me that not enough care and attention was made to ensure they are robust.)

If the staging environment itself has caused the tests to fail (changes to the system that are not under the control of this repo) then it needs to be fixed. If this is the case and we're disabling these tests because of it then we're effectively saying our staging environment is broken and we cannot use it and we are completely blind when deploying changes to know if the entire system works together.

I understand there is an inherent problem here that we cannot deploy the entire system from a single repo. Unfortunately it's an artefact of history (and also an artefact of deployment across multiple cloud providers to optimise for cost) that I'd like to change in the future.

Peeja · 2024-09-03T14:29:21Z

I'm with @alanshaw's sentiments. I think if we have a test failure, that should be a blocker: either we confirm that the test is unnecessary, or we fix it. To do otherwise degrades our confidence in our continuous delivery, and the whole thing falls apart.

That said, it doesn't need to block everyone. We can operate with slightly less confidence for a limited time.

I recommend we merge this and escalate the card that resolves, so that one person is blocked on it while everyone else can move forward with slightly more caution.

Peeja · 2024-09-03T14:43:24Z

Lint was failing due to unused things. I've removed them.
Hilariously, Seed was failing with a different error. Hopefully it passes this time.

alanshaw · 2024-09-04T09:54:38Z

That said, it doesn't need to block everyone. We can operate with slightly less confidence for a limited time.

I recommend we merge this and escalate the card that resolves, so that one person is blocked on it while everyone else can move forward with slightly more caution.

+1 I should have cross posted here but I did suggest this same resolution in Discord.

Peeja · 2024-09-06T14:44:14Z

@alanshaw Added to top of sprint as storacha/project-tracking#129. Good to merge this PR?

**Before Merge:** * [ ] Set `HOSTED_ZONES` (plural) on `prod`. * [ ] Merge #419 (which this currently includes to get tests to pass). --- With `NEXT_PUBLIC_W3UP_SERVICE_URL=https://petra.up.storacha.network` in my local `console`: ![CleanShot 2024-08-19 at 17 33 15](https://github.com/user-attachments/assets/66aa54da-dbf6-47f1-9b1d-a151e9546b9f) Then… ![CleanShot 2024-08-19 at 17 34 12](https://github.com/user-attachments/assets/03141163-ec6f-447b-bdee-48e3d0ab8abc) And success! ![CleanShot 2024-08-19 at 17 33 55](https://github.com/user-attachments/assets/fe88b106-daa3-43dd-86ec-a71516bf1c0d) Or, failure. ![CleanShot 2024-08-19 at 17 34 19](https://github.com/user-attachments/assets/5a8dd394-0f90-429d-a3de-dc6ca9ea3819) Note that that error message is a dummy message in the test harness, not something real. Also, I see to have bumped something that made the Stripe section always appear. Previously, staging/prod was set not to show it by not providing the right env vars. I'm not sure why that changed, and whether we want it to. Closes storacha/project-tracking#119 --------- Co-authored-by: hannahhoward <hannah@hannahhoward.net>

hannahhoward requested a review from alanshaw August 30, 2024 01:07

seed-deploy bot temporarily deployed to pr419 August 30, 2024 01:07 Inactive

hannahhoward requested a review from Peeja August 30, 2024 01:07

hannahhoward force-pushed the fix/disable-filecoin-test-for-now branch from d106204 to 5a0031e Compare August 30, 2024 01:14

seed-deploy bot temporarily deployed to pr419 August 30, 2024 01:26 Inactive

hannahhoward force-pushed the fix/disable-filecoin-test-for-now branch from 5a0031e to 346fd91 Compare August 30, 2024 02:08

seed-deploy bot temporarily deployed to pr419 August 30, 2024 02:08 Inactive

seed-deploy bot had a problem deploying to pr418 August 30, 2024 03:06 Failure

hannahhoward force-pushed the fix/disable-filecoin-test-for-now branch from 346fd91 to 20f60f7 Compare August 30, 2024 03:09

seed-deploy bot had a problem deploying to pr419 August 30, 2024 03:10 Failure

hannahhoward force-pushed the fix/disable-filecoin-test-for-now branch from 20f60f7 to 346fd91 Compare August 30, 2024 03:10

seed-deploy bot temporarily deployed to pr419 August 30, 2024 03:29 Inactive

hannahhoward force-pushed the fix/disable-filecoin-test-for-now branch from 346fd91 to 13462fa Compare August 31, 2024 01:25

seed-deploy bot had a problem deploying to pr419 August 31, 2024 01:25 Failure

alanshaw requested changes Aug 31, 2024

View reviewed changes

seed-deploy bot temporarily deployed to pr419 September 3, 2024 14:41 Inactive

Remove unused import & declaration

cc3e204

Peeja force-pushed the fix/disable-filecoin-test-for-now branch from dd2d5e6 to cc3e204 Compare September 3, 2024 14:43

Peeja approved these changes Sep 3, 2024

View reviewed changes

seed-deploy bot temporarily deployed to pr419 September 3, 2024 14:52 Inactive

Peeja mentioned this pull request Sep 6, 2024

Solve flaky test and re-enable them storacha/project-tracking#129

Open

alanshaw approved these changes Sep 6, 2024

View reviewed changes

hannahhoward merged commit 53cf7cf into main Sep 19, 2024
3 checks passed

hannahhoward deleted the fix/disable-filecoin-test-for-now branch September 19, 2024 19:17

Peeja mentioned this pull request Sep 6, 2024

Storacha-branded validation flow #417

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disable flaky filecoin test for now #419

Disable flaky filecoin test for now #419

hannahhoward commented Aug 30, 2024

seed-deploy bot commented Aug 30, 2024

alanshaw left a comment

Peeja commented Sep 3, 2024

Peeja commented Sep 3, 2024

alanshaw commented Sep 4, 2024

Peeja commented Sep 6, 2024

Disable flaky filecoin test for now #419

Disable flaky filecoin test for now #419

Conversation

hannahhoward commented Aug 30, 2024

Goals

seed-deploy bot commented Aug 30, 2024

alanshaw left a comment

Choose a reason for hiding this comment

Peeja commented Sep 3, 2024

Peeja commented Sep 3, 2024

alanshaw commented Sep 4, 2024

Peeja commented Sep 6, 2024