Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable flaky filecoin test for now #419

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

hannahhoward
Copy link
Member

Goals

currently, seed.run PRs fail 90%+ of the time because of the Filecoin test.

until we have a more reliable way to test the filecoin flow, the PR disables it.

the inserted comment suggests a few solutions -- either spinning up custom infra for w3filecoin pipeline on each pr to get more reliable results, or mocking out the pipeline

either way, this is the temporary fix -- I don't want to put in extra retries or anything -- we just need to find a better way to test, or investigate if there are in fact underlying issues, but in the meantime, other PRs need to get merged.

Copy link

seed-deploy bot commented Aug 30, 2024

View stack outputs

currently, seed.run fails 90%+ of the time because of the Filecoin test.

until we have a more reliable way to test the filecoin flow, the PR disables it.

the inserted comment suggests a few solutions -- either spinning up
custom infra for w3filecoin pipeline on each pr to get more reliable
results, or mocking out the pipeline
Copy link
Member

@alanshaw alanshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It loathes me to do this. The staging environment is the closest thing we have to production and the last place we can test end to end interactions across the whole system.

The flakiness needs to be fixed, not ignored. Oftentimes it is the test code that is to blame but sometimes it's an actual bug.

(FWIW I did not write these tests and it frustrates me that not enough care and attention was made to ensure they are robust.)

If the staging environment itself has caused the tests to fail (changes to the system that are not under the control of this repo) then it needs to be fixed. If this is the case and we're disabling these tests because of it then we're effectively saying our staging environment is broken and we cannot use it and we are completely blind when deploying changes to know if the entire system works together.

I understand there is an inherent problem here that we cannot deploy the entire system from a single repo. Unfortunately it's an artefact of history (and also an artefact of deployment across multiple cloud providers to optimise for cost) that I'd like to change in the future.

@Peeja
Copy link
Member

Peeja commented Sep 3, 2024

I'm with @alanshaw's sentiments. I think if we have a test failure, that should be a blocker: either we confirm that the test is unnecessary, or we fix it. To do otherwise degrades our confidence in our continuous delivery, and the whole thing falls apart.

That said, it doesn't need to block everyone. We can operate with slightly less confidence for a limited time.

I recommend we merge this and escalate the card that resolves, so that one person is blocked on it while everyone else can move forward with slightly more caution.

@Peeja Peeja force-pushed the fix/disable-filecoin-test-for-now branch from dd2d5e6 to cc3e204 Compare September 3, 2024 14:43
@Peeja
Copy link
Member

Peeja commented Sep 3, 2024

  • Lint was failing due to unused things. I've removed them.
  • Hilariously, Seed was failing with a different error. Hopefully it passes this time.

@seed-deploy seed-deploy bot temporarily deployed to pr419 September 3, 2024 14:52 Inactive
@alanshaw
Copy link
Member

alanshaw commented Sep 4, 2024

That said, it doesn't need to block everyone. We can operate with slightly less confidence for a limited time.

I recommend we merge this and escalate the card that resolves, so that one person is blocked on it while everyone else can move forward with slightly more caution.

+1 I should have cross posted here but I did suggest this same resolution in Discord.

@Peeja
Copy link
Member

Peeja commented Sep 6, 2024

@alanshaw Added to top of sprint as storacha/project-tracking#129. Good to merge this PR?

@hannahhoward hannahhoward merged commit 53cf7cf into main Sep 19, 2024
3 checks passed
@hannahhoward hannahhoward deleted the fix/disable-filecoin-test-for-now branch September 19, 2024 19:17
hannahhoward added a commit that referenced this pull request Sep 19, 2024
**Before Merge:**

* [ ] Set `HOSTED_ZONES` (plural) on `prod`.
* [ ] Merge #419 (which
this currently includes to get tests to pass).

---

With `NEXT_PUBLIC_W3UP_SERVICE_URL=https://petra.up.storacha.network` in
my local `console`:

![CleanShot 2024-08-19 at 17 33
15](https://github.com/user-attachments/assets/66aa54da-dbf6-47f1-9b1d-a151e9546b9f)

Then…

![CleanShot 2024-08-19 at 17 34
12](https://github.com/user-attachments/assets/03141163-ec6f-447b-bdee-48e3d0ab8abc)

And success!

![CleanShot 2024-08-19 at 17 33
55](https://github.com/user-attachments/assets/fe88b106-daa3-43dd-86ec-a71516bf1c0d)

Or, failure.

![CleanShot 2024-08-19 at 17 34
19](https://github.com/user-attachments/assets/5a8dd394-0f90-429d-a3de-dc6ca9ea3819)

Note that that error message is a dummy message in the test harness, not
something real. Also, I see to have bumped something that made the
Stripe section always appear. Previously, staging/prod was set not to
show it by not providing the right env vars. I'm not sure why that
changed, and whether we want it to.

Closes storacha/project-tracking#119

---------

Co-authored-by: hannahhoward <hannah@hannahhoward.net>
@Peeja Peeja mentioned this pull request Sep 6, 2024
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants