Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Nightly Test runs to pass #701

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

addyess
Copy link
Contributor

@addyess addyess commented Sep 27, 2024

Overview

Resolves a number of tests issues the nightly tests were having to run effectively on arm64 and amd64 units.

Details

  • Changes to the github workflow definition
    • Use system-wide tox installed through apt for arm64 workers to use
    • Correctly upload the right TEST_SNAP after its downloaded
    • Include inspection reports in failures
    • Add tmate session for remote debugging
  • Add support for arm64 in the etcd tests (correct release url, correct systemd file)
  • Address pytest syntax issue for use of pytest.mark.xfail
  • Adjust the testing tox.ini to be tox 3 and 4 compatible

@addyess addyess requested a review from a team as a code owner September 27, 2024 03:55
@addyess addyess force-pushed the KU-1609/repair-nightly-test branch 12 times, most recently from ad8c14b to 6172383 Compare September 27, 2024 14:22
@pytest.mark.xfail("cilium failures are blocking this from working")
@pytest.mark.xfail(reason="cilium failures are blocking this from working")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repairs a syntax error on this xfail call

Comment on lines +33 to +38
@cached_property
def arch(self) -> str:
return self.exec(
["dpkg", "--print-architecture"], text=True, capture_output=True
).stdout.strip()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay! an instance can know what architecture its running on

@@ -1,47 +1,46 @@
[tox]
no_package = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these changes are to make this more compatible with tox 3. Why tox 3? Because that's what you get with apt on focal and jammy. Why not pip install it? Arm64 runners were having severe trouble loading python libraries from a virtual env on the GH runners for reasons i never was able to understand. I think it's a combination of running tox with sg but i'm not sure. LD_PYTHON_PATH got lost though -- using the system-wide tox improves this

@addyess addyess changed the title Use the correct snap when upgrading Improve Nightly Test runs to pass Sep 27, 2024
@@ -3,47 +3,43 @@ name: Nightly Latest/Edge Tests
on:
schedule:
- cron: '0 0 * * *' # Runs every midnight
pull_request:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 Don't merge with this in there -- its just for testing this PR 🚨

Copy link
Contributor

@bschimke95 bschimke95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes. I added some comments.

Btw, just a general comment : We eventually need to refactor this nightly test in the future since e.g. we have release independent tests (upgrade tests) and release dependent tests with various flavours.

with:
name: ${{ env.artifact_name }}
path: ${{ github.workspace }}/inspection-reports.tar.gz
- name: Tmate debugging session
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO this will just extend the failed test for 10min. Nobody is looking at the nightly tests until they fail. The inspection report should be sufficient to reproduce the issue, if not, we need to isolate the failed test and add the tmate session as part of a draft/test PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, we can take it out. I had it in there b/c i was tinkering and trying to learn why things failed but weren't in the inspection reports

tests/integration/templates/etcd/etcd.service Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants