Add lxd gpu passthrough tests (New) #1535

Closed
wants to merge 39 commits into from

Conversation

pedro-avalos
Collaborator

@pedro-avalos pedro-avalos commented Oct 8, 2024

Description

  • Refactored virtualization.py to reduce code duplication
  • Added LXD container and VM tests for GPU passthrough setups.

Resolved issues

Documentation

n/a

Tests

NVIDIA tests run on Luma.
AMD tests run on my personal device (TODO: find a device to test these on)

@pedro-avalos pedro-avalos added the enhancement New feature or request label Oct 8, 2024
@pedro-avalos
Collaborator Author

My list of things I still need to do before this can be merged:

  1. Find a machine with an AMD GPU to test on that is not my personal machine
  2. Resolve any merge conflicts with Create a test for SRIOV Intel NIC's (New) #1293
  3. Create Checkbox jobs that use the new tests with the correct GPU vendor set
  4. Rebase

There may be other items to still work on, but that's what I can think of at the moment

@pedro-avalos pedro-avalos marked this pull request as ready for review October 9, 2024 21:02
@pedro-avalos
Collaborator Author

I've added the test plan and jobs, but for some reason, the template doesn't find any jobs to run. Any ideas on what is going on?

@fernando79513 fernando79513 self-assigned this Oct 10, 2024

codecov bot commented Oct 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.14%. Comparing base (37dcd06) to head (e3571a4).
Report is 16 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1535      +/-   ##
==========================================
+ Coverage   47.76%   57.14%   +9.37%     
==========================================
  Files         370        1     -369     
  Lines       39750       42   -39708     
  Branches     6720        6    -6714     
==========================================
- Hits        18987       24   -18963     
+ Misses      20048       18   -20030     
+ Partials      715        0     -715     
Flag Coverage Δ
checkbox-ng ?
checkbox-support ?
contrib-provider-ce-oem ?
provider-base ?
provider-certification-client ?
provider-certification-server ?
provider-genio ?
provider-gpgpu 57.14% <ø> (ø)
provider-iiotg ?
provider-resource ?
provider-sru ?
release-tools ?

Will add the LXD VM test too, but I may refactor the code because I see a lot of code duplication between the LXDTest and LXDTest_vm classes...
I've also updated the coverage tests to pass with the new way the classes work.
Simplifies running commands on the LXD guest.
Seems like overhead on Luma made the tests take much longer on average.
Refactored the configuration step by extracting functions. This allows for easier overrides in the VM test.
CUDA toolkit failed to install on a VM due to lack of storage.
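
The commit notes above describe extracting the configuration step into functions so the VM test can override individual pieces. A minimal sketch of that idea follows; it is not the code from virtualization.py. The class names, the guest name, the `ubuntu:24.04` image alias, and the injected `run` callable are all assumptions made for illustration, and the sketch ignores any extra device options a VM may need for GPU passthrough.

```python
from typing import Callable, List

# Hypothetical type: a callable that executes one command on the host.
Runner = Callable[[List[str]], None]


class LXDGuestTest:
    """Shared flow for a container guest; subclasses override single steps."""

    guest_name = "gpu-test-guest"  # assumed name, for illustration only

    def __init__(self, run: Runner) -> None:
        # The runner is injected so it can be replaced by a stub in unit tests.
        self.run = run

    def launch_args(self) -> List[str]:
        # Image alias is an assumption; the PR does not name one.
        return ["lxc", "launch", "ubuntu:24.04", self.guest_name]

    def add_gpu(self) -> None:
        # Attach a device of type "gpu" named "gpu" to the guest.
        self.run(
            ["lxc", "config", "device", "add", self.guest_name, "gpu", "gpu"]
        )

    def configure(self) -> None:
        # The overall order lives in one place; steps are overridable.
        self.run(self.launch_args())
        self.add_gpu()


class LXDVMGuestTest(LXDGuestTest):
    """VM variant: only the launch arguments differ in this sketch."""

    def launch_args(self) -> List[str]:
        return super().launch_args() + ["--vm"]


# Usage with a recording stub instead of a real command runner.
recorded: List[List[str]] = []
LXDVMGuestTest(recorded.append).configure()
```

With this shape, the VM class carries only its differences, which is one way to remove the duplication between the two test classes.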
@pedro-avalos
Collaborator Author

@fernando79513 Thinking this over, this PR could probably be closed and reopened. I don't see any hard reason why the GPU tests should live in the virtualization.py script. It is making the file extremely long and complicated. I think I should make a new PR where I make a separate script in the gpgpu provider.

The only thing I was using from this script was the run command wrapper, but that isn't a strong enough reason in my opinion.

Let me know what you think. I just want to make your life easier with reviewing PRs.
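
For context, a run command wrapper of the kind mentioned above is small enough to carry into a separate script. The following is a sketch under stated assumptions, not the wrapper from virtualization.py: the names `run_command` and `RunResult`, the default timeout, and the return codes used for the error cases are hypothetical.

```python
import subprocess
import sys
from typing import List, NamedTuple


class RunResult(NamedTuple):
    """Outcome of one command (hypothetical structure for this sketch)."""

    returncode: int
    stdout: str
    stderr: str


def run_command(cmd: List[str], timeout: float = 60.0) -> RunResult:
    """Run cmd without a shell and capture its output as text."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except FileNotFoundError as exc:
        # 127 mirrors the shell convention for "command not found".
        return RunResult(127, "", str(exc))
    except subprocess.TimeoutExpired as exc:
        # 124 mirrors the exit code used by the coreutils `timeout` tool.
        return RunResult(124, "", "timed out after {}s".format(exc.timeout))
    return RunResult(proc.returncode, proc.stdout, proc.stderr)


# Usage: run the current Python interpreter so the example is portable.
result = run_command([sys.executable, "-c", "print('ok')"])
```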

@pedro-avalos
Collaborator Author

Closing this PR. It is way too large, and as discussed, the GPU tests will be extracted to their own script. We will need to revisit refactoring virtualization.py at some point, though.

@pedro-avalos pedro-avalos deleted the add-lxd-vgpu-tests branch November 4, 2024 22:19