Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add importorskip #1847

Merged
merged 3 commits into from
Dec 23, 2022
Merged

add importorskip #1847

merged 3 commits into from
Dec 23, 2022

Conversation

dakinggg
Copy link
Contributor

@dakinggg dakinggg commented Dec 23, 2022

What does this PR do?

Fixes tests that started failing in nightly (and then on this PR too)

What issue(s) does this change relate to?

Before submitting

  • Have you read the contributor guidelines?
  • Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
  • Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • Did you update any related docs and document your change?
  • Did you update any related tests and add any new tests related to your change? (see testing)
  • Did you run the tests locally to make sure they pass?
  • Did you run pre-commit on your change? (see the pre-commit section of prerequisites)

@dakinggg dakinggg enabled auto-merge (squash) December 23, 2022 01:14
@dakinggg dakinggg requested review from dskhudia and bcui19 December 23, 2022 19:00
@dskhudia
Copy link
Contributor

Any idea why they started failing in nightly?

@dakinggg dakinggg merged commit 2ff5300 into mosaicml:dev Dec 23, 2022
@bcui19
Copy link
Contributor

bcui19 commented Dec 23, 2022

Is there a reason why adding the importskip would fix nightly tests?

@dakinggg
Copy link
Contributor Author

dakinggg commented Dec 23, 2022

For the transformers one: It is because Mihir hasn't moved the nightly tests to GHA yet, and the lint check in GHA is installing all not dev. So there is currently no check that runs on PRs that install dev target (but the nightly lint check still runs on dev install target).

For the precision/memory test: This test has been super flaky, sometimes failing because it does not meet the memory difference threshold, sometimes failing with UserWarning: Deallocating Tensor that still has live PyObject references. This probably happened because you took out a weak reference to Tensor and didn't call _fix_weakref() after dereferencing it. Subsequent accesses to this tensor via the PyObject will now fail. (Triggered internally at ../torch/csrc/autograd/python_variable.cpp:323.). I interpreted that error as possibly an OOM, so tried reducing the memory intensity. Not certain it will fix it permanently, and might still be flaky.

@dakinggg dakinggg deleted the fix_nightly branch December 23, 2022 21:50
bmosaicml pushed a commit to bmosaicml/composer that referenced this pull request Jan 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants