Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: review skipped tests (networking timeouts) #1027

Merged
merged 15 commits into from
Nov 16, 2023
Merged

test: review skipped tests (networking timeouts) #1027

merged 15 commits into from
Nov 16, 2023

Conversation

lobis
Copy link
Collaborator

@lobis lobis commented Nov 15, 2023

In this PR I review the skipped tests due to "flakiness". They are all xrootd tests.

After removing the skip some of them failed for various reasons, I made changes so tests pass. It's not the cleanest fix but this xrootd source is going to be replaced by fsspec (atleast for the default behaviour).

Ocasionaly some test will fail due to server issues, we should retry those. I think raising a specific error when this happens and configuring pytest to retry on this error is the way to go (this is what we done at the moment, but it doesn't work since we don't see requests errors). I updated the http source to throw a http.client.HTTPException and xrootd to throw a TimeoutError. This currently works for the ocasional xrootd and s3 fails.

Probably these errors can be improved (not sure if it's correct to throw a TimeoutError). Before this changes we were throwing an OSError which imho is too generic to do retries on.

@pytest.mark.network
@pytest.mark.xrootd
@pytest.mark.parametrize("use_threads", [True, False], indirect=True)
def test_xrootd_deadlock(use_threads):
def test_xrootd_deadlock():
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure of what I did here, don't really understand this test. Please review @nsmith- in case I broke something important.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is obsolete now that XRootD upstream has truly fixed the issues discussed in #59 and our CI is not testing older xrootd that needs the patch of

if older_xrootd("5.1.0"):
# This is registered after calling "import XRootD.client" so it is ran
# before XRootD.client.finalize.finalize()
@atexit.register
def cleanup_open_files():

In principle we could in https://github.com/scikit-hep/uproot5/blob/6b6fa9458b4ae46894e053dee25c56f678567f9d/.github/workflows/build-test.yml#L62C11-L62C11 specify older xrootd to test but it seems a bit moot.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps @chrisburr has input

@lobis lobis changed the title test: remove skip from flaky tests, update retry logic test: review skipped tests (networking timeouts) Nov 16, 2023
@lobis lobis marked this pull request as ready for review November 16, 2023 01:39
Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's awesome! I've been meaning to get back to tests that were excluded for "flakiness" (also in Awkward Array; some tests were turned off and labeled with "FIXMEs").

I've checked all of the fixes, and it all looks good to me. I'm going to leave this to @nsmith- because you have a question for him inline.

@lobis lobis mentioned this pull request Nov 16, 2023
@pytest.mark.network
@pytest.mark.xrootd
@pytest.mark.parametrize("use_threads", [True, False], indirect=True)
def test_xrootd_deadlock(use_threads):
def test_xrootd_deadlock():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is obsolete now that XRootD upstream has truly fixed the issues discussed in #59 and our CI is not testing older xrootd that needs the patch of

if older_xrootd("5.1.0"):
# This is registered after calling "import XRootD.client" so it is ran
# before XRootD.client.finalize.finalize()
@atexit.register
def cleanup_open_files():

In principle we could in https://github.com/scikit-hep/uproot5/blob/6b6fa9458b4ae46894e053dee25c56f678567f9d/.github/workflows/build-test.yml#L62C11-L62C11 specify older xrootd to test but it seems a bit moot.

@lobis lobis merged commit 96fc80d into main Nov 16, 2023
20 checks passed
@lobis lobis deleted the flaky-tests branch November 16, 2023 18:51
lobis added a commit that referenced this pull request Nov 16, 2023
* return a http.client.HTTPException instead of OSError

* rerun tests on http.client.HTTPException

* remove test skip for xrootd

* rename test

* rename test

* do not capitalize variables

* correctly load default options

* timeout error

* update test (TODO: review)

* add TimeoutError to retry exceptions

* correctly initialized num_bytes

* correctly access resource

* rerun on timeout

* update retry regex

* remove outdated test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants