CI for boto: fix errors; add coverage; add skip for uncatchable ResourceWarning #23731

Merged: 13 commits into pandas-dev:master from fix_resource_warn, Dec 15, 2018

Contributor

@h-vetinari h-vetinari commented Nov 16, 2018

closes #23680
closes #23754

  • fixture modified / tests pass
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff

EDIT2: The warning has been identified as being caused by a vendored requests from botocore<1.11, which is solved by raising the minimum version to 1.11 for the only CI job (travis-36) that is testing boto.

This would then simultaneously run into #23754 due to a moto bug (getmoto/moto#1924 / getmoto/moto#1941), but setting the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to any dummy value fixes the issue (taken from getmoto/moto#1952).

I'm also adding the boto tests to the travis-37 job, just to have some more coverage in general (and the travis-37 is by far the fastest job right now).

EDIT: The warning has been identified as being caused by a vendored requests from botocore<1.11. Unfortunately, it's not possible to (just) increase the minimum version, as botocore>=1.11 currently runs into #23754 due to a moto bug (getmoto/moto#1941), which would mean (once #24073 is merged) that these tests would just be skipped silently. Thus, I'm adding the boto tests to the travis-37 build with botocore>=1.11, which will start working once #23754 is solved, while still testing boto on the travis-36 job by forcing botocore<1.11.

@pep8speaks

Hello @h-vetinari! Thanks for submitting the PR.

@codecov

codecov bot commented Nov 16, 2018

Codecov Report

Merging #23731 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23731      +/-   ##
==========================================
+ Coverage   92.22%   92.28%   +0.05%     
==========================================
  Files         162      162              
  Lines       51824    51830       +6     
==========================================
+ Hits        47795    47830      +35     
+ Misses       4029     4000      -29

Flag         Coverage Δ
#multiple    90.68% <100%> (+0.06%) ⬆️
#single      43.01% <16.66%> (-0.01%) ⬇️

Impacted Files            Coverage Δ
pandas/util/testing.py    87.48% <100%> (+0.06%) ⬆️
pandas/io/common.py       72.86% <0%> (+0.77%) ⬆️
pandas/io/parquet.py      84.61% <0%> (+7.69%) ⬆️
pandas/io/s3.py           86.36% <0%> (+86.36%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7b0fa8e...b532696.

@@ -425,6 +426,13 @@ def __init__(self, io, **kwds):
raise ValueError('Must explicitly set engine if not passing in'
' buffer or path for io.')

if should_close:
Contributor

can you move this to a function in pandas.io.common, call it maybe_close_filepath(should_close, io), to make this code more concise
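
A minimal sketch of what such a helper might look like. The name and signature come from the suggestion above; the body is an assumption, because the block under `if should_close:` is not shown in this hunk.

```python
from io import StringIO


def maybe_close_filepath(should_close, io):
    """Close ``io`` only if the caller opened it itself.

    Hypothetical body: assumes the elided block just closes the handle.
    """
    if should_close and hasattr(io, "close"):
        io.close()


# Usage: a buffer passed in by the user stays open, one we opened is closed.
user_buffer = StringIO("a,b\n1,2\n")
maybe_close_filepath(False, user_buffer)
own_buffer = StringIO("a,b\n1,2\n")
maybe_close_filepath(True, own_buffer)
```

The call site would then shrink to a single line, which seems to be the point of the request.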

@jreback jreback added the IO (Data IO issues that don't fit into a more specific label) label Nov 16, 2018
@h-vetinari
Contributor Author

h-vetinari commented Nov 16, 2018

After an absurd amount of time trying to hunt down these warnings, I think I found the culprit/solution boto/botocore#1464.

The warning is from a vendored requests/urllib3 in botocore, which didn't close a session/socket. Unfortunately, I found no way (and I tried a lot) to catch this warning. Failed attempts include:

  • warnings.catch_warnings() with simplefilter or filterwarnings
  • capsys/capfd/capsysbinary/capfdbinary fixtures from pytest
  • tm.capture_stderr and tm.capture_stdout
  • setting os.environ["PYTHONWARNINGS"]
  • passing -W ignore::ResourceWarning to the pytest call

The only thing that had an effect (but still didn't work) was -W error::ResourceWarning:

(pandas-dev) C:\[...]\pddev>pytest pandas/tests/io/test_parquet.py -W error::ResourceWarning
============================= test session starts =============================
[...]
========== 1 failed, 39 passed, 6 skipped, 2 xpassed in 6.35 seconds ==========
Exception ignored in: <socket.socket fd=2680, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0>
ResourceWarning: unclosed <socket.socket fd=2680, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0>

i.e. even more spurious output (the failure is platform-specific and not worth mentioning here).
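
The "Exception ignored in" output can be reproduced in isolation. The sketch below is an illustration, not the botocore code path: it assumes CPython on Python 3.8+ (for `sys.unraisablehook`) and uses a deliberately unclosed UDP socket as a stand-in for the leaked one. With the filter set to "error", the ResourceWarning becomes an exception inside the socket's finalizer, where it cannot propagate to the caller and is only reported.

```python
import socket
import sys
import warnings

seen = []  # exception types the interpreter reported as "unraisable"


def record_unraisable(unraisable):
    seen.append(unraisable.exc_type)


old_hook = sys.unraisablehook  # available since Python 3.8
sys.unraisablehook = record_unraisable
propagated = False
try:
    with warnings.catch_warnings():
        warnings.simplefilter("error", ResourceWarning)
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            del sock  # CPython finalizes the unclosed socket here
        except ResourceWarning:
            propagated = True  # not reached: finalizer errors are swallowed
finally:
    sys.unraisablehook = old_hook
```

This would explain why `-W error::ResourceWarning` adds output without making the leak something a test can assert on.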

After I upgraded to the latest botocore (>=1.11 is the cutoff), things are working fine. As such, I decided to just (try to) force the travis-36 job to load that, and skip the s3-tests otherwise. I restricted the skips to the case that PANDAS_TESTING_MODE = "deprecate", because otherwise, the ResourceWarnings are filtered out anyway.
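
The skip condition described in this comment can be sketched as a small pure function. The helper names here are hypothetical, and the tuple-based version parsing is a simplification of the `LooseVersion` comparison used in the fixture; it assumes purely numeric version components.

```python
def parse_version(text):
    """Turn '1.10.84' into (1, 10, 84); assumes numeric components only."""
    return tuple(int(part) for part in text.split("."))


def should_skip_s3_tests(botocore_version, testing_mode):
    """Skip only when the leaky botocore would trip the warning checks."""
    leaks_sockets = parse_version(botocore_version) < (1, 11, 0)
    # Outside "deprecate" mode the ResourceWarning is filtered out anyway.
    return leaks_sockets and testing_mode == "deprecate"
```

Keeping the decision in one function makes it easy to see that the older botocore is still exercised whenever the warnings are not being checked.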

@h-vetinari
Contributor Author

h-vetinari commented Nov 17, 2018

So, the last run had no warnings anymore (failure was again only #23726), but I realized that the reason for this is that the s3_resource fixture is now just skipping, due to some more restrictive authentication checks in the newer boto (#23754).

@h-vetinari
Contributor Author

I'm trying to cover more boto by adding it to the travis-37 job (which is also around 10min faster than the others). Moto supports Python 3.7 starting with 1.3.7 (the latest version), but that isn't reflected in the requirements yet: getmoto/moto#1886. Therefore, I'm installing it through pip (after failing through conda).

@h-vetinari
Contributor Author

OK, good news is that the ResourceWarnings are gone from the travis-36 job (https://travis-ci.org/pandas-dev/pandas/jobs/456410643), which, I guess, is a nice, self-contained scope for this PR.

The errors now in the travis-37 job due to #23754 will still need to be solved, but I guess that's something for a follow-up.

@@ -37,6 +41,14 @@ def s3_resource(tips_file, jsonl_file):
"""
pytest.importorskip('s3fs')
boto3 = pytest.importorskip('boto3')
botocore = pytest.importorskip('botocore')
if (LooseVersion(botocore.__version__) < LooseVersion("1.11.0")
Contributor

just make the minimum in the tests 1.11, then you don't need this at all

Contributor Author

You mean add it to the travis-36 dependencies? That would currently fail due to #23754

Contributor

yes

Contributor Author

Adding botocore>=1.11 to the dependencies will mean either failures due to #23754 (which is most likely an upstream moto-bug), or that none of the boto tests are actually run (because they'd be skipped). The travis-36 build is the only build testing boto.

With this construct (and I admit it's not pretty), we could have one build doing botocore<1.11 (actually testing the code), and one with botocore>=1.11, which would be silently skipping them now but will start working again as soon as the moto bug is fixed and a new version is available.

Contributor

I haven't been following the boto / moto issues closely, but it seems like this is the best option for now if we want to have any of the boto stuff actually tested.

Contributor

botocore is only a test dep, so i don't mind switching it to a higher version. Then simply add this to other builds until we are actually testing this.

Contributor

From what I understand there's a conflict between the latest boto/botocore and moto
#23754

We could also try to fix this at the PANDAS_TESTING_MODE level. That adds a warnings.simplefilter at the start of the test. Perhaps the fixture could add an ignore to our filters? Though maybe @h-vetinari already tried that.


with tm.ensure_clean() as path:
Contributor

this was mocked before, why are you now non-mocking it? this causes permission issues.

Contributor Author

No, it's still mocked, because the function now uses the s3_resource fixture, which does the mocking.

So now, all s3 tests use the fixture instead of doing their own mocking.

url_table = read_excel(url)
local_table = self.get_exceldf('test1', ext)
tm.assert_frame_equal(url_table, local_table)
def test_read_from_s3_url(self, ext, s3_resource):
Contributor

same why are you not mocking this?

Contributor Author

@jreback, same as above, the mocking is done in the s3_resource fixture that I added to the sig

@h-vetinari h-vetinari changed the title from "TST/CLN: fix sys1:ResourceWarning due to open sockets (WIP)" to "TST/CLN: fix sys1:ResourceWarning due to open sockets" Nov 18, 2018
@h-vetinari
Contributor Author

@jreback
I responded to your feedback a couple of days ago, PTAL

Contributor Author

@h-vetinari h-vetinari left a comment

@jreback @TomAugspurger
Currently, boto is only tested in one build (and even those would be skipped if I incorporated your feedback). Please see my alternate suggestion in the comment.

@h-vetinari
Contributor Author

@TomAugspurger Would you mind opining on #23731 (comment)? :)

@h-vetinari
Contributor Author

@TomAugspurger Thanks!

@jreback Should you agree as well, don't merge quite yet - I still need to set up boto to be tested in another CI job (as explained in the comment above). Will wait for your input here

@TomAugspurger
Contributor

What do you mean by "another" CI job? Can we take an existing one and pin moto, boto, and botocore to known versions?

@h-vetinari
Contributor Author

h-vetinari commented Nov 29, 2018 via email

@h-vetinari
Contributor Author

h-vetinari commented Nov 29, 2018

@TomAugspurger

From what I understand there's a conflict between the latest boto/botocore and moto
#23754

Yes, there seems to be an error with the newer moto that will hopefully be fixed soon.

We could also try to fix this at the PANDAS_TESTING_MODE level. That adds a warnings.simplefilter at the start of the test. Perhaps the fixture could add an ignore to our filters? Though maybe @h-vetinari already tried that.

The warning unfortunately cannot be caught by warnings.simplefilter (or anything else I could try, see #23731 (comment)), because AFAICT, it is emitted from a finally/teardown state, where the usual mechanics don't apply anymore.
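
One plausible reading of that explanation is a timing problem: a ResourceWarning is emitted when the leaked object is finalized, which can happen after the block that tried to catch it has already exited. The sketch below illustrates this with an intentionally unclosed socket as a stand-in for the one leaked inside botocore; it assumes CPython's reference counting and is not the actual pandas test setup.

```python
import gc
import socket
import warnings


def resource_warnings(records):
    """Keep only the recorded ResourceWarning entries."""
    return [r for r in records if issubclass(r.category, ResourceWarning)]


# The "test body": the leak happens here, but nothing is emitted yet.
with warnings.catch_warnings(record=True) as during:
    warnings.simplefilter("always")
    leaked = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # never closed

# The "teardown": the object is finalized here, outside the first block.
with warnings.catch_warnings(record=True) as afterwards:
    warnings.simplefilter("always")
    del leaked
    gc.collect()

caught_during = resource_warnings(during)
caught_afterwards = resource_warnings(afterwards)
```

In this toy case the second block does see the warning, because it wraps the point of finalization. In the PR that point lies somewhere in fixture or interpreter teardown, which would be consistent with none of the attempted filters having a chance to apply.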

@jreback

botocore is only a test dep, so i don't mind switching it to a higher version. Then simply add this to other builds until we are actually testing this.

It's actually an indirect (optional) dependency through boto3, which directly depends on it (in version lockstep: boto3 1.x.yy <-> botocore 1.[x+3].yy).
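
As a worked example of that lockstep rule, the mapping can be written as a one-line calculation. The function name is hypothetical, and the offset of 3 is taken from the comment above; it is only assumed to hold for the releases current at the time of this PR.

```python
def expected_botocore(boto3_version):
    """Map boto3 '1.x.yy' to botocore '1.[x+3].yy' per the rule above."""
    major, minor, patch = boto3_version.split(".")
    return "{}.{}.{}".format(major, int(minor) + 3, patch)


# The botocore 1.11 cutoff therefore corresponds to boto3 1.8.
cutoff = expected_botocore("1.8.0")
```

This is why pinning botocore in a CI job effectively pins boto3 as well.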

I added the newer moto to the travis-37 job, where those boto tests should now produce errors (after re-verifying that, I'll then add a specific except to the fixture-teardown). I'll also add PANDAS_TESTING_MODE="deprecate" to make sure these boto-tests are tested for warnings (once moto is fixed upstream).

On the other hand, I'm forcing botocore<1.11 on the travis-36 job (and removing PANDAS_TESTING_MODE="deprecate"), to make sure boto is tested until #23754 is solved.

EDIT: clarification about shifting PANDAS_TESTING_MODE="deprecate"

@h-vetinari h-vetinari changed the title from "TST/CLN: fix sys1:ResourceWarning due to open sockets" to "TST/CLN/CI/DEP: use boto-fixture consistently, enable boto tests on travis-37; fix uncatchable ResourceWarning" Nov 30, 2018
@jreback
Contributor

jreback commented Dec 2, 2018

can you rebase

@h-vetinari
Contributor Author

Failure in azure is unrelated.

@h-vetinari h-vetinari closed this Dec 3, 2018
@h-vetinari h-vetinari reopened this Dec 3, 2018
@h-vetinari
Contributor Author

h-vetinari commented Dec 3, 2018

I had closed this to keep it from being merged, as I saw that some things still needed ironing out (but I had no time to comment while at work).
I'll split off the part that deals with the s3_resource fixture in a separate PR and leave the CI stuff here.

@h-vetinari
Contributor Author

@TomAugspurger
5979389 seems to have fixed the moto import issue.

Contributor Author

@h-vetinari h-vetinari left a comment

@jreback @TomAugspurger
Right now, this fixes two moto issues (the import error that was hacked around by @TomAugspurger in #24092, and #23754), but the ResourceWarnings are back for some reason (despite the newest boto/moto):
travis-37: https://travis-ci.org/pandas-dev/pandas/jobs/465618259
travis-36: https://travis-ci.org/pandas-dev/pandas/jobs/465618262

I can split off another PR or rename this one, but at the moment, boto tests are skipped everywhere due to #24092, so I think this should be merged soon.

  - hypothesis>=3.58.0
  - pip:
    - brotlipy
    - coverage
    - moto
Contributor Author

conda pulls in moto 1.1.1, which is way too old.

if LooseVersion(botocore.__version__) < LooseVersion("1.11.0"):
    # botocore leaks an uncatchable ResourceWarning before 1.11.0;
    # see GH 23731 and https://github.com/boto/botocore/issues/1464
    pytest.skip("botocore is leaking resources before 1.11.0")
Contributor Author

actually this skip is needed because travis-27 runs an older boto (I just didn't see it in the .yml because it's a transitive dependency of s3fs).

@@ -37,6 +40,12 @@ def s3_resource(tips_file, jsonl_file):
"""
pytest.importorskip('s3fs')
boto3 = pytest.importorskip('boto3')

# temporary workaround as moto fails for botocore >= 1.11 otherwise
Contributor Author

done

finally:
s3.stop()
os.environ.setdefault("AWS_ACCESS_KEY_ID", None)
Contributor

this is not correct, you need to reset it to what it was before. maybe just use an environment context manager here
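
A minimal sketch of such an environment context manager, using only the standard library. The name `temporary_env` is hypothetical and this is not necessarily the helper the PR ended up with. For context, `os.environ.setdefault(key, None)` leaves an existing value untouched and raises `TypeError` when the key is absent (environment values must be strings), so it cannot restore the previous state.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(**overrides):
    """Set environment variables for the block, then restore the old state."""
    saved = {name: os.environ.get(name) for name in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for name, old_value in saved.items():
            if old_value is None:
                # The variable did not exist before: remove it again.
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value
```

In the fixture, the mocked S3 setup would then run inside `with temporary_env(AWS_ACCESS_KEY_ID="dummy", AWS_SECRET_ACCESS_KEY="dummy"):`, so the dummy credentials never leak into other tests.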

@h-vetinari
Contributor Author

@jreback
Now using a contextmanager like you asked. It's green too.

@h-vetinari
Contributor Author

@jreback
This is green, and ready for review.

@jreback jreback added the CI (Continuous Integration) label Dec 13, 2018
Contributor

@jreback jreback left a comment

looks good. 1 small addition, ping on green.

ci/deps/travis-36.yaml (resolved)
@jreback jreback added this to the 0.24.0 milestone Dec 13, 2018
@h-vetinari h-vetinari changed the title from "CI/DEP: increase boto coverage; add skip for uncatchable ResourceWarning" to "CI for boto: fix errors; add coverage; add skip for uncatchable ResourceWarning" Dec 14, 2018
@jreback
Contributor

jreback commented Dec 14, 2018

lgtm. ping on green.

@h-vetinari
Contributor Author

@jreback Green

@jreback jreback merged commit c128f7f into pandas-dev:master Dec 15, 2018
@jreback
Contributor

jreback commented Dec 15, 2018

thanks @h-vetinari

@h-vetinari h-vetinari deleted the fix_resource_warn branch December 15, 2018 19:04
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Labels: CI (Continuous Integration), IO (Data IO issues that don't fit into a more specific label)

Successfully merging this pull request may close these issues:

TST/DEPS: new boto breaks tests
BLD/CI: ResourceWarnings are back (sometimes)!

4 participants