Test Decorators and Better Pytest Integration in 'test_excel' #19829
Conversation
pandas/tests/io/test_excel.py
```diff
@@ -1907,10 +1780,10 @@ def test_invalid_columns(self):
         with pytest.raises(KeyError):
             write_frame.to_excel(path, 'test1', columns=['C', 'D'])

     def test_comment_arg(self):
```
There were a few one-off functions like this that didn't previously have the `skip_if_no_xlrd` call in them. However, when trying to run them without `xlrd` but with, say, `xlwt` installed they would throw `ModuleNotFoundError: No module named 'xlrd'` from the `ExcelFile` constructor. So I'm assuming it was an oversight not to have these skipped (they will be going forward with the class structure I put in place).
yes that’s good thanks
```python
    pytest.param(_XlwtWriter, '.xls', marks=pytest.mark.skipif(
        not td.safe_import('xlwt'), reason='No xlwt'))
])
def test_ExcelWriter_dispatch(self, klass, ext):
```
This test was pretty wonky to begin with. With the parametrization I've split out the tests and given preference to `xlsxwriter` if both it and `openpyxl` are installed for `.xlsx` files, which is what the original function was doing in a roundabout way.
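The pattern being described — parametrizing the dispatch test with per-engine `skipif` marks — can be sketched roughly like this. The `safe_import` helper here is a simplified stand-in for pandas' `td.safe_import`, and the engine ordering mirrors the stated preference for `xlsxwriter` over `openpyxl`:

```python
import pytest

def safe_import(name):
    # Simplified stand-in for td.safe_import: the module if importable,
    # otherwise False (so it can feed a skipif condition directly).
    try:
        return __import__(name)
    except ImportError:
        return False

# An unavailable engine skips only its own parameter, not the whole test.
@pytest.mark.parametrize('engine, ext', [
    pytest.param('xlsxwriter', '.xlsx', marks=pytest.mark.skipif(
        not safe_import('xlsxwriter'), reason='No xlsxwriter')),
    pytest.param('openpyxl', '.xlsx', marks=pytest.mark.skipif(
        not safe_import('openpyxl'), reason='No openpyxl')),
    pytest.param('xlwt', '.xls', marks=pytest.mark.skipif(
        not safe_import('xlwt'), reason='No xlwt')),
])
def test_excelwriter_dispatch(engine, ext):
    assert ext.startswith('.')
```

Each unavailable engine shows up as a skipped test id in the report instead of silently narrowing coverage.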
```python
    pytest.param('xlsxwriter', '.xlsx', marks=pytest.mark.skipif(
        not td.safe_import('xlsxwriter'), reason='No xlsxwriter'))
])
class TestExcelWriter(_WriterBase):
```
Rather than creating subclasses and adding arbitrary class metadata, it seemed easier and clearer to parametrize the writing tests over each engine and the extension it should be serving.
Another thing that came to mind: want me to add parametrization for `.xlsm` with `openpyxl` and `xlsxwriter`? Neither was being tested before, but I think it makes sense to add those extensions for more coverage. Happy to add it here, or in a separate change if we don't want to expand the scope further.
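If we went that route, the extra `.xlsm` coverage could be appended to the class-level parametrization along these lines (a sketch; per-engine skip marks omitted for brevity, and the class name is illustrative):

```python
import pytest

@pytest.mark.parametrize('engine, ext', [
    ('openpyxl', '.xlsx'),
    ('openpyxl', '.xlsm'),     # proposed: not previously tested
    ('xlwt', '.xls'),
    ('xlsxwriter', '.xlsx'),
    ('xlsxwriter', '.xlsm'),   # proposed: not previously tested
])
class TestExcelWriterSketch:
    # Class-level parametrize: every test method in the class runs once per
    # (engine, ext) pair, so adding an extension needs no per-test changes.
    def test_extension_is_known(self, engine, ext):
        assert ext in ('.xls', '.xlsx', '.xlsm')
```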
```python
ext = '.xlsx'
engine_name = 'xlrd'
check_skip = staticmethod(_skip_if_no_xlrd)

class _WriterBase(SharedItems):
```
The original `ExcelWriterBase` class had explicit setup / teardown methods to set / reset options before each test case. Because I parametrized the tests, the setup / teardown paradigm did not have visibility into the actual fixtures being passed, so setting up an autouse fixture here to wrap each test case was the best option for inspecting the parameters and setting / resetting the appropriate Excel options before and after test execution.
I made this its own base class because the three classes that test specific features of `openpyxl`, `xlsxwriter` and `xlwt` still benefit from having this fixture, although it does require them to provide some extra metadata (namely the `merge_cells` fixture) that they don't use.
If this is too wonky, I think an alternate approach would be to re-wire the tests to explicitly pass the engine fixture as a keyword argument to their read functions (I didn't explore that option in detail, so I could be wrong).
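A rough sketch of the autouse-fixture approach described above. The `OPTIONS` dict stands in for pandas' `get_option`/`set_option`, and the names are illustrative rather than the actual PR code:

```python
import pytest

OPTIONS = {}  # stand-in for pandas' option registry

def engine_option(engine, ext):
    # Set-up / tear-down around a single test: point the writer option for
    # this extension at the parametrized engine, yield to the test, then
    # restore whatever was there before.
    option_name = 'io.excel.{ext}.writer'.format(ext=ext.strip('.'))
    prev = OPTIONS.get(option_name)
    OPTIONS[option_name] = engine
    yield
    OPTIONS[option_name] = prev

class _WriterBase:
    @pytest.fixture(autouse=True)
    def set_engine(self, engine, ext):
        # autouse=True: wraps every test in every subclass without being
        # requested explicitly, and, unlike setup_method, it can see the
        # parametrized `engine` / `ext` fixtures.
        yield from engine_option(engine, ext)
```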
```diff
@@ -103,7 +67,7 @@ def get_csv_refdf(self, basename):
         dfref = read_csv(pref, index_col=0, parse_dates=True, engine='python')
         return dfref

-    def get_excelfile(self, basename):
+    def get_excelfile(self, basename, ext):
```
It's more verbose doing this, but with the parametrization of the test classes performed further down in the module, I think it's better to explicitly pass `ext` as an argument rather than creating subclasses that bind their own `ext` variable.
The Travis network failures seem specific to Py27 (ran fine locally on Py3 but I could reproduce with Py27). I assume something is getting crossed up between the fixtures and the decorator for that test.
So this is extremely nuanced, but it looks like the combination of Py27, using

Before making a change on that front I figure it's worth stopping here and getting feedback on the PR as is, but that again is one possible solution to the failures.

Here's the issue I mentioned:

And the "offender" in the pandas source:

Line 2168 in 3b135c3
looks pretty reasonable
Codecov Report
```
@@            Coverage Diff             @@
##           master   #19829      +/-   ##
==========================================
- Coverage   91.66%   91.65%   -0.02%
==========================================
  Files         150      150
  Lines       48938    48949     +11
==========================================
+ Hits        44860    44865      +5
- Misses       4078     4084      +6
```
Continue to review full report at Codecov.
```python
import xlrd

# Test reading times with and without milliseconds. GH5945.
if LooseVersion(xlrd.__VERSION__) >= LooseVersion("0.9.3"):
```
could put this in the decorator?
This conditional doesn't skip anything; it just sets a different expectation on the precision of the result. I suppose we could split this into two separate tests and skip one or the other depending on what's installed, but I think that is more trouble than it's worth, especially considering the current `skip_if_no` decorator only has a `min_version` but not a `max_version`.
ok, no problem.
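For reference, the decorator in question behaves roughly like this — a simplified sketch of `td.skip_if_no`, not the actual implementation; a `max_version` bound would slot in as a second comparison next to `min_version`:

```python
import pytest

def _version_tuple(v):
    # Crude version parse, good enough for comparisons like '0.9.3'.
    return tuple(int(p) for p in str(v).split('.') if p.isdigit())

def skip_if_no(package, min_version=None):
    # Returns a skipif mark: the test skips when the package is missing or
    # too old. There is no max_version counterpart, which is why the
    # precision check above lives inside the test body instead.
    try:
        mod = __import__(package)
        usable = min_version is None or (
            _version_tuple(getattr(mod, '__version__', '0'))
            >= _version_tuple(min_version))
    except ImportError:
        usable = False
    return pytest.mark.skipif(not usable,
                              reason='requires {}'.format(package))
```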
I think my comment got garbled. ok to fix the `wraps` things by using the six version. just include it in
Can you check to make sure this doesn't hit the issue described in pytest-dev/pytest#568? I don't recall the details, but it was something to do with applying skip marks to child classes causing parents and/or siblings to skip, which sounds similar to what's changing here.
@TomAugspurger thanks for sharing. Fortunately that issue should not affect this design here, as the only subclasses with class-level `skip_if` decorators inherit `_WriterBase`, which doesn't have any tests it needs to execute on its own.

FWIW I compared the before / after results with various combinations of these packages installed in a virtual environment and the numbers matched in all cases, save the fact that there are 3 new tests as a result of parametrizing `test_ExcelWriter_dispatch` and creating `test_ExcelWriter_dispatch_raises`.
Great, thanks for checking.
Force-pushed fb918f1 to f5bc0ba
lgtm. just small request on docs. ping on green.
pandas/compat/__init__.py
```diff
@@ -365,6 +365,18 @@ def callable(obj):
         return any("__call__" in klass.__dict__ for klass in type(obj).__mro__)

+
+if sys.version_info[0:2] < (3, 4):
```
prob ok to just do this for PY2. can you add a comment and a ref to the CPython bug report explaining what this is doing?
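The six-style backport being referenced looks essentially like this (a sketch; on modern Pythons it degenerates to plain `functools.wraps`):

```python
import sys
import functools

if sys.version_info[:2] < (3, 4):
    # PY2 backport in the spirit of six.wraps: older functools.wraps did
    # not set __wrapped__ on the wrapper (see the CPython tracker issue
    # referenced in this review), so set it explicitly after delegating.
    def wraps(wrapped, assigned=functools.WRAPPER_ASSIGNMENTS,
              updated=functools.WRAPPER_UPDATES):
        def wrapper(f):
            f = functools.wraps(wrapped, assigned, updated)(f)
            f.__wrapped__ = wrapped
            return f
        return wrapper
else:
    wraps = functools.wraps
```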
```diff
-        version = getattr(sys.modules[mod_name], '__version__')
+        try:
+            version = getattr(sys.modules[mod_name], '__version__')
+        except AttributeError:
```
ugh, really?
I suppose another way to go about this would be to do a `str.lower` on the attribute name, but I did it this way because it's clearer why there's even a need for it; whenever xlrd becomes obsolete it's an easy marker that we can get rid of. If you prefer the former then no problem, I'll do that in the next commit.
FYI I put in a PR to xlrd to add this python-excel/xlrd#275 for future versions
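An alternative shape for the fallback — probing the common attribute spellings instead of catching `AttributeError` (an illustrative helper, not the pandas code):

```python
import types

def module_version(module):
    # xlrd historically exposed __VERSION__ (upper case) rather than the
    # conventional __version__, so probe the known spellings in order.
    for attr in ('__version__', '__VERSION__'):
        version = getattr(module, attr, None)
        if version is not None:
            return str(version)
    raise AttributeError('no version attribute found on {!r}'
                         .format(module.__name__))
```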
Force-pushed f5bc0ba to 6342ede
I removed xlwt (only) and got this:
Force-pushed 6342ede to 3f1152c
```python
        recons = read_excel(reader, 'test1')
        tm.assert_frame_equal(expected, recons)

    def test_to_excel_timedelta(self, merge_cells, engine, ext):
```
This failed without `xlwt` because the extension was hard-coded to `.xls` all the time, rather than using the `ext` that the fixture provides for all other test cases. I updated it to use the fixture, but in the process discovered that this doesn't work correctly with `openpyxl`, so I explicitly added a `pytest.xfail` and plan to open an issue to address it separately.
```python
        self.frame.to_excel(self.path, 'test1', header=False)
        self.frame.to_excel(self.path, 'test1', index=False)

    @pytest.mark.parametrize("np_type", [
```
Figured I might as well parametrize this and the following two functions as part of this change.
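The shape of that parametrization — one function replacing several near-identical copies, with each value showing up as its own test id (the dtype names here are illustrative):

```python
import pytest

@pytest.mark.parametrize('np_type, nbits', [
    ('int8', 8), ('int16', 16), ('int32', 32), ('int64', 64),
])
def test_int_widths(np_type, nbits):
    # One body serves every dtype; a failure reports the offending parameter.
    assert int(np_type[len('int'):]) == nbits
```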
pandas/tests/io/test_excel.py
```python
option_name = 'io.excel.{ext}.writer'.format(ext=ext.strip('.'))
prev_engine = get_option(option_name)
set_option(option_name, engine)
with ensure_clean(ext) as path:
```
Because every write test uses `ensure_clean` as a context manager, I figured it made more sense to add it to the fixture rather than duplicating it for each test.
totally fine. can you add some comments in the set_options, e.g. also a line explaining that this is automatically used.
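With the requested comments added, the fixture body might read something like this. `ensure_clean` here is a simplified stand-in for pandas' `tm.ensure_clean`, and the fixture usage is shown in comments since the real version also juggles the engine option:

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def ensure_clean(ext):
    # Simplified stand-in for tm.ensure_clean: hand out a temporary file
    # path with the requested extension and delete it afterwards.
    fd, path = tempfile.mkstemp(suffix=ext)
    os.close(fd)
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)

# Inside the autouse fixture (which runs for every write test
# automatically), opening ensure_clean once removes the duplicated
# `with` block that previously started every test body:
#
#     with ensure_clean(ext) as path:
#         self.path = path
#         yield
```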
```diff
     # avoid mixed inferred_type
     df = DataFrame([[u'\u0192', u'\u0193', u'\u0194'],
                     [u'\u0195', u'\u0196', u'\u0197']],
                    index=[u'A\u0192', u'B'],
                    columns=[u'X\u0193', u'Y', u'Z'])

-    with ensure_clean('__tmp_to_excel_float_format__.' + self.ext)\
-            as filename:
+    with ensure_clean('__tmp_to_excel_float_format__.' + ext) as filename:
```
This and the next test modify the name of the file generated by `ensure_clean`. In these tests, the context manager within the setup fixture is therefore a no-op. I could move these into a separate class or do something tricky to prevent that, but I figured it was easiest to keep them as is.
rebase as well
Force-pushed 53e5a6b to 849933d
Just pushed updates. FWIW there was an error in the last Circle / Travis builds for

I can't reproduce the issue locally on OSX with Py27 so perhaps it's a Linux-only issue? Waiting to see what happens this time around, but assuming it pops up again I would plan on imperatively skipping for the combination of Py27 and Linux for now and opening a separate issue to address that.

INSTALLED VERSIONS
commit: 849933d
pandas: 0.23.0.dev0+400.g849933d2c
happy to have it in this one
Tried locally and everything passed with '.xlsm', so I'd prefer to include it in this. Will re-push after seeing the output of the latest commit in Travis/CircleCI.
```python
        if engine == 'openpyxl':
            pytest.xfail('Timedelta roundtrip broken with openpyxl')
        if engine == 'xlsxwriter' and (sys.version_info[0] == 2 and
```
This keeps failing on Travis with Py2, but AppVeyor was fine and I can't reproduce locally (macOS), so I assume it to be Linux-only (?). Since this is a "new" test I figured to just skip it for now, and I'll open an issue to see if someone on a Linux platform can help troubleshoot. If you have a better idea on how to manage it, let me know.
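The imperative skip for that combination might look like this (a sketch; the exact reason string and the follow-up issue reference would come later):

```python
import platform
import sys

import pytest

def skip_if_py2_linux(engine):
    # Narrow, imperative skip: only the failing Travis combination
    # (Python 2.7 + Linux + xlsxwriter) is skipped, so coverage on
    # AppVeyor, macOS and Python 3 is unaffected.
    if (engine == 'xlsxwriter' and sys.version_info[0] == 2
            and platform.system() == 'Linux'):
        pytest.skip('known failure on Python 2.7 + Linux')
```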
thanks @WillAyd keep em coming! and ideally if you can submit PRs for the followups you noted (I think you created issues for all)
```shell
git diff upstream/master -u -- "*.py" | flake8 --diff
```
Once I started looking at it, it made sense to make the module more "pytest compatible". There was a lot of skipping being done imperatively that seemed better suited to using class-level decorators, so as I moved some of that around I also replaced the setup/teardown mechanism of the "Excel Writing" tests with a pytest fixture
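The before/after of that change in miniature — imperative skips in every test body versus a single class-level decorator (names illustrative, helper simplified):

```python
import pytest

def have_module(name):
    # Minimal stand-in for the skip helpers used in this module.
    try:
        __import__(name)
        return True
    except ImportError:
        return False

# Before: each test opens with an imperative skip call.
def test_read_old_style():
    if not have_module('xlrd'):
        pytest.skip('No xlrd')
    assert True

# After: one declarative decorator covers every test in the class.
@pytest.mark.skipif(not have_module('xlrd'), reason='No xlrd')
class TestReaders:
    def test_read(self):
        assert True
```

The declarative form also composes with parametrization, which is what makes the fixture-based setup/teardown in this PR possible.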