Test Decorators and Better Pytest Integration in 'test_excel' #19829
Conversation
pandas/tests/io/test_excel.py
```diff
@@ -1907,10 +1780,10 @@ def test_invalid_columns(self):
         with pytest.raises(KeyError):
             write_frame.to_excel(path, 'test1', columns=['C', 'D'])

     def test_comment_arg(self):
```
There were a few one-off functions like this that didn't previously have the `skip_if_no_xlrd` call in them. However, when trying to run them without `xlrd` but with, say, `xlwt` installed they would throw `ModuleNotFoundError: No module named 'xlrd'` from the `ExcelFile` constructor. So I'm assuming it was an oversight not to have these skipped (they will be going forward with the class structure I put in place).
yes that’s good thanks
```python
    pytest.param(_XlwtWriter, '.xls', marks=pytest.mark.skipif(
        not td.safe_import('xlwt'), reason='No xlwt'))
])
def test_ExcelWriter_dispatch(self, klass, ext):
```
This test was pretty wonky to begin with. With the parametrization I've split out the tests and given preference to `xlsxwriter` if both it and `openpyxl` are installed for `.xlsx` files, which is what the original function was doing in a roundabout way.
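The pattern being described — parametrizing the dispatch test with per-engine `skipif` marks — can be sketched roughly like this. The `safe_import` helper here is a simplified stand-in for pandas' `td.safe_import`, and the engine ordering mirrors the stated preference for `xlsxwriter` over `openpyxl`:

```python
import pytest

def safe_import(name):
    # Simplified stand-in for td.safe_import: the module if importable,
    # otherwise False (so it can feed a skipif condition directly).
    try:
        return __import__(name)
    except ImportError:
        return False

# An unavailable engine skips only its own parameter, not the whole test.
@pytest.mark.parametrize('engine, ext', [
    pytest.param('xlsxwriter', '.xlsx', marks=pytest.mark.skipif(
        not safe_import('xlsxwriter'), reason='No xlsxwriter')),
    pytest.param('openpyxl', '.xlsx', marks=pytest.mark.skipif(
        not safe_import('openpyxl'), reason='No openpyxl')),
    pytest.param('xlwt', '.xls', marks=pytest.mark.skipif(
        not safe_import('xlwt'), reason='No xlwt')),
])
def test_excelwriter_dispatch(engine, ext):
    assert ext.startswith('.')
```

Each unavailable engine shows up as a skipped test id in the report instead of silently narrowing coverage.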
```python
    pytest.param('xlsxwriter', '.xlsx', marks=pytest.mark.skipif(
        not td.safe_import('xlsxwriter'), reason='No xlsxwriter'))
])
class TestExcelWriter(_WriterBase):
```
Rather than creating subclasses and adding arbitrary class metadata, it seemed easier and clearer to parametrize the writing tests over each engine and the extension it should be serving.
Another thing that came to mind: want me to add parametrization for `.xlsm` with `openpyxl` and `xlsxwriter`? Neither was being tested before, but I think it makes sense to add those extensions for more coverage. Happy to add it here, or in a separate change if we don't want to expand the scope further.
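If we went that route, the extra `.xlsm` coverage could be appended to the class-level parametrization along these lines (a sketch; per-engine skip marks omitted for brevity, and the class name is illustrative):

```python
import pytest

@pytest.mark.parametrize('engine, ext', [
    ('openpyxl', '.xlsx'),
    ('openpyxl', '.xlsm'),     # proposed: not previously tested
    ('xlwt', '.xls'),
    ('xlsxwriter', '.xlsx'),
    ('xlsxwriter', '.xlsm'),   # proposed: not previously tested
])
class TestExcelWriterSketch:
    # Class-level parametrize: every test method in the class runs once per
    # (engine, ext) pair, so adding an extension needs no per-test changes.
    def test_extension_is_known(self, engine, ext):
        assert ext in ('.xls', '.xlsx', '.xlsm')
```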
```python
ext = '.xlsx'
engine_name = 'xlrd'
check_skip = staticmethod(_skip_if_no_xlrd)

class _WriterBase(SharedItems):
```
The original `ExcelWriterBase` class had explicit setup / teardown methods to set / reset options before each test case. Because I parametrized the tests, the setup / teardown paradigm did not have visibility into the actual fixtures being passed, so setting up an autouse fixture here to wrap each test case was the best option for inspecting the parameters and setting / resetting the appropriate Excel options before and after test execution.
I made this its own base class because the three classes that test specific features of `openpyxl`, `xlsxwriter` and `xlwt` still benefit from having this fixture, although it does require them to provide some extra metadata (namely the `merge_cells` fixture) that they don't use.
If this is too wonky, I think an alternate approach would be to re-wire the tests to explicitly pass the engine fixture as a keyword argument to their read functions (I didn't explore that option in detail, so I could be wrong).
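A rough sketch of the autouse-fixture approach described above. The `OPTIONS` dict stands in for pandas' `get_option`/`set_option`, and the names are illustrative rather than the actual PR code:

```python
import pytest

OPTIONS = {}  # stand-in for pandas' option registry

def engine_option(engine, ext):
    # Set-up / tear-down around a single test: point the writer option for
    # this extension at the parametrized engine, yield to the test, then
    # restore whatever was there before.
    option_name = 'io.excel.{ext}.writer'.format(ext=ext.strip('.'))
    prev = OPTIONS.get(option_name)
    OPTIONS[option_name] = engine
    yield
    OPTIONS[option_name] = prev

class _WriterBase:
    @pytest.fixture(autouse=True)
    def set_engine(self, engine, ext):
        # autouse=True: wraps every test in every subclass without being
        # requested explicitly, and, unlike setup_method, it can see the
        # parametrized `engine` / `ext` fixtures.
        yield from engine_option(engine, ext)
```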
```diff
@@ -103,7 +67,7 @@ def get_csv_refdf(self, basename):
         dfref = read_csv(pref, index_col=0, parse_dates=True, engine='python')
         return dfref

-    def get_excelfile(self, basename):
+    def get_excelfile(self, basename, ext):
```
It's more verbose doing this, but with the parametrization of the test classes performed further down in the module, I think it's better to explicitly pass `ext` as an argument rather than creating subclasses that bind their own `ext` variable.
The Travis network failures seem specific to Py27 (ran fine locally on Py3 but I could reproduce with Py27). I assume something is getting crossed up between the fixtures and the decorator for that test.
So this is extremely nuanced, but it looks like the combination of Py27, using

Before making a change on that front I figure it's worth stopping here and getting feedback on the PR as is, but that again is one possible solution to the failures.

Here's the issue I mentioned:

And the "offender" in the pandas source:

Line 2168 in 3b135c3
looks pretty reasonable
Codecov Report
```
@@            Coverage Diff             @@
##           master   #19829      +/-   ##
==========================================
- Coverage   91.66%   91.65%   -0.02%
==========================================
  Files         150      150
  Lines       48938    48949     +11
==========================================
+ Hits        44860    44865      +5
- Misses       4078     4084      +6
```
Continue to review full report at Codecov.
```python
import xlrd

# Test reading times with and without milliseconds. GH5945.
if LooseVersion(xlrd.__VERSION__) >= LooseVersion("0.9.3"):
```
could put this in the decorator?
This conditional doesn't skip anything; it just sets a different expectation on the precision of the result. I suppose we could split this into two separate tests and skip one or the other depending on what's installed, but I think that is more trouble than it's worth, especially considering the current `skip_if_no` decorator only has a `min_version` but not a `max_version`.
ok, no problem.
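For reference, the decorator in question behaves roughly like this — a simplified sketch of `td.skip_if_no`, not the actual implementation; a `max_version` bound would slot in as a second comparison next to `min_version`:

```python
import pytest

def _version_tuple(v):
    # Crude version parse, good enough for comparisons like '0.9.3'.
    return tuple(int(p) for p in str(v).split('.') if p.isdigit())

def skip_if_no(package, min_version=None):
    # Returns a skipif mark: the test skips when the package is missing or
    # too old. There is no max_version counterpart, which is why the
    # precision check above lives inside the test body instead.
    try:
        mod = __import__(package)
        usable = min_version is None or (
            _version_tuple(getattr(mod, '__version__', '0'))
            >= _version_tuple(min_version))
    except ImportError:
        usable = False
    return pytest.mark.skipif(not usable,
                              reason='requires {}'.format(package))
```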
I think my comment got garbled. ok to fix the `wraps` things by using the six version. just include it in
Can you check to make sure this doesn't hit the issue described in pytest-dev/pytest#568? I don't recall the details, but it was something to do with applying skip marks to child classes causing parents and/or siblings to skip, which sounds similar to what's changing here.
@TomAugspurger thanks for sharing. Fortunately that issue should not affect this design here, as the only subclasses with class-level `skip_if` decorators inherit `_WriterBase`, which doesn't have any tests it needs to execute on its own.

FWIW I compared the before / after results with various combinations of these packages installed in a virtual environment and the numbers matched in all cases, save the fact that there are 3 new tests as a result of parametrizing `test_ExcelWriter_dispatch` and creating `test_ExcelWriter_dispatch_raises`.
Great, thanks for checking.
Force-pushed fb918f1 to f5bc0ba
lgtm. just small request on docs. ping on green.
pandas/compat/__init__.py
```diff
@@ -365,6 +365,18 @@ def callable(obj):
         return any("__call__" in klass.__dict__ for klass in type(obj).__mro__)

+
+if sys.version_info[0:2] < (3, 4):
```
prob ok to just do this for PY2. can you add a comment and a ref to the CPython bug report explaining what this is doing?
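The six-style backport being referenced looks essentially like this (a sketch; on modern Pythons it degenerates to plain `functools.wraps`):

```python
import sys
import functools

if sys.version_info[:2] < (3, 4):
    # PY2 backport in the spirit of six.wraps: older functools.wraps did
    # not set __wrapped__ on the wrapper (see the CPython tracker issue
    # referenced in this review), so set it explicitly after delegating.
    def wraps(wrapped, assigned=functools.WRAPPER_ASSIGNMENTS,
              updated=functools.WRAPPER_UPDATES):
        def wrapper(f):
            f = functools.wraps(wrapped, assigned, updated)(f)
            f.__wrapped__ = wrapped
            return f
        return wrapper
else:
    wraps = functools.wraps
```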
```diff
-        version = getattr(sys.modules[mod_name], '__version__')
+        try:
+            version = getattr(sys.modules[mod_name], '__version__')
+        except AttributeError:
```
ugh, really?
I suppose another way to go about this would be to do a `str.lower` on the attribute name, but I did it this way because it's clearer why there's even a need for it; whenever xlrd becomes obsolete it's an easy marker that we can get rid of. If you prefer the former then no problem, I'll do that in the next commit.
FYI I put in a PR to xlrd to add this python-excel/xlrd#275 for future versions
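An alternative shape for the fallback — probing the common attribute spellings instead of catching `AttributeError` (an illustrative helper, not the pandas code):

```python
import types

def module_version(module):
    # xlrd historically exposed __VERSION__ (upper case) rather than the
    # conventional __version__, so probe the known spellings in order.
    for attr in ('__version__', '__VERSION__'):
        version = getattr(module, attr, None)
        if version is not None:
            return str(version)
    raise AttributeError('no version attribute found on {!r}'
                         .format(module.__name__))
```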
Force-pushed f5bc0ba to 6342ede
I removed xlwt (only) and got this:
Force-pushed 6342ede to 3f1152c
```python
        recons = read_excel(reader, 'test1')
        tm.assert_frame_equal(expected, recons)

    def test_to_excel_timedelta(self, merge_cells, engine, ext):
```
This failed without `xlwt` because the extension was hard-coded to `.xls` all the time, rather than using the `ext` that the fixture provides for all other test cases. I updated it to use the fixture, but in the process discovered that this doesn't work correctly with `openpyxl`, so I explicitly added a `pytest.xfail` and plan to open an issue to address it separately.
```python
        self.frame.to_excel(self.path, 'test1', header=False)
        self.frame.to_excel(self.path, 'test1', index=False)

    @pytest.mark.parametrize("np_type", [
```
Figured I might as well parametrize this and the following two functions as part of this change.
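The shape of that parametrization — one function replacing several near-identical copies, with each value showing up as its own test id (the dtype names here are illustrative):

```python
import pytest

@pytest.mark.parametrize('np_type, nbits', [
    ('int8', 8), ('int16', 16), ('int32', 32), ('int64', 64),
])
def test_int_widths(np_type, nbits):
    # One body serves every dtype; a failure reports the offending parameter.
    assert int(np_type[len('int'):]) == nbits
```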
pandas/tests/io/test_excel.py
```python
option_name = 'io.excel.{ext}.writer'.format(ext=ext.strip('.'))
prev_engine = get_option(option_name)
set_option(option_name, engine)
with ensure_clean(ext) as path:
```
Because every write test uses `ensure_clean` as a context manager, I figured it made more sense to add it to the fixture rather than duplicating it for each test.
totally fine. can you add some comments in the set_options, e.g. also a line explaining that this is automatically used.
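With the requested comments added, the fixture body might read something like this. `ensure_clean` here is a simplified stand-in for pandas' `tm.ensure_clean`, and the fixture usage is shown in comments since the real version also juggles the engine option:

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def ensure_clean(ext):
    # Simplified stand-in for tm.ensure_clean: hand out a temporary file
    # path with the requested extension and delete it afterwards.
    fd, path = tempfile.mkstemp(suffix=ext)
    os.close(fd)
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)

# Inside the autouse fixture (which runs for every write test
# automatically), opening ensure_clean once removes the duplicated
# `with` block that previously started every test body:
#
#     with ensure_clean(ext) as path:
#         self.path = path
#         yield
```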
```diff
     # avoid mixed inferred_type
     df = DataFrame([[u'\u0192', u'\u0193', u'\u0194'],
                     [u'\u0195', u'\u0196', u'\u0197']],
                    index=[u'A\u0192', u'B'],
                    columns=[u'X\u0193', u'Y', u'Z'])

-    with ensure_clean('__tmp_to_excel_float_format__.' + self.ext)\
-            as filename:
+    with ensure_clean('__tmp_to_excel_float_format__.' + ext) as filename:
```
This and the next test modify the name of the file generated by `ensure_clean`. In these tests, the context manager within the setup fixture is therefore a no-op. I could move these into a separate class or do something tricky to prevent that, but I figured it was easiest to keep them as is.
rebase as well
Force-pushed 53e5a6b to 849933d
Just pushed updates. FWIW there was an error in the last Circle / Travis builds for

I can't reproduce the issue locally on OSX with Py27 so perhaps it's a Linux-only issue? Waiting to see what happens this time around, but assuming it pops up again I would plan on imperatively skipping for the combination of Py27 and Linux for now and opening a separate issue to address that.

INSTALLED VERSIONS
commit: 849933d
pandas: 0.23.0.dev0+400.g849933d2c
happy to have it in this one
Tried locally and everything passed with '.xlsm', so I'd prefer to include it in this. Will re-push after seeing the output of the latest commit in Travis/CircleCI.
```python
        if engine == 'openpyxl':
            pytest.xfail('Timedelta roundtrip broken with openpyxl')
        if engine == 'xlsxwriter' and (sys.version_info[0] == 2 and
```
This keeps failing on Travis with Py2, but AppVeyor was fine and I can't reproduce locally (macOS), so I assume it to be Linux-only (?). Since this is a "new" test I figured to just skip it for now, and I'll open an issue to see if someone on a Linux platform can help troubleshoot. If you have a better idea on how to manage it, let me know.
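The imperative skip for that combination might look like this (a sketch; the exact reason string and the follow-up issue reference would come later):

```python
import platform
import sys

import pytest

def skip_if_py2_linux(engine):
    # Narrow, imperative skip: only the failing Travis combination
    # (Python 2.7 + Linux + xlsxwriter) is skipped, so coverage on
    # AppVeyor, macOS and Python 3 is unaffected.
    if (engine == 'xlsxwriter' and sys.version_info[0] == 2
            and platform.system() == 'Linux'):
        pytest.skip('known failure on Python 2.7 + Linux')
```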
thanks @WillAyd keep em coming! and ideally if you can submit PRs for the followups you noted (I think you created issues for all)
```shell
git diff upstream/master -u -- "*.py" | flake8 --diff
```
Once I started looking at it, it made sense to make the module more "pytest compatible". There was a lot of skipping being done imperatively that seemed better suited to using class-level decorators, so as I moved some of that around I also replaced the setup/teardown mechanism of the "Excel Writing" tests with a pytest fixture
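The before/after of that change in miniature — imperative skips in every test body versus a single class-level decorator (names illustrative, helper simplified):

```python
import pytest

def have_module(name):
    # Minimal stand-in for the skip helpers used in this module.
    try:
        __import__(name)
        return True
    except ImportError:
        return False

# Before: each test opens with an imperative skip call.
def test_read_old_style():
    if not have_module('xlrd'):
        pytest.skip('No xlrd')
    assert True

# After: one declarative decorator covers every test in the class.
@pytest.mark.skipif(not have_module('xlrd'), reason='No xlrd')
class TestReaders:
    def test_read(self):
        assert True
```

The declarative form also composes with parametrization, which is what makes the fixture-based setup/teardown in this PR possible.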