Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split test_excel.py (#24472) #24749

Closed
wants to merge 6 commits into from
Closed

Conversation

stevenbw
Copy link

Split tests/io/test_excel.py into two files, base.py and xlrd.py located in tests/io/excel/. The previous file was very large and a logical split has been applied to separate the xlrd tests.

@pep8speaks
Copy link

pep8speaks commented Jan 13, 2019

Hello @stevenbw! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on January 20, 2019 at 11:34 Hours UTC

@codecov
Copy link

codecov bot commented Jan 13, 2019

Codecov Report

Merging #24749 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #24749   +/-   ##
=======================================
  Coverage   92.38%   92.38%           
=======================================
  Files         166      166           
  Lines       52358    52358           
=======================================
  Hits        48373    48373           
  Misses       3985     3985
Flag Coverage Δ
#multiple 90.81% <ø> (ø) ⬆️
#single 43.07% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33f91d8...f6309d9. Read the comment docs.

@codecov
Copy link

codecov bot commented Jan 13, 2019

Codecov Report

Merging #24749 into master will increase coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24749      +/-   ##
==========================================
+ Coverage   92.38%   92.39%   +<.01%     
==========================================
  Files         166      166              
  Lines       52358    52358              
==========================================
+ Hits        48373    48375       +2     
+ Misses       3985     3983       -2
Flag Coverage Δ
#multiple 90.81% <ø> (ø) ⬆️
#single 43.08% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/util/testing.py 88.19% <0%> (+0.18%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33f91d8...05467ea. Read the comment docs.

@simonjayhawkins
Copy link
Member

@stevenbw The files will need to be named test_base.py and test_xlrd.py

@@ -128,96 +77,6 @@ def set_engine(self, request):
yield
setattr(self, func_name, old_func)

@td.skip_if_no("xlrd", "1.0.1") # see gh-22682
def test_usecols_int(self, ext):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'll probably just want to split by complete classes in this PR and leave refactoring the classes to a follow-on PR.



@contextlib.contextmanager
def ignore_xlrd_time_clock_warning():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any code required by both test_base.py and test_xlrd.py should probably be in test_base.py

class XlrdReadingTestsBase(XlrdSharedItems):
# This is based on ExcelWriterBase

@pytest.fixture(autouse=True, params=['xlrd'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code duplicated from test_base


class ReadingTestsBase(SharedItems):
# This is based on ExcelWriterBase

@pytest.fixture(autouse=True, params=['xlrd', None])
@pytest.fixture(autouse=True, params=[None])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a better name for test_base would be test_common or test_all_engines. this split is creating too much code duplication. maybe a split on reader/writer tests first before splitting out xlrd

@WillAyd
Copy link
Member

WillAyd commented Jan 13, 2019

Agreed high level with @simonjayhawkins think there are just too many things moved to test_xlrd. If anything the entire SharedItems class you have can remain in the base file.

Did this yield the same number of tests before and after?

@WillAyd WillAyd added Testing pandas testing functions or related to the test suite IO Excel read_excel, to_excel Clean labels Jan 13, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so before this is actually split; it needs to be fixturized a bit more I think. not really sure the best way of doing this, but if you are just moving files, then that can be in a single PR. We don't want to mix moving things with changing things as its very hard to tell what is going on. So first move w/o any changes, then new PR' to change.

@stevenbw
Copy link
Author

Thanks for the support @simonjayhawkins, @WillAyd, @jreback :). I have taken on board your comments and attempted to simply split the file without separating classes or duplicating code. However, due to classes relying on previous classes, for example TestXlrdReader, which could belong to test_xlrd.py requires ReadingTestsBase, which belongs to test_all_engines.py, there is only a small amount of code which can be moved. Other than TestXlrdReader there is only ignore_xlrd_time_clock_warning which could be in a separate file but it is used several times in ReadingTestsBase, it could be imported but doesn't seem to be of much benefit.

Unless I am missing something, it seems the ideal solution would be to rewrite the classes so they can be decoupled. I appreciate this would be a big change though.

@simonjayhawkins
Copy link
Member

@stevenbw the biggest classes by far are ReadingTestsBase and TestExcelWriter. i think these would need to be in separate files in order to make an impact on reducing the test file size. could the classes be grouped for this to work?

@stevenbw
Copy link
Author

@simonjayhawkins Yes they could. They both have a fundamental dependency on SharedItems but one option could be to import this class, then have one file test_reader.py for ReadingTestsBase and TestXlrdReder. The other file could contain all other classes and be called test_writer.py.

@WillAyd
Copy link
Member

WillAyd commented Feb 12, 2019

Can you merge master here? Also we just refactored the excel code in #25153 so you may be able to mimic that set up on the test side

@stevenbw
Copy link
Author

Yes I will take a look, should be able to mimic what you have done.

@WillAyd
Copy link
Member

WillAyd commented Mar 9, 2019

@stevenbw I think this is a duplicate of #25334 but if that's not the case let me know.

Generally should just keep pushing updates to the same PR rather than opening a new one

@WillAyd WillAyd closed this Mar 9, 2019
@WillAyd WillAyd added the Duplicate Report Duplicate issue or pull request label Mar 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Duplicate Report Duplicate issue or pull request IO Excel read_excel, to_excel Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST: Split test_excel into sub test files
5 participants