-
Notifications
You must be signed in to change notification settings - Fork 309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(bigquery): unit and system test for dataframe with int column with Nan values #39
feat(bigquery): unit and system test for dataframe with int column with Nan values #39
Conversation
12345bb
to
b906717
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good, just have two suggestion on improving the skipIf
decorators.
tests/system.py
Outdated
@@ -736,6 +736,65 @@ def test_load_table_from_dataframe_w_automatic_schema(self): | |||
) | |||
self.assertEqual(table.num_rows, 3) | |||
|
|||
@unittest.skipIf( | |||
pandas is None or pandas.__version__ < "1.0.1", | |||
"Only `pandas version >=1.0.1` supports", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Only `pandas version >=1.0.1` supports", | |
"Only `pandas version >=1.0.1` supported", |
Seems like more correct? (disclaimer: not a native speaker)
tests/system.py
Outdated
@@ -736,6 +736,65 @@ def test_load_table_from_dataframe_w_automatic_schema(self): | |||
) | |||
self.assertEqual(table.num_rows, 3) | |||
|
|||
@unittest.skipIf( | |||
pandas is None or pandas.__version__ < "1.0.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought - I think comparing plain version strings will actually work in this particular case, but cannot be generalized (e.g. somebody copies this pattern over to another test, or if we have to change the version for some reason).
Let's instead add a library like packaging as a test dependency, and use that for parsing and semantically comparing versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, why pandas version 1.0.1+
, as the code sample from the issue description also seems to works in pandas 1.0.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but need suggestion on version comparison like should i use parse(pandas.__version__) < parse("1.0.0")
or should i change the pandas required minimum version in setup.py file and skip that if PY2 or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the issue was due to a bug in Pandas, it's probably best to directly compare the version of Pandas rather than a proxy value such as the Python version.
Re: bumping the version - if the new Pandas version is still compatible with Python 2.7, we can bump it (we have tests...), otherwise we will have to wait until April or so when we officially drop Python 2 support (IIRC).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New version of pandas isn't compatible with Python 2.7, it Requires: Python >=3.6.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. We could still conditionally bump the pandas version for Python 3.5+ in setup.py
, though, to at least guarantee correct behavior in Python 3, even though we cannot fix the bug in Python 2.7 due to pandas incompatibility.
@tswast What do you think?
P.S.: If there is an efficient way of detecting whether a column contains a NaN
value, we could also emit a user warning if too old a pandas version is installed... just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tswast bump
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have setuptools as a required dependency, so why aren't you using pkg_resources
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of places (including Google) are pretty slow to update pandas, so I'd hesitate to require 1.0+ just yet.
tests/system.py
Outdated
"Only `pandas version >=1.0.1` is supported", | ||
) | ||
@unittest.skipIf(pyarrow is None, "Requires `pyarrow`") | ||
def test_load_table_from_dataframe_w_none_value(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nullable Int64
data type is the real reason for the test. There are plenty of existing tests & samples with loading None values object
dtype.
@shollyman Friendly ping. :) @HemangChothani Please just bring the branch up to date in the meanwhile, thanks! |
…into bigquery_issue_22
…into bigquery_issue_22
@shollyman PTAL! |
…into bigquery_issue_22
…into bigquery_issue_22
@shollyman Friendly ping. :) |
tests/system.py
Outdated
@@ -125,6 +126,9 @@ | |||
(TooManyRequests, InternalServerError, ServiceUnavailable) | |||
) | |||
|
|||
PANDAS_MINIUM_VERSION = pkg_resources.parse_version("1.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/MINIUM/MINIMUM/ and all the references
…into bigquery_issue_22
…into bigquery_issue_22
…into bigquery_issue_22
Fixes #22