io/parsers: ensure decimal is str on PythonParser #29743
Thanks for the PR! Looks pretty good; just some minor edits.
The CI failure is unrelated and should be resolved if you merge master.
pandas/io/parsers.py
Outdated
@@ -2344,6 +2344,9 @@ def __init__(self, f, **kwds):
        if len(self.decimal) != 1:
            raise ValueError("Only length-1 decimal markers supported")

        if isinstance(self.decimal, bytes):
Can you decode before assigning to decimal instead?
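A minimal sketch of what decoding before assignment could look like. `normalize_decimal` is a hypothetical helper, not pandas code, and the UTF-8 encoding is an assumption:

```python
def normalize_decimal(decimal):
    """Return the decimal marker as a str, decoding bytes first if needed.

    Hypothetical helper for illustration only; not part of pandas.
    """
    if isinstance(decimal, bytes):
        # Assumption: the marker was encoded as UTF-8.
        decimal = decimal.decode("utf-8")
    if len(decimal) != 1:
        raise ValueError("Only length-1 decimal markers supported")
    return decimal


marker = normalize_decimal(b"#")  # the value an attribute like self.decimal would receive
```

Decoding first means every later check, such as the length-1 validation, only ever sees a str.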
@pytest.mark.parametrize("add_footer", [True, False])
def test_skipfooter_with_decimal(python_parser_only, add_footer):
@pytest.mark.parametrize(
    "add_footer, decimal", [(True, "#"), (False, "#"), (True, b"#"), (False, b"#")],
    "add_footer, decimal", [(True, "#"), (False, "#"), (True, b"#"), (False, b"#")],
@pytest.mark.parametrize("add_footer", [True, False])
@pytest.mark.parametrize("decimal", ["#", b"#"])
Can just stack the decorators instead of manually creating tuples.
Also can you add a whatsnew note for v1.0.0?
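Stacked `parametrize` decorators make pytest run the test once per combination of the two argument lists. The stdlib-only sketch below (variable names are illustrative) checks that this Cartesian product covers the same cases as the handwritten tuples:

```python
import itertools

add_footer_values = [True, False]
decimal_values = ["#", b"#"]

# The manually written cases from the original parametrization.
manual = [(True, "#"), (False, "#"), (True, b"#"), (False, b"#")]

# The combinations that stacking the two decorators would generate.
stacked = list(itertools.product(add_footer_values, decimal_values))

# Order may differ, so compare as sets and check the counts match.
same_cases = set(manual) == set(stacked) and len(manual) == len(stacked)
```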
@WillAyd thanks for the feedback! 🎉 Just pushed the changes you suggested and rebased on master; please take another look.
pandas/io/parsers.py
Outdated
@@ -2276,7 +2276,11 @@ def __init__(self, f, **kwds):

        self.dtype = kwds["dtype"]
        self.thousands = kwds["thousands"]
        self.decimal = kwds["decimal"]

        if isinstance(kwds["decimal"], bytes):
We don't accept bytes as parameters; they must be strings. This is just a mistake.
oh in that case the fix is just the one in line 491? x)
BTW, decimal is almost always bytes:
Line 571 in 6d11aa8
decimal=b".",
I agree it's a mistake though, that's why I opened #29653 (and we internally have a workaround where we patch pandas so we can proceed with our Python 3 migration).
Let me know if I should just go ahead and drop support for bytes.
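For context, a small sketch of why a bytes marker is awkward under Python 3. It does not reproduce the pandas code path; `swap_decimal` is a hypothetical helper used only to show the str/bytes mismatch:

```python
# Python 3 never treats bytes and str as equal, even for the same character.
bytes_marker = b"."
str_marker = "."
markers_equal = bytes_marker == str_marker  # False in Python 3


def swap_decimal(text: str, decimal) -> str:
    """Hypothetical helper: rewrite a custom decimal marker to '.'."""
    return text.replace(decimal, ".")


ok = swap_decimal("1#5", "#")

try:
    swap_decimal("1#5", b"#")  # str.replace rejects a bytes argument
    error = None
except TypeError as exc:
    error = exc
```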
Yea just go ahead and remove this block
pandas/io/parsers.py
Outdated
@@ -568,7 +568,7 @@ def parser_f(
     # Quoting, Compression, and File Format
     compression="infer",
     thousands=None,
-    decimal=b".",
+    decimal=".",
Can you annotate this while modifying? Should just be
decimal: str = "."
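A short sketch of the annotated default, using a hypothetical stand-in function rather than the real `parser_f` signature:

```python
def parser_f_sketch(thousands=None, decimal: str = "."):
    """Hypothetical stand-in for the signature being discussed."""
    return {"thousands": thousands, "decimal": decimal}


# Annotations are not enforced at runtime; they are stored as metadata
# that a static type checker can use to flag a bytes argument.
annotation = parser_f_sketch.__annotations__["decimal"]
default = parser_f_sketch()["decimal"]
```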
thanks @fsouza
🎉
…ndexing-1row-df

* upstream/master: (185 commits)
  ENH: add BooleanArray extension array (pandas-dev#29555)
  DOC: Add link to dev calendar and meeting notes (pandas-dev#29737)
  ENH: Add built-in function for Styler to format the text displayed for missing values (pandas-dev#29118)
  DEPR: remove statsmodels/seaborn compat shims (pandas-dev#29822)
  DEPR: remove Index.summary (pandas-dev#29807)
  DEPR: passing an int to read_excel use_cols (pandas-dev#29795)
  STY: fstrings in io.pytables (pandas-dev#29758)
  BUG: Fix melt with mixed int/str columns (pandas-dev#29792)
  TST: add test for ffill/bfill for non unique multilevel (pandas-dev#29763)
  Changed description of parse_dates in read_excel(). (pandas-dev#29796)
  BUG: pivot_table not returning correct type when margin=True and aggfunc='mean' (pandas-dev#28248)
  REF: Create _lib/window directory (pandas-dev#29817)
  Fixed small mistake (pandas-dev#29815)
  minor cleanups (pandas-dev#29798)
  DEPR: enforce deprecations in core.internals (pandas-dev#29723)
  add test for unused level raises KeyError (pandas-dev#29760)
  Add documentation linking to sqlalchemy (pandas-dev#29373)
  io/parsers: ensure decimal is str on PythonParser (pandas-dev#29743)
  Reenabled no-unused-function (pandas-dev#29767)
  CLN:F-string in pandas/_libs/tslibs/*.pyx (pandas-dev#29775)
  ...

# Conflicts:
#	pandas/tests/frame/indexing/test_indexing.py
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff