FIX-#5097: Stop using deprecated mangle_dup_cols. #5104
Signed-off-by: mvashishtha <mahesh@ponder.io>
Codecov Report
@@            Coverage Diff             @@
##           master    #5104      +/-   ##
==========================================
+ Coverage   84.56%   87.39%   +2.82%
==========================================
  Files         256      257       +1
  Lines       19346    19632     +286
==========================================
+ Hits        16360    17157     +797
+ Misses       2986     2475     -511
@mvashishtha CI is failing, and the failures seem to be related to the change this PR makes. Please have a look.
LGTM! Let's merge once CI is green!
Signed-off-by: mvashishtha <mahesh@ponder.io>
@vnlitvinov I think I fixed the failures, which were for HDK and MODIN_ENGINE=PYTHON. Let's see whether my fixes work in CI.
@@ -149,6 +149,9 @@ def read_csv(
         val.name for val in inspect.signature(pandas.read_csv).parameters.values()
     }
     _, _, _, f_locals = inspect.getargvalues(inspect.currentframe())
+    # mangle_dupe_cols has no effect starting in pandas 1.5. Exclude it from
+    # kwargs so pandas doesn't spuriously warn people not to use it.
+    f_locals.pop("mangle_dupe_cols", None)
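For readers skimming the thread, here is a minimal standalone sketch of the pattern in this hunk: collect the wrapper's own arguments from the current frame, drop the deprecated keyword, and forward the rest. `backend_read` and `read_table` are hypothetical stand-ins, not Modin or pandas functions, and copying the arguments into a fresh dict before popping is a choice made for this sketch rather than what the diff does.

```python
import inspect


def backend_read(**kwargs):
    # Hypothetical backend: returns the kwargs it received so they can be inspected.
    return kwargs


def read_table(filepath, sep=",", header="infer", mangle_dupe_cols=True):
    # Gather this function's argument names and their current values.
    arg_names, _, _, frame_locals = inspect.getargvalues(inspect.currentframe())
    # Copy only the declared arguments into a plain dict (sketch-specific choice).
    forwarded = {name: frame_locals[name] for name in arg_names}
    # Drop the deprecated keyword so the backend never sees it.
    forwarded.pop("mangle_dupe_cols", None)
    return backend_read(**forwarded)


result = read_table("data.csv", sep=";")
print(result)
```

With this shape, callers can still pass `mangle_dupe_cols` to the wrapper, but the value is never handed to the backend, so the backend has no chance to warn about it.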
There are a couple of other values that are also deprecated for read_csv. See: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html . Perhaps we can address them in another issue.
I was surprised to find we weren't getting warnings for prefix and squeeze even though both were deprecated. It turns out we pop squeeze here:
squeeze = kwargs.pop("squeeze", False)
and passing the default of prefix doesn't trigger the warning. Not sure why.
import pandas as pd
from io import StringIO
from pandas.api.extensions import no_default
# no warning
print(pd.read_csv(StringIO("a,b,c\n1,2,3"), header=None, prefix=no_default))
# warns
print(pd.read_csv(StringIO("a,b,c\n1,2,3"), header=None, prefix='z'))
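One plausible explanation, offered only as an assumption since the thread does not confirm how pandas implements the check, is the common sentinel-default pattern: the function warns only when the argument differs from a private "no default" object. The sketch below uses a hypothetical `read_rows` function and a hypothetical `_NO_DEFAULT` sentinel, not pandas code, to show why explicitly passing the sentinel stays silent.

```python
import warnings

# Hypothetical sentinel playing the role of a "no default" marker.
_NO_DEFAULT = object()


def read_rows(text, prefix=_NO_DEFAULT):
    # Warn only when the caller supplied something other than the sentinel.
    if prefix is not _NO_DEFAULT:
        warnings.warn(
            "the 'prefix' keyword is deprecated", FutureWarning, stacklevel=2
        )
    return text.splitlines()


def warned(**kwargs):
    # Record every warning raised by the call and report whether any
    # FutureWarning was among them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        read_rows("a,b,c\n1,2,3", **kwargs)
    return any(issubclass(w.category, FutureWarning) for w in caught)


no_warning = warned(prefix=_NO_DEFAULT)
with_warning = warned(prefix="z")
print(no_warning, with_warning)
```

Under this pattern, forwarding the default value unchanged is indistinguishable from not passing the argument at all.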
Signed-off-by: mvashishtha <mahesh@ponder.io>
LGTM!
modin/pandas/test/test_io.py (outdated)
pytest.warns(
    FutureWarning, match="'mangle_dupe_cols' keyword is deprecated"
)
if version.parse(pandas.__version__) >= version.parse("1.5.0")
Is there a reason not to use the PandasCompatVersion class?
Nope. I've reverted back to PandasCompatVersion for consistency with the other tests.
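The version-gated expectation discussed in this review thread can be sketched with the standard library alone. This is an illustration, not the test in the PR: `version_tuple`, `expect_future_warning`, `warning_context`, and `legacy_call` are hypothetical helpers, the version strings are assumed to be plain dotted integers, and the real test uses `pytest.warns` together with Modin's PandasCompatVersion.

```python
import contextlib
import warnings


def version_tuple(version):
    # Assumes a plain "major.minor.patch" string with integer parts.
    return tuple(int(part) for part in version.split(".")[:3])


@contextlib.contextmanager
def expect_future_warning(match):
    # Record warnings raised in the block, then require a matching FutureWarning.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        yield
    assert any(
        issubclass(w.category, FutureWarning) and match in str(w.message)
        for w in caught
    ), f"no FutureWarning matching {match!r}"


def warning_context(installed_version, match):
    # Expect the warning only on versions that emit it; otherwise do nothing.
    if version_tuple(installed_version) >= (1, 5, 0):
        return expect_future_warning(match)
    return contextlib.nullcontext()


def legacy_call():
    # Hypothetical call that emits the deprecation warning.
    warnings.warn("'mangle_dupe_cols' keyword is deprecated", FutureWarning)


with warning_context("1.5.3", "keyword is deprecated"):
    legacy_call()

with warning_context("1.1.5", "keyword is deprecated"):
    pass
```

Returning a no-op context manager for older versions keeps the test body identical on both branches, which is the main appeal of gating the expectation rather than the call.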
Signed-off-by: mvashishtha <mahesh@ponder.io>
LGTM, but a minor question: should Modin handle processing so that the mangle_dupe_cols argument works the same in compat mode? That way, folks who expect pandas 1.1 behavior in compat mode will get the experience they expect.
@RehanSD
Ah, my bad. I forgot that users will have the old version of pandas installed in compat mode!
LGTM!
…-project#5104) Signed-off-by: mvashishtha <mahesh@ponder.io>
Signed-off-by: mvashishtha <mahesh@ponder.io>
What do these changes do?
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst is up-to-date