-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX-#4022: Fixed empty data frame with index #4910
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4910 +/- ##
===========================================
- Coverage 84.34% 69.26% -15.08%
===========================================
Files 267 267
Lines 19749 19746 -3
===========================================
- Hits 16657 13678 -2979
- Misses 3092 6068 +2976
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
1a05283
to
3f41a42
Compare
3f41a42
to
5ea0cbb
Compare
modin/pandas/indexing.py
Outdated
@@ -653,7 +653,7 @@ def __getitem__(self, key): | |||
-------- | |||
pandas.DataFrame.loc | |||
""" | |||
if self.df.empty: | |||
if len(self.df.index) == 0 and self.df.empty: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the frame has only index, the following lambda fails with KeyError.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the problem specific to OmniSci backend or to all backends? Probably, the problem is in backend implementation of empty
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I could find only one implementation of this method - https://github.com/modin-project/modin/blob/master/modin/pandas/dataframe.py#L325
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why making a condition for default to pandas more strict might resolve any issue. We should be able to always be able to default to pandas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Actually, the problem is not in default_to_pandas() but in loc[], which is called from lambda. Pandas returns an empty Series for df.loc[1], but modin with omnisci backend fails with KeyError: 'Column F_1 does not exist in schema'. This issue should be fixed instead.
97a1e31
to
fe3fde3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
fe3fde3
to
6e110f7
Compare
Put index into columns if there are no columns but only index Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
6e110f7
to
c1206c0
Compare
What do these changes do?
Put index into columns if there are no columns but only index.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
dataframe.loc()
doesn`t work correctly at MODIN_STORAGE_FORMAT=omnisci #4022docs/development/architecture.rst
is up-to-date