FEAT-#0000: Implement DataFrame API standard #6928
72b3b5b
to
5ab47be
Compare
    kwargs: dict[str, Any],
) -> None:
    df = integer_dataframe_1(library)
    ns = df.__dataframe_namespace__()
Check warning
Code scanning / CodeQL
Variable defined multiple times Warning test
redefined
pytest.skip("TODO: enable for modin")
df1 = integer_dataframe_1(library)
df2 = integer_dataframe_2(library)
ns = df1.__dataframe_namespace__()
Check warning
Code scanning / CodeQL
Variable defined multiple times Warning test
redefined
def test_column_to_array_object(library: BaseHandler) -> None:
    col = integer_dataframe_1(library).col("a")
    result = np.asarray(col.persist().to_array())
Check warning
Code scanning / CodeQL
Variable defined multiple times Warning test
redefined
df1 = df.filter(df.col("a") > 1)
df2 = df.filter(df.col("a") <= 1)

df1 = df1.persist()
Check notice
Code scanning / CodeQL
Unused local variable Note test
import numpy as np
from pandas.api.types import is_extension_array_dtype

import modin.dataframe_api_standard
Check notice
Code scanning / CodeQL
Module is imported with 'import' and 'import from' Note
    return df.shape  # type: ignore[no-any-return]

def group_by(self, *keys: str) -> GroupBy:
    from modin.dataframe_api_standard.group_by_object import GroupBy
Check notice
Code scanning / CodeQL
Cyclic import Note
modin.dataframe_api_standard.group_by_object
import modin.pandas as pd
from modin.dataframe_api_standard import Namespace
from modin.dataframe_api_standard.dataframe_object import DataFrame
Check notice
Code scanning / CodeQL
Cyclic import Note
modin.dataframe_api_standard.dataframe_object
failed_columns = self._df.columns.difference(result.columns)
if len(failed_columns) > 0:  # pragma: no cover
    msg = "Groupby operation could not be performed on columns "
    +f"{failed_columns}. Please drop them before calling group_by."
Check notice
Code scanning / CodeQL
Statement has no effect Note
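A possible way to address the "Statement has no effect" note above: the line starting with `+f"..."` is a separate expression statement, so the second half of the message never reaches `msg` (and, as far as I can tell, a unary `+` on a `str` would raise `TypeError` if that branch ever ran). A minimal sketch of one fix, wrapping both halves in parentheses so the adjacent literals are concatenated; the helper name `build_group_by_error` is hypothetical and only here to make the snippet self-contained:

```python
from __future__ import annotations


def build_group_by_error(failed_columns: list[str]) -> str:
    """Hypothetical helper: build the whole message as a single expression."""
    # Inside parentheses, adjacent string literals (including f-strings)
    # are concatenated implicitly, so both halves end up in `msg`.
    msg = (
        "Groupby operation could not be performed on columns "
        f"{failed_columns}. Please drop them before calling group_by."
    )
    return msg


print(build_group_by_error(["b", "c"]))
```

Ending the first line with a trailing `+` inside the parentheses would work as well; the parenthesized form just keeps the message in one statement.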
Signed-off-by: Anatoly Myachev <anatoly.myachev@intel.com>
5ab47be
to
0a7a637
Compare
Why are we merging this into our repo instead of https://github.com/data-apis/dataframe-api-compat? Has it already been decided that the dataframe-api repo will be abandoned and won't accept contributions?
In particular, I'm asking about this PR (data-apis/dataframe-api-compat#71); I've seen some fresh comments from you there. Are you expecting it to be merged? If so, do we need this PR in Modin?
I consider this change temporary; it is here so that it can make it in time for the 0.27 release.
I don't think so.
I’m still sure that it can be merged, but it definitely won’t make it in time for tomorrow’s release.
Why do we want this to be in the 0.27 release? cc @YarShev
There is an opportunity to provide these changes earlier.
It is also worth noting that:
Sure, I thought the same. I'm just worried about how we're going to keep Modin's repo in sync with the upstream PR (data-apis/dataframe-api-compat#71). The diffs are massive, so applying review suggestions to that PR while possibly fixing bugs in Modin's implementation at the same time could leave the two implementations badly out of sync. Wouldn't we rather spend our efforts on data-apis/dataframe-api-compat#71 (reviewing it as a team and working on the suggestions together) and on future support of Modin's implementation there?
Closed in favour of #7196
What do these changes do?
This is an example of what a DataFrame API protocol implementation might look like if it were added directly to Modin instead of to https://github.com/data-apis/dataframe-api-compat.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst is up-to-date