FEAT-#5423: Add a NumPy API to Modin #5422

Merged
merged 43 commits into from
Feb 9, 2023
Changes from 34 commits
6f2a6d7
FEAT-#5423: Begin implementing NumPy API Layer
RehanSD Dec 12, 2022
7a4fa99
Start
devin-petersohn Nov 12, 2022
2a08cf0
Next
devin-petersohn Nov 12, 2022
4b68f50
Added absolute, abs, add, all, subtract to modin.numpy
billiam-wang Nov 12, 2022
0b915b4
Add changes
devin-petersohn Nov 22, 2022
9c7a66b
Add shape + reshape
RehanSD Nov 12, 2022
1c6d708
Added additional math functions for numpy
billiam-wang Nov 12, 2022
30171d2
Add list constructor
RehanSD Nov 15, 2022
ab0ecdb
lint
RehanSD Dec 12, 2022
25510bc
Add dimension handling
RehanSD Jan 12, 2023
4ef400f
Merge remote-tracking branch 'upstream/master' into numpy/init
RehanSD Jan 12, 2023
43e3bb5
Fix partial broadcasting issues
RehanSD Jan 12, 2023
4301b9d
Add testing
RehanSD Jan 12, 2023
5ceca02
Add tests to CI
RehanSD Jan 12, 2023
ff9045c
Add __array_ufunc__, __array_function__, and clean up implementation …
RehanSD Feb 2, 2023
4b174de
Add where
RehanSD Feb 3, 2023
bd2fe98
Fix df conversion retaining index issue
RehanSD Feb 3, 2023
2a87a39
Add max and min and other numpy methods to namespace
RehanSD Feb 4, 2023
d513b03
Fix dtype handling
RehanSD Feb 5, 2023
d8d0d10
Fix keepdims
RehanSD Feb 5, 2023
7404cb3
Fix out and add
RehanSD Feb 5, 2023
0d3be93
Add support for where kwarg
RehanSD Feb 5, 2023
508ecb3
Fix lint
RehanSD Feb 5, 2023
90aaed7
Get tests to run
RehanSD Feb 5, 2023
88aa6b5
Add testing for array ufunc
RehanSD Feb 5, 2023
e0fb8ce
Add testing for array function
RehanSD Feb 5, 2023
db3db83
Add testing for where
RehanSD Feb 5, 2023
f176ac8
Add tests for everything but prod, mean, min, max, and sum
RehanSD Feb 5, 2023
c0a1ecc
Add tests
RehanSD Feb 6, 2023
e706796
Bypass overflow dtype issues
RehanSD Feb 6, 2023
23fe0c4
Cast to output dtype
RehanSD Feb 6, 2023
22b01e0
Fix lint
RehanSD Feb 6, 2023
52f0928
Add defensive dimension check
RehanSD Feb 6, 2023
cfaa066
Fix auto-cast issue
RehanSD Feb 6, 2023
f9be32d
Fix CI bug
RehanSD Feb 6, 2023
48967d8
Address review comments
RehanSD Feb 7, 2023
5b1da61
Fix type computation and add check for where
RehanSD Feb 7, 2023
74ed3a2
Fix auto broadcast of out variable
RehanSD Feb 7, 2023
3244b79
Address review comments (break up testing into multiple files, and fi…
RehanSD Feb 7, 2023
a3d57fe
Fix naming
RehanSD Feb 7, 2023
18588b8
Add to_numpy
RehanSD Feb 8, 2023
19acf64
Add warning about numpy api and fix lint
RehanSD Feb 8, 2023
1bf6c00
Fix lint
RehanSD Feb 8, 2023
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -623,6 +623,7 @@ jobs:
- run: mpiexec -n 1 python -m pytest modin/pandas/test/test_groupby.py
- run: mpiexec -n 1 python -m pytest modin/pandas/test/test_reshape.py
- run: mpiexec -n 1 python -m pytest modin/pandas/test/test_general.py
- run: mpiexec -n 1 python -m pytest modin/numpy/test/test_array.py
- run: chmod +x ./.github/workflows/sql_server/set_up_sql_server.sh
- run: ./.github/workflows/sql_server/set_up_sql_server.sh
- run: mpiexec -n 1 python -m pytest modin/pandas/test/test_io.py --verbose
@@ -710,6 +711,7 @@ jobs:
- run: python -m pytest -n 2 modin/pandas/test/test_series.py
- run: python -m pytest -n 2 modin/pandas/test/test_rolling.py
- run: python -m pytest -n 2 modin/pandas/test/test_concat.py
- run: python -m pytest -n 2 modin/numpy/test/test_array.py
if: matrix.engine == 'python'
- run: python -m pytest modin/pandas/test/test_concat.py # Ray and Dask versions fails with -n 2
if: matrix.engine != 'python'
@@ -842,6 +844,7 @@ jobs:
- modin/pandas/test/test_reshape.py
- modin/pandas/test/test_general.py
- modin/pandas/test/test_io.py
- modin/numpy/test/test_array.py
env:
MODIN_ENGINE: ${{matrix.engine}}
name: test-windows
2 changes: 2 additions & 0 deletions .github/workflows/push-to-master.yml
@@ -55,6 +55,7 @@ jobs:
python -m pytest modin/pandas/test/dataframe/test_udf.py
python -m pytest modin/pandas/test/dataframe/test_window.py
python -m pytest modin/pandas/test/test_series.py
python -m pytest modin/numpy/test/test_array.py
python -m pytest modin/pandas/test/test_rolling.py
python -m pytest modin/pandas/test/test_concat.py
python -m pytest modin/pandas/test/test_groupby.py
@@ -121,6 +122,7 @@ jobs:
- modin/pandas/test/dataframe/test_window.py
- modin/pandas/test/dataframe/test_pickle.py
- modin/pandas/test/test_series.py
- modin/numpy/test/test_array.py
- modin/pandas/test/test_rolling.py
- modin/pandas/test/test_concat.py
- modin/pandas/test/test_groupby.py
2 changes: 2 additions & 0 deletions .github/workflows/push.yml
@@ -297,6 +297,7 @@ jobs:
- run: python -m pytest -n 2 modin/pandas/test/dataframe/test_window.py
- run: python -m pytest -n 2 modin/pandas/test/dataframe/test_pickle.py
- run: python -m pytest -n 2 modin/pandas/test/test_series.py
- run: python -m pytest -n 2 modin/numpy/test/test_array.py
- run: python -m pytest -n 2 modin/pandas/test/test_rolling.py
- run: python -m pytest -n 2 modin/pandas/test/test_concat.py
if: matrix.engine == 'python'
@@ -334,6 +335,7 @@ jobs:
- modin/pandas/test/dataframe/test_window.py
- modin/pandas/test/dataframe/test_pickle.py
- modin/pandas/test/test_series.py
- modin/numpy/test/test_array.py
- modin/pandas/test/test_rolling.py
- modin/pandas/test/test_concat.py
- modin/pandas/test/test_groupby.py
7 changes: 7 additions & 0 deletions modin/config/envvars.py
@@ -628,6 +628,13 @@ class TestReadFromPostgres(EnvironmentVariable, type=bool):
    default = False


class ExperimentalNumPyAPI(EnvironmentVariable, type=bool):
    """Set to true to use Modin's experimental NumPy API."""

    varname = "MODIN_EXPERIMENTAL_NUMPY_API"
    default = False
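
The new config class maps a boolean setting onto the `MODIN_EXPERIMENTAL_NUMPY_API` environment variable, defaulting to `False`. As a rough illustration of what reading such a flag can involve, here is a standard-library-only sketch. The helper `read_bool_env` and the set of accepted spellings are assumptions made for this example and are not Modin's actual parsing logic.

```python
import os

VARNAME = "MODIN_EXPERIMENTAL_NUMPY_API"  # variable name taken from the diff above


def read_bool_env(name, default=False, environ=None):
    """Hypothetical helper: interpret an environment variable as a boolean."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        # Unset variable: fall back to the default, as the config class does.
        return default
    value = raw.strip().lower()
    # The accepted spellings below are an assumption for this sketch.
    if value in {"true", "yes", "1"}:
        return True
    if value in {"false", "no", "0"}:
        return False
    raise ValueError(f"Unsupported boolean value for {name}: {raw!r}")


# Passing an explicit mapping keeps the example independent of the real environment.
print(read_bool_env(VARNAME, environ={}))
print(read_bool_env(VARNAME, environ={VARNAME: "True"}))
```

In Modin itself the flag would presumably be set either by exporting the variable before starting Python or through the config object (for example `ExperimentalNumPyAPI.put(True)`, following the pattern other `modin.config` parameters use).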


class ReadSqlEngine(EnvironmentVariable, type=str):
    """Engine to run `read_sql`."""

126 changes: 126 additions & 0 deletions modin/numpy/__init__.py
@@ -0,0 +1,126 @@
# Licensed to Modin Development Team under one or more contributor license agreements.
# See the NOTICE file distributed with this work for additional information regarding
# copyright ownership. The Modin Development Team licenses this file to you under the
# Apache License, Version 2.0 (the "License"); you may not use this file except in
# compliance with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software distributed under
# the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.

from .arr import array

from .array_creation import (
    zeros_like,
    ones_like,
)

from .array_shaping import (
    ravel,
    shape,
    transpose,
)

from .math import (
    absolute,
    abs,
    add,
    divide,
    float_power,
    floor_divide,
    power,
    prod,
    multiply,
    remainder,
    mod,
    subtract,
    sum,
    true_divide,
    mean,
    maximum,
    amax,
    max,
    minimum,
    amin,
    min,
)

from .constants import (
    Inf,
    Infinity,
    NAN,
    NINF,
    NZERO,
    NaN,
    PINF,
    PZERO,
    e,
    euler_gamma,
    inf,
    infty,
    nan,
    newaxis,
    pi,
)


def where(condition, x=None, y=None):
    if condition is True:
        return x
    if condition is False:
        return y
    if hasattr(condition, "where"):
        return condition.where(x=x, y=y)
    raise NotImplementedError(
        f"np.where for condition of type {type(condition)} is not yet supported in Modin."
    )
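
The dispatch in `where` separates scalar boolean conditions from array-like ones. The sketch below illustrates that pattern with identity checks (`is True` / `is False`), so that an array-like condition is never coerced to a single boolean and instead falls through to its own `where` method. `ToyArray` is a hypothetical stand-in written for this example and is not Modin's array class.

```python
class ToyArray:
    """Hypothetical stand-in for an array type that exposes a ``where`` method."""

    def __init__(self, values):
        self.values = list(values)

    def where(self, x=None, y=None):
        # Take the element from x where the mask entry is truthy, else from y.
        return [a if flag else b for flag, a, b in zip(self.values, x, y)]


def where(condition, x=None, y=None):
    # Identity checks match only the literal booleans, so array-like
    # conditions are not evaluated for truthiness here.
    if condition is True:
        return x
    if condition is False:
        return y
    if hasattr(condition, "where"):
        return condition.where(x=x, y=y)
    raise NotImplementedError(
        f"where for condition of type {type(condition)} is not supported in this sketch."
    )


mask = ToyArray([True, False, True])
print(where(mask, [1, 2, 3], [10, 20, 30]))  # → [1, 20, 3]
```

A plain truthiness test (`if condition:`) would treat the `ToyArray` instance as truthy and return `x` unchanged, which is why the sketch compares against the boolean singletons.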


__all__ = [  # noqa: F405
    "array",
    "zeros_like",
    "ones_like",
    "ravel",
    "shape",
    "transpose",
    "absolute",
    "abs",
    "add",
    "divide",
    "float_power",
    "floor_divide",
    "power",
    "prod",
    "multiply",
    "remainder",
    "mod",
    "subtract",
    "sum",
    "true_divide",
    "mean",
    "maximum",
    "amax",
    "max",
    "minimum",
    "amin",
    "min",
    "where",
    "Inf",
    "Infinity",
    "NAN",
    "NINF",
    "NZERO",
    "NaN",
    "PINF",
    "PZERO",
    "e",
    "euler_gamma",
    "inf",
    "infty",
    "nan",
    "newaxis",
    "pi",
]
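
Because `__all__` is maintained by hand alongside the import blocks, the two can drift apart as functions are added. One way to guard against that, sketched here with a hypothetical helper `missing_exports` and a toy module rather than the real `modin.numpy` package, is to check that every exported name is actually defined.

```python
import types


def missing_exports(module):
    """Return names listed in ``module.__all__`` that the module does not define."""
    return [
        name for name in getattr(module, "__all__", []) if not hasattr(module, name)
    ]


# Hypothetical toy module standing in for a package namespace.
toy = types.ModuleType("toy_numpy")
toy.add = lambda a, b: a + b
toy.pi = 3.141592653589793
toy.__all__ = ["add", "pi", "where"]  # "where" is deliberately left undefined

print(missing_exports(toy))  # → ['where']
```

The same function could be pointed at an imported package in a unit test, asserting that the returned list is empty.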