Merge pull request #1381 from mrocklin/get-numeric-data
Implement DataFrame._get_numeric_data
kkraus14 authored Apr 9, 2019
2 parents bff6849 + 3d473a3 commit b63e971
Showing 3 changed files with 20 additions and 1 deletion.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -6,6 +6,7 @@
- PR #1292 Implemented Bitwise binary ops AND, OR, XOR (&, |, ^)
- PR #1235 Add GPU-accelerated Parquet Reader
- PR #1335 Added local_dict arg in `DataFrame.query()`.
- PR #1381 Add DataFrame._get_numeric_data

## Improvements

@@ -29,7 +30,7 @@
- PR #1254 CSV Reader: fix data type detection for floating-point numbers in scientific notation
- PR #1289 Fix looping over each value instead of each category in concatenation
- PR #1293 Fix Inaccurate error message in join.pyx
- PR #1308 Add atomicCAS overload for `int8_t`, `int16_t`
- PR #1308 Add atomicCAS overload for `int8_t`, `int16_t`
- PR #1317 Fix catch polymorphic exception by reference in ipc.cu
- PR #1325 Fix dtype of null bitmasks to int8
- PR #1326 Update build documentation to use -DCMAKE_CXX11_ABI=ON
7 changes: 7 additions & 0 deletions python/cudf/dataframe/dataframe.py
@@ -305,6 +305,13 @@ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
    def empty(self):
        return not len(self)

    def _get_numeric_data(self):
        """ Return a dataframe with only numeric data types """
        columns = [c for c, dt in self.dtypes.items()
                   if dt != object and
                   not pd.api.types.is_categorical_dtype(dt)]
        return self[columns]

    def assign(self, **kwargs):
        """
        Assign columns to DataFrame from keyword arguments.
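
The column filter in `_get_numeric_data` can be tried without a GPU. The following is a minimal sketch, not the cudf implementation: it applies the same dtype test to a plain pandas DataFrame. The helper name `numeric_column_names` and the extra categorical column `c` are hypothetical, and `isinstance(dt, pd.CategoricalDtype)` is swapped in for `pd.api.types.is_categorical_dtype`, which recent pandas releases deprecate. The string column is given an explicit `object` dtype so the result does not depend on the pandas version's default string dtype.

```python
import pandas as pd


def numeric_column_names(df):
    """Return names of columns whose dtype is neither object nor categorical.

    Hypothetical helper mirroring the filter in the diff above.
    """
    return [
        name
        for name, dt in df.dtypes.items()
        if dt != object and not isinstance(dt, pd.CategoricalDtype)
    ]


pdf = pd.DataFrame({
    "x": [1, 2, 3],                                  # int64: kept
    "y": [1.0, 2.0, 3.0],                            # float64: kept
    "z": pd.Series(["a", "b", "c"], dtype=object),   # object: dropped
    "c": pd.Categorical(["a", "b", "a"]),            # categorical: dropped
})

numeric_cols = numeric_column_names(pdf)
print(numeric_cols)  # ['x', 'y']
```

Selecting `pdf[numeric_cols]` then yields a frame with only `x` and `y`, which is the role `self[columns]` plays in the method.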
11 changes: 11 additions & 0 deletions python/cudf/tests/test_dataframe.py
@@ -2050,3 +2050,14 @@ def test_series_to_gpu_array(nan_value):
    s = Series([0, 1, None, 3])
    np.testing.assert_array_equal(s.to_array(nan_value),
                                  s.to_gpu_array(nan_value).copy_to_host())


def test_get_numeric_data():
    pdf = pd.DataFrame({
        'x': [1, 2, 3],
        'y': [1., 2., 3.],
        'z': ['a', 'b', 'c']
    })
    gdf = gd.from_pandas(pdf)

    assert_eq(pdf._get_numeric_data(), gdf._get_numeric_data())
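
The test covers integer, float, and string columns only. As a hedged illustration of what it leaves untested, the sketch below applies the same dtype filter to a pandas frame with a datetime column (the column `t` is an assumption, not part of the test above) and compares the result with pandas' own `_get_numeric_data`. It runs on pandas alone and says nothing about how cudf itself handles such columns.

```python
import pandas as pd

pdf = pd.DataFrame({
    "x": [1, 2, 3],
    "t": pd.to_datetime(["2019-04-01", "2019-04-02", "2019-04-03"]),
})

# The dtype filter from the diff, applied to plain pandas dtypes.
# isinstance(..., pd.CategoricalDtype) stands in for is_categorical_dtype.
kept_by_filter = [
    name
    for name, dt in pdf.dtypes.items()
    if dt != object and not isinstance(dt, pd.CategoricalDtype)
]

# pandas' private method treats datetime64 columns as non-numeric.
kept_by_pandas = list(pdf._get_numeric_data().columns)

print(kept_by_filter)  # ['x', 't']
print(kept_by_pandas)  # ['x']
```

A filter that excludes only object and categorical dtypes keeps the datetime column, while pandas drops it, so a test case with a datetime column would be one way to pin down the intended behavior.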
