Add gpu marker and test both Classic/dask-expr Dask DataFrames #1341

Merged
merged 52 commits into from
May 30, 2024
Changes from 50 commits
731d5f8
Parallelize the test suite
hoxbro May 22, 2024
39ef510
Updates
hoxbro May 22, 2024
f9357e2
Disable benchmark warnings for tests and add par. to nojit test
hoxbro May 22, 2024
38abbb2
Skip failing test (for now)
hoxbro May 22, 2024
f2cd9a0
Also do it for examples
hoxbro May 22, 2024
b6877cf
Skip tiling example
hoxbro May 22, 2024
3c48b59
Fix failing test
hoxbro May 22, 2024
79b5883
Add benchmark skip to examples
hoxbro May 22, 2024
dbbd6ed
Remove pip tests
hoxbro May 23, 2024
e4033b5
Remove pyctdev lint
hoxbro May 23, 2024
e276239
Update codecov to use the action
hoxbro May 23, 2024
0ebca85
Update report type for codecov
hoxbro May 23, 2024
9b801d9
Merge branch 'main' into parallel
hoxbro May 23, 2024
e67db59
Merge branch 'parallel' into update_test_run
hoxbro May 23, 2024
a706938
Update numpy 2 deps
hoxbro May 23, 2024
384feb8
Merge branch 'parallel' into update_test_run
hoxbro May 23, 2024
6206825
Update numpy2 deps
hoxbro May 23, 2024
faf0f0a
POC of how to add GPU marker and dask-expr
hoxbro May 23, 2024
11c0088
Remove other fixture in file
hoxbro May 23, 2024
5ec0b8e
Remove DataFrame
hoxbro May 23, 2024
bc583b8
Handle dask_cudf not working cases
hoxbro May 23, 2024
ffc09f2
Fix markers -> marks
hoxbro May 23, 2024
1c3b012
Fix
hoxbro May 23, 2024
77a4aea
Add dask_expr to pipeline
hoxbro May 23, 2024
d022102
Update test_dask
hoxbro May 23, 2024
3854f60
Use classic_dd for geo
hoxbro May 23, 2024
ae64c6f
Use sys.modules
hoxbro May 23, 2024
dca091a
Update
hoxbro May 23, 2024
cbcf9b3
Update with request.node.name and skip if dask-expr is not available
hoxbro May 23, 2024
dc0be15
Updates
hoxbro May 23, 2024
9bff8b6
Update name to not overwrite cache
hoxbro May 23, 2024
ce7ad2f
Use _dd_switcher
hoxbro May 23, 2024
cbd8e5a
Clean up
hoxbro May 24, 2024
7061acc
Update test_canvas.py
hoxbro May 24, 2024
6558b91
Update _classic_dd
hoxbro May 24, 2024
77fc651
Update conftest.py
hoxbro May 24, 2024
eae2853
Update datashader/tests/conftest.py
hoxbro May 24, 2024
2d4b94e
Need all custom mark to collect
hoxbro May 24, 2024
9197073
Reduce partition to 1, 2, and 4
hoxbro May 24, 2024
dcffff3
Update pandas to have GPU marker
hoxbro May 24, 2024
743d0c0
First pass at transfer_functions
hoxbro May 24, 2024
a4a5f39
Update the rest of the code
hoxbro May 24, 2024
00d7bda
Update test_xarray
hoxbro May 24, 2024
1f5234e
Clean up
hoxbro May 24, 2024
c427e57
Try to not be sensitive to default value of dask.expr for tests
hoxbro May 25, 2024
85454e9
Mark spatialpandas test as classic only
hoxbro May 25, 2024
eb826a5
Check dask_expr behavior for dask_cudf
hoxbro May 26, 2024
66d304d
Merge branch 'main' into update_test_run
hoxbro May 27, 2024
70f2f6f
Merge branch 'update_test_run' into gpu_marker
hoxbro May 27, 2024
ce7e3b0
Clean up
hoxbro May 27, 2024
35d17cd
Merge branch 'main' into gpu_marker
hoxbro May 30, 2024
751095d
Satisfy pre-commit
hoxbro May 30, 2024
5 changes: 0 additions & 5 deletions .coveragerc

This file was deleted.

14 changes: 0 additions & 14 deletions .github/codecov.yml

This file was deleted.

62 changes: 4 additions & 58 deletions .github/workflows/test.yaml
@@ -34,7 +34,6 @@ env:
   VECLIB_MAXIMUM_THREADS: 1
   NUMEXPR_NUM_THREADS: 1
   PYDEVD_DISABLE_FILE_VALIDATION: 1
-  DASK_DATAFRAME__QUERY_PLANNING: false

 jobs:
   pre_commit:
@@ -159,11 +158,6 @@ jobs:
           conda activate test-environment
           python -c "import numba; print('Numba', numba.__version__)"
           python -c "import numpy; print('Numpy', numpy.__version__)"
-      - name: doit test_lint
-        if: runner.os == 'Linux'
-        run: |
-          conda activate test-environment
-          doit test_lint
       - name: doit test_unit
         run: |
           conda activate test-environment
@@ -175,59 +169,11 @@ jobs:
         env:
           NUMBA_DISABLE_JIT: 1
       - name: doit test_examples
+        env:
+          DASK_DATAFRAME__QUERY_PLANNING: false
         run: |
           conda activate test-environment
           doit test_examples
-      - name: codecov
-        run: |
-          conda activate test-environment
-          codecov
-
-  test_pip:
-    name: Pip tests on ${{ matrix.os }} with Python ${{ matrix.python-version }}
-    needs: [pre_commit]
-    runs-on: ${{ matrix.os }}
-    strategy:
-      fail-fast: false
-      matrix:
-        os: ['ubuntu-latest', 'macos-latest']
-        python-version: ["3.12"]
-    steps:
-      - name: Checkout source
-        uses: actions/checkout@v3
+      - uses: codecov/codecov-action@v4
         with:
-          fetch-depth: 0
-      - name: Install Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
-        with:
-          python-version: ${{ matrix.python-version }}
-      - name: Update setuptools
-        run: |
-          pip install --upgrade setuptools
-      - name: Install pyctdev
-        run: |
-          pip install pyctdev
-      - name: doit develop_install
-        run: |
-          doit ecosystem=pip develop_install -o tests -o examples
-      - name: doit env_capture
-        run: |
-          doit ecosystem=pip env_capture
-      - name: doit test_lint
-        if: runner.os == 'Linux'
-        run: |
-          doit ecosystem=pip test_lint
-      - name: doit test_unit
-        run: |
-          doit ecosystem=pip test_unit
-      - name: doit test_unit_nojit
-        run: |
-          doit ecosystem=pip test_unit_nojit
-        env:
-          NUMBA_DISABLE_JIT: 1
-      - name: doit test_examples
-        run: |
-          doit ecosystem=pip test_examples
-      - name: codecov
-        run: |
-          codecov
+          token: ${{ secrets.CODECOV_TOKEN }}
9 changes: 7 additions & 2 deletions datashader/data_libraries/cudf.py
@@ -1,9 +1,14 @@
 from __future__ import annotations
+from contextlib import suppress
 from datashader.data_libraries.pandas import default
 from datashader.core import bypixel
-import cudf


-@bypixel.pipeline.register(cudf.DataFrame)
 def cudf_pipeline(df, schema, canvas, glyph, summary, *, antialias=False):
     return default(glyph, df, schema, canvas, summary, antialias=antialias, cuda=True)
+
+
+with suppress(ImportError):
+    import cudf
+
+    cudf_pipeline = bypixel.pipeline.register(cudf.DataFrame)(cudf_pipeline)
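
The change above defines the pipeline function unconditionally and only registers it when the optional GPU library imports. A minimal sketch of that pattern follows, under stated assumptions: `functools.singledispatch` stands in for datashader's own `bypixel.pipeline` dispatcher, and `hypothetical_gpu_frames` is a made-up package name that is not expected to be installed.

```python
from contextlib import suppress
from functools import singledispatch


@singledispatch
def pipeline(df):
    """Stand-in for datashader's ``bypixel.pipeline`` dispatcher (assumption)."""
    raise TypeError(f"no pipeline registered for {type(df).__name__}")


def list_pipeline(df):
    # Defined unconditionally, so importing this module never fails.
    return f"list pipeline over {len(df)} rows"


# Registration is skipped silently when the optional dependency is missing.
with suppress(ImportError):
    import hypothetical_gpu_frames  # hypothetical package, illustration only

    pipeline.register(hypothetical_gpu_frames.DataFrame)(list_pipeline)

# A type that is always available is registered with the same call form.
pipeline.register(list)(list_pipeline)
```

With this layout the module can be imported on machines without the optional library, and the function stays importable by name for other modules that want to reuse it.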
12 changes: 11 additions & 1 deletion datashader/data_libraries/dask.py
@@ -1,5 +1,7 @@
 from __future__ import annotations

+from contextlib import suppress
+
 import numpy as np
 import pandas as pd
 import dask
@@ -30,7 +32,6 @@ def _dask_compat(df):
     return getattr(df, 'optimize', lambda: df)()


-@bypixel.pipeline.register(dd.DataFrame)
 def dask_pipeline(df, schema, canvas, glyph, summary, *, antialias=False, cuda=False):
     dsk, name = glyph_dispatch(glyph, df, schema, canvas, summary, antialias=antialias, cuda=cuda)

@@ -50,6 +51,15 @@ def dask_pipeline(df, schema, canvas, glyph, summary, *, antialias=False, cuda=F
     return scheduler(dsk, name)


+# Classic Dask.DataFrame
+bypixel.pipeline.register(dd.core.DataFrame)(dask_pipeline)
+
+with suppress(ImportError):
+    import dask_expr
+
+    bypixel.pipeline.register(dask_expr.DataFrame)(dask_pipeline)
+
+
 def shape_bounds_st_and_axis(df, canvas, glyph):
     if not canvas.x_range or not canvas.y_range:
         x_extents, y_extents = glyph.compute_bounds_dask(df)
9 changes: 7 additions & 2 deletions datashader/data_libraries/dask_cudf.py
@@ -1,9 +1,14 @@
 from __future__ import annotations
+from contextlib import suppress
 from datashader.data_libraries.dask import dask_pipeline
 from datashader.core import bypixel
-import dask_cudf


-@bypixel.pipeline.register(dask_cudf.DataFrame)
 def dask_cudf_pipeline(df, schema, canvas, glyph, summary, *, antialias=False):
     return dask_pipeline(df, schema, canvas, glyph, summary, antialias=antialias, cuda=True)
+
+
+with suppress(ImportError):
+    import dask_cudf
+
+    dask_cudf_pipeline = bypixel.pipeline.register(dask_cudf.DataFrame)(dask_cudf_pipeline)
4 changes: 2 additions & 2 deletions datashader/data_libraries/dask_xarray.py
@@ -173,8 +173,8 @@ def dask_raster(glyph, xr_ds, schema, canvas, summary, *, antialias=False, cuda=
     src_y0, src_y1 = glyph._compute_bounds_from_1d_centers(
         xr_ds, y_name, maybe_expand=False, orient=False
     )
-    xbinsize = float(xr_ds[x_name][1] - xr_ds[x_name][0])
-    ybinsize = float(xr_ds[y_name][1] - xr_ds[y_name][0])
+    xbinsize = abs(float(xr_ds[x_name][1] - xr_ds[x_name][0]))
+    ybinsize = abs(float(xr_ds[y_name][1] - xr_ds[y_name][0]))

     # Compute scale/translate
     out_h, out_w = shape
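
The `dask_raster` change wraps both bin sizes in `abs(...)`. The diff does not say why, but the difference between the first two coordinate values is negative whenever an axis is stored in descending order, which is common for raster y axes. The sketch below illustrates that with plain Python lists standing in for the xarray coordinate arrays; `bin_size` is a hypothetical helper, not a datashader function.

```python
# Hypothetical 1D cell centers: the same grid stored in both orders.
ascending = [0.5, 1.5, 2.5]
descending = [2.5, 1.5, 0.5]


def bin_size(coords):
    """Spacing between the first two centers, as a non-negative width."""
    return abs(float(coords[1] - coords[0]))


# Without abs(), a descending axis produces a negative "size".
signed = float(descending[1] - descending[0])

print(signed)                # signed spacing of the descending axis
print(bin_size(descending))  # width after taking the absolute value
print(bin_size(ascending))   # same width for the ascending axis
```

Taking the absolute value keeps the bin size a width regardless of storage order, so later scale computations do not pick up a sign from the coordinate layout.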
2 changes: 1 addition & 1 deletion datashader/datashape/coretypes.py
@@ -254,7 +254,7 @@ def to_numpy_dtype(self):
         return np.dtype('datetime64[us]')


-_units = set(['ns', 'us', 'ms', 's', 'm', 'h', 'D', 'W', 'M', 'Y'])
+_units = ('ns', 'us', 'ms', 's', 'm', 'h', 'D', 'W', 'M', 'Y')


 _unit_aliases = {
3 changes: 3 additions & 0 deletions datashader/tests/benchmarks/__init__.py
@@ -0,0 +1,3 @@
+import pytest
+
+pytestmark = pytest.mark.benchmark
7 changes: 2 additions & 5 deletions datashader/tests/benchmarks/test_canvas.py
@@ -1,12 +1,9 @@
 import pytest
-import os
 import numpy as np
 import pandas as pd

 import datashader as ds

-test_gpu = bool(int(os.getenv("DATASHADER_TEST_GPU", 0)))
-

 @pytest.fixture
 def time_series():
@@ -33,7 +30,7 @@ def test_points(benchmark, time_series):
     benchmark(cvs.points, time_series, 'x', 'y')


-@pytest.mark.skipif(not test_gpu, reason="DATASHADER_TEST_GPU not set")
+@pytest.mark.gpu
 @pytest.mark.benchmark(group="canvas")
 def test_line_gpu(benchmark, time_series):
     from cudf import from_pandas
@@ -42,7 +39,7 @@ def test_line_gpu(benchmark, time_series):
     benchmark(cvs.line, time_series, 'x', 'y')


-@pytest.mark.skipif(not test_gpu, reason="DATASHADER_TEST_GPU not set")
+@pytest.mark.gpu
 @pytest.mark.benchmark(group="canvas")
 def test_points_gpu(benchmark, time_series):
     from cudf import from_pandas
35 changes: 35 additions & 0 deletions datashader/tests/conftest.py
@@ -0,0 +1,35 @@
+CUSTOM_MARKS = {"benchmark", "gpu"}
+
+
+def pytest_addoption(parser):
+    for marker in sorted(CUSTOM_MARKS):
+        parser.addoption(
+            f"--{marker}",
+            action="store_true",
+            default=False,
+            help=f"Run {marker} related tests",
+        )
+
+
+def pytest_configure(config):
+    for marker in sorted(CUSTOM_MARKS):
+        config.addinivalue_line("markers", f"{marker}: {marker} test marker")
+
+
+def pytest_collection_modifyitems(config, items):
+    skipped, selected = [], []
+    markers = {m for m in CUSTOM_MARKS if config.getoption(f"--{m}")}
+    empty = not markers
+    for item in items:
+        item_marks = set(item.keywords) & CUSTOM_MARKS
+        if empty and item_marks:
+            skipped.append(item)
+        elif empty:
+            selected.append(item)
+        elif not empty and item_marks == markers:
+            selected.append(item)
+        else:
+            skipped.append(item)
+
+    config.hook.pytest_deselected(items=skipped)
+    items[:] = selected
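
The four branches in `pytest_collection_modifyitems` reduce to one rule: a test is selected only when its set of custom marks equals the set of flags passed on the command line. The sketch below restates that rule on plain data so it can be checked without pytest. `split_items` and the test name `test_cudf` are hypothetical; `test_points` and `test_line_gpu` are taken from `test_canvas.py` above.

```python
CUSTOM_MARKS = {"benchmark", "gpu"}


def split_items(items, enabled):
    """Apply the hook's selection rule to plain data.

    ``items`` maps a test name to its set of keywords; ``enabled`` is the
    set of custom marks whose command-line flag was passed.
    """
    selected, skipped = [], []
    for name, keywords in items.items():
        item_marks = set(keywords) & CUSTOM_MARKS
        # No flags: keep only unmarked tests. With flags: exact match only.
        if item_marks == enabled:
            selected.append(name)
        else:
            skipped.append(name)
    return selected, skipped


items = {
    "test_points": {"test_points"},            # no custom marks
    "test_line_gpu": {"gpu", "benchmark"},     # carries both marks
    "test_cudf": {"gpu"},                      # hypothetical gpu-only test
}

print(split_items(items, set()))
print(split_items(items, {"gpu"}))
print(split_items(items, {"gpu", "benchmark"}))
```

One consequence of the exact-match rule is that a test carrying both marks, such as `test_line_gpu`, is collected only when both `--gpu` and `--benchmark` are given, and passing any flag deselects all unmarked tests.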