Raise warnings as errors in the test suite (#12468)

This PR fixes a final few warnings introduced in PRs merged after #12406 or with changes that weren't merged into #12406. More importantly, this PR updates the testing suite to always raise uncaught warnings as errors. This is a substantial change to the testing suite that will allow us to proactively react to deprecations in our dependencies, identify areas where pandas/pyarrow/some other dependency handles input data in different ways than us, etc. I've opted for the strictest possible setting, but am open to reviewer requests to be a little more lax with certain classes of warnings. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Matthew Roeschke (https://github.com/mroeschke) URL: #12468
rapidsai · Jan 5, 2023 · 390f817 · 390f817
1 parent cf26425
commit 390f817
Show file tree

Hide file tree

Showing 5 changed files with 54 additions and 5 deletions.
diff --git a/docs/cudf/source/developer_guide/contributing_guide.md b/docs/cudf/source/developer_guide/contributing_guide.md
@@ -136,3 +136,22 @@ Standard C++ natively supports various [exception types](https://en.cppreference
 which Cython maps to [these Python exception types](https://docs.cython.org/en/latest/src/userguide/wrapping_CPlusPlus.html#exceptions).
 In the future, libcudf may employ custom C++ exception types.
 If that occurs, this section will be updated to reflect how these may be mapped to desired Python exception types.
+
+### Raising warnings
+
+Where appropriate, cuDF should throw the same warnings as pandas.
+For instance, API deprecations in cuDF should track pandas as closely as possible.
+However, not all pandas warnings are appropriate for cuDF.
+Common exceptional cases include
+[implementation-dependent performance warnings](https://pandas.pydata.org/docs/reference/api/pandas.errors.PerformanceWarning.html)
+that do not apply to cuDF's internals.
+The decision of whether or not to match pandas must be made on a case-by-case
+basis and is left to the best judgment of developers and PR reviewers.
+
+### Catching warnings from dependencies
+
+cuDF developers should avoid using deprecated APIs from package dependencies.
+However, there may be cases where such uses cannot be avoided, at least in the short term.
+If such cases arise, developers should
+[catch the warnings](https://docs.python.org/3/library/warnings.html#warnings.catch_warnings)
+within cuDF so that they are not visible to the user.
diff --git a/docs/cudf/source/developer_guide/testing.md b/docs/cudf/source/developer_guide/testing.md
@@ -218,6 +218,29 @@ This way, when the bug is fixed, the test suite will fail at this
 point (and we will remember to update the test).
 
 
+### Testing code that throws warnings
+
+Some code may be expected to throw warnings.
+A common example is when a cudf API is deprecated for future removal, but many other possibilities exist as well.
+The cudf testing suite [surfaces all warnings as errors](https://docs.pytest.org/en/latest/how-to/capture-warnings.html#controlling-warnings).
+This includes warnings raised from non-cudf code, such as calls to pandas or pyarrow.
+This setting forces developers to proactively deal with deprecations from other libraries,
+as well as preventing the internal use of deprecated cudf APIs in other parts of the library.
+Just as importantly, it can help catch real errors like integer overflow or division by zero.
+
+When testing code that is expected to throw a warnings, developers should use the
+[`pytest.warns`](https://docs.pytest.org/en/latest/how-to/capture-warnings.html#assertwarnings) context to catch the warning.
+For parametrized tests that raise warnings under specific conditions, use the `testing._utils.expect_warning_if` decorator instead of `pytest.warns`.
+
+```{warning}
+[`warnings.catch_warnings`](https://docs.python.org/3/library/warnings.html#warnings.catch_warnings)
+is a tempting alternative to `pytest.warns`.
+**Do not use this context manager in tests.**
+Unlike `pytest.warns`, which _requires_ that the expected warning be raised,
+`warnings.catch_warnings` simply catches warnings that appear without requiring them.
+The cudf testing suite should avoid such ambiguities.
+```
+
 ### Testing utility functions
 
 The `cudf.testing` subpackage provides a handful of utilities for testing the equality of objects.

diff --git a/python/cudf/cudf/io/parquet.py b/python/cudf/cudf/io/parquet.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2019-2022, NVIDIA CORPORATION.
+# Copyright (c) 2019-2023, NVIDIA CORPORATION.
 
 import math
 import shutil
@@ -288,7 +288,7 @@ def _process_dataset(
 
     # Convert filters to ds.Expression
     if filters is not None:
-        filters = pq._filters_to_expression(filters)
+        filters = pq.filters_to_expression(filters)
 
     # Initialize ds.FilesystemDataset
     # TODO: Remove the if len(paths) workaround after following bug is fixed:

diff --git a/python/cudf/cudf/tests/pytest.ini b/python/cudf/cudf/tests/pytest.ini
@@ -4,3 +4,7 @@
 markers =
     spilling: mark benchmark a good candidate to run with `CUDF_SPILL=ON`
 xfail_strict = true
+filterwarnings =
+    error
+    ignore:::.*xdist.*
+    ignore:::.*pytest.*
diff --git a/python/cudf/cudf/tests/test_reductions.py b/python/cudf/cudf/tests/test_reductions.py
@@ -122,8 +122,10 @@ def test_sum_of_squares(dtype, nelem):
     sr = Series(data)
     df = cudf.DataFrame(sr)
 
-    got = sr.sum_of_squares()
-    got_df = df.sum_of_squares()
+    with pytest.warns(FutureWarning):
+        got = sr.sum_of_squares()
+    with pytest.warns(FutureWarning):
+        got_df = df.sum_of_squares()
     expect = (data**2).sum()
 
     if cudf.dtype(dtype).kind in {"u", "i"}:
@@ -156,7 +158,8 @@ def test_sum_of_squares_decimal(dtype):
     data = [str(x) for x in gen_rand("int8", 3) / 10]
 
     expected = pd.Series([Decimal(x) for x in data]).pow(2).sum()
-    got = cudf.Series(data).astype(dtype).sum_of_squares()
+    with pytest.warns(FutureWarning):
+        got = cudf.Series(data).astype(dtype).sum_of_squares()
 
     assert_eq(expected, got)