Merge pull request #1615 from MichaelTiemannOSC/parse-uncertainties

This commit allows parsing of uncertain numbers, e.g. (1.0+/-0.2)e+03

Enable Pint to consume uncertain quantities.

Signed-off-by: 72577720+MichaelTiemannOSC@users.noreply.github.com

* Fix problems identified by python -m pre_commit run --all-files

  Signed-off-by: MichaelTiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Enhance support for `uncertainties`. See #1611, #1614.

  Signed-off-by: MichaelTiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix up failures and errors found by test suite.

  Signed-off-by: MichaelTiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Copy in changes from PR1596

  Signed-off-by: 72577720+MichaelTiemannOSC@users.noreply.github.com

* Create modular uncertainty parser layer

  Based on feedback, tokenize uncertainties on top of default tokenizer, not instead of default tokenizer.

  Signed-off-by: MichaelTiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix conflict merge error

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update util.py

  Fixes problems parsing currency symbols that also show up when dealing with uncertainties.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update pint_eval.py

  Handle negative numbers using uncertainty parenthesis notation.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update pint_eval.py

  Ahem...use walrus operator for side-effect, not truth value.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fixed to work with both + and - e notation in the actual processing of the exponent, not just in the parsing of the exponent, i.e., (5.01+/-0.07)e+04

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix test suite failures

  Manually fix test_issue_1400. Let other failures (which are not related to uncertainties) fail.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix tokenizer merge error in pint/util.py

  When using pint_eval.tokenizer don't try to import tokenizer from pint.compat.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Merge cleanup: pint_eval.py needs tokenize

  Clean up merge import error.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Make black happier

  Run `black` with default arguments to try to match whatever `black` wants to see in the CI/CD world.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Make ruff happy

  Remove unused redefinition of tokenizer in toktest.py. Also remove unnecessary import of pint_eval from top-level (it's imported inside the function definition that needs it).

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Make ruff happier

  Fix ruff errors missed in previous commit.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update toktest.py

  Fix whitespace error created by `ruff --fix` that `black` didn't like.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update test_util.py

  Follow deprecation of use_decimal from pint/util.py

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix additional regressions in test suite

  If we have the uncertainties library loaded, go ahead and use the uncertainty_tokenizer by default. This fixes problems with standard Pandas tests that expect the tokenizer to do the right thing without any special setup. Also, prevent an exception when a loop in consensus_name_attr (pandas-dev/pandas/core/common.py(86)) tests equality with a None argument. Otherwise the zero_or_nan test raises an exception.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update quantity.py

  Teach Pint's PlainQuantity about the Pandas pd.NA value so that ndim works. Otherwise, it naively delegates to NumpyQuantity, which is the road to perdition for PintArrays.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Make `babel` a dependency for testbase

  Here's hoping this fixes the CI/CD problem with test_1400.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Update .readthedocs.yaml

  Removing `system_packages: false` as suggested by @keewis

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

* Fix failing tests

  Fix isnan to use unp.isnan as appropriate for both duck_array_type and objects of UFloat types. Fix a minor typo in pint/facets/__init__.py comment. In test_issue_1400, use decorators to ensure babel library is loaded when needed. pyproject.toml: revert change to testbase; we fixed with decorators instead.

  Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>

---------

Signed-off-by: 72577720+MichaelTiemannOSC@users.noreply.github.com
Signed-off-by: MichaelTiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
Signed-off-by: Michael Tiemann <72577720+MichaelTiemannOSC@users.noreply.github.com>
hgrecco · Sep 15, 2023 · 07646d0
2 parents 2852f36 + 00f08f3
commit 07646d0
Showing 16 changed files with 446 additions and 51 deletions.
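The commit message describes the parenthesized notation this change teaches Pint to read, e.g. `(1.0+/-0.2)e+03` and `(5.01+/-0.07)e+04`, including negative nominal values and both `e+` and `e-` exponents. The following is only a standalone illustration of that notation using a regular expression; it is not how Pint parses it (the commit layers a tokenizer on top of the default one), and `parse_uncertain` is a hypothetical helper name.

```python
import re

# Hypothetical pattern for "(nominal+/-error)" with an optional shared exponent.
_UNCERTAIN = re.compile(
    r"\(\s*(?P<nominal>[+-]?\d+(?:\.\d*)?)\s*"
    r"\+/-\s*(?P<error>\d+(?:\.\d*)?)\s*\)"
    r"(?:[eE](?P<exp>[+-]?\d+))?"
)


def parse_uncertain(text: str) -> tuple[float, float]:
    """Return (nominal, error) for strings such as '(5.01+/-0.07)e+04'."""
    match = _UNCERTAIN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in (value+/-error)e±NN notation: {text!r}")
    exp = match.group("exp") or "0"
    # The exponent outside the parentheses scales both numbers.
    nominal = float(f"{match.group('nominal')}e{exp}")
    error = float(f"{match.group('error')}e{exp}")
    return nominal, error


nominal, error = parse_uncertain("(5.01+/-0.07)e+04")
```

The key point the sketch makes is that the trailing exponent applies to the nominal value and the error alike, which is why the commit had to handle the exponent in processing and not only in parsing.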
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
@@ -11,4 +11,3 @@ python:
      - requirements: requirements_docs.txt
      - method: pip
        path: .
-   system_packages: false
diff --git a/CHANGES b/CHANGES
@@ -105,6 +105,12 @@ Pint Changelog
   (Issue #1030, #574)
 - Added angular frequency documentation page.
 - Move ASV benchmarks to dedicated folder. (Issue #1542)
+- An ndim attribute has been added to Quantity and DataFrame has been added to upcast
+  types for pint-pandas compatibility. (#1596)
+- Fix a recursion error that would be raised when passing quantities to `cond` and `x`.
+  (Issue #1510, #1530)
+- Update test_non_int tests for pytest.
+- Better support for uncertainties (See #1611, #1614)
 - Implement `numpy.broadcast_arrays` (#1607)
 - An ndim attribute has been added to Quantity and DataFrame has been added to upcast
 types for pint-pandas compatibility. (#1596)

diff --git a/pint/compat.py b/pint/compat.py
@@ -12,14 +12,21 @@
 
 import sys
 import math
-import tokenize
 from decimal import Decimal
 from importlib import import_module
-from io import BytesIO
 from numbers import Number
 from collections.abc import Mapping
 from typing import Any, NoReturn, Callable, Optional, Union
-from collections.abc import Generator, Iterable
+from collections.abc import Iterable
+
+try:
+    from uncertainties import UFloat, ufloat
+    from uncertainties import unumpy as unp
+
+    HAS_UNCERTAINTIES = True
+except ImportError:
+    UFloat = ufloat = unp = None
+    HAS_UNCERTAINTIES = False
 
 
 if sys.version_info >= (3, 10):
@@ -58,19 +65,6 @@ def _inner(*args: Any, **kwargs: Any) -> NoReturn:
     return _inner
 
 
-def tokenizer(input_string: str) -> Generator[tokenize.TokenInfo, None, None]:
-    """Tokenize an input string, encoded as UTF-8
-    and skipping the ENCODING token.
-
-    See Also
-    --------
-    tokenize.tokenize
-    """
-    for tokinfo in tokenize.tokenize(BytesIO(input_string.encode("utf-8")).readline):
-        if tokinfo.type != tokenize.ENCODING:
-            yield tokinfo
-
-
 # TODO: remove this warning after v0.10
 class BehaviorChangeWarning(UserWarning):
     pass
@@ -83,7 +77,10 @@ class BehaviorChangeWarning(UserWarning):
 
     HAS_NUMPY = True
     NUMPY_VER = np.__version__
-    NUMERIC_TYPES = (Number, Decimal, ndarray, np.number)
+    if HAS_UNCERTAINTIES:
+        NUMERIC_TYPES = (Number, Decimal, ndarray, np.number, UFloat)
+    else:
+        NUMERIC_TYPES = (Number, Decimal, ndarray, np.number)
 
     def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
         if isinstance(value, (dict, bool)) or value is None:
@@ -92,6 +89,11 @@ def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
             raise ValueError("Quantity magnitude cannot be an empty string.")
         elif isinstance(value, (list, tuple)):
             return np.asarray(value)
+        elif HAS_UNCERTAINTIES:
+            from pint.facets.measurement.objects import Measurement
+
+            if isinstance(value, Measurement):
+                return ufloat(value.value, value.error)
         if force_ndarray or (
             force_ndarray_like and not is_duck_array_type(type(value))
         ):
@@ -144,16 +146,13 @@ def _to_magnitude(value, force_ndarray=False, force_ndarray_like=False):
                 "lists and tuples are valid magnitudes for "
                 "Quantity only when NumPy is present."
             )
-        return value
+        elif HAS_UNCERTAINTIES:
+            from pint.facets.measurement.objects import Measurement
 
+            if isinstance(value, Measurement):
+                return ufloat(value.value, value.error)
+        return value
 
-try:
-    from uncertainties import ufloat
-
-    HAS_UNCERTAINTIES = True
-except ImportError:
-    ufloat = None
-    HAS_UNCERTAINTIES = False
 
 try:
     from babel import Locale
@@ -326,16 +325,25 @@ def isnan(obj: Any, check_all: bool) -> Union[bool, Iterable[bool]]:
         Always return False for non-numeric types.
     """
     if is_duck_array_type(type(obj)):
-        if obj.dtype.kind in "if":
+        if obj.dtype.kind in "ifc":
             out = np.isnan(obj)
         elif obj.dtype.kind in "Mm":
             out = np.isnat(obj)
         else:
-            # Not a numeric or datetime type
-            out = np.full(obj.shape, False)
+            if HAS_UNCERTAINTIES:
+                try:
+                    out = unp.isnan(obj)
+                except TypeError:
+                    # Not a numeric or UFloat type
+                    out = np.full(obj.shape, False)
+            else:
+                # Not a numeric or datetime type
+                out = np.full(obj.shape, False)
         return out.any() if check_all else out
     if isinstance(obj, np_datetime64):
         return np.isnat(obj)
+    elif HAS_UNCERTAINTIES and isinstance(obj, UFloat):
+        return unp.isnan(obj)
     try:
         return math.isnan(obj)
     except TypeError:

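The `pint/compat.py` hunks above combine two patterns: an optional-import guard that sets `HAS_UNCERTAINTIES`, and a scalar NaN check that falls back to `False` for non-numeric objects. A minimal sketch of how they fit together follows. `scalar_isnan` is a hypothetical name, and checking only `nominal_value` is an assumption of this sketch; the diff itself calls `unp.isnan`.

```python
import math
from typing import Any

# Optional dependency guard, mirroring the shape used in the diff.
try:
    from uncertainties import UFloat

    HAS_UNCERTAINTIES = True
except ImportError:
    UFloat = None
    HAS_UNCERTAINTIES = False


def scalar_isnan(obj: Any) -> bool:
    """NaN test for scalars that returns False for non-numeric objects."""
    # The flag is checked first so isinstance() never sees UFloat = None.
    if HAS_UNCERTAINTIES and isinstance(obj, UFloat):
        # Assumption: only the nominal value is inspected here.
        return math.isnan(obj.nominal_value)
    try:
        return math.isnan(obj)
    except TypeError:
        # Strings, None and other non-numeric objects are never NaN.
        return False
```

The sketch runs whether or not `uncertainties` is installed, which is the purpose of the guard.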
diff --git a/pint/facets/__init__.py b/pint/facets/__init__.py
@@ -7,7 +7,7 @@
     keeping each part small enough to be hackable.
 
     Each facet contains one or more of the following modules:
-    - definitions: classes describing an specific unit related definiton.
+    - definitions: classes describing specific unit-related definitions.
       These objects must be immutable, pickable and not reference the registry (e.g. ContextDefinition)
     - objects: classes and functions that encapsulate behavior (e.g. Context)
     - registry: implements a subclass of PlainRegistry or class that can be

diff --git a/pint/facets/measurement/objects.py b/pint/facets/measurement/objects.py
@@ -52,7 +52,7 @@ class Measurement(PlainQuantity):
 
     """
 
-    def __new__(cls, value, error, units=MISSING):
+    def __new__(cls, value, error=MISSING, units=MISSING):
         if units is MISSING:
             try:
                 value, units = value.magnitude, value.units
@@ -64,17 +64,18 @@ def __new__(cls, value, error, units=MISSING):
                     error = MISSING  # used for check below
                 else:
                     units = ""
-        try:
-            error = error.to(units).magnitude
-        except AttributeError:
-            pass
-
         if error is MISSING:
+            # We've already extracted the units from the Quantity above
             mag = value
-        elif error < 0:
-            raise ValueError("The magnitude of the error cannot be negative")
         else:
-            mag = ufloat(value, error)
+            try:
+                error = error.to(units).magnitude
+            except AttributeError:
+                pass
+            if error < 0:
+                raise ValueError("The magnitude of the error cannot be negative")
+            else:
+                mag = ufloat(value, error)
 
         inst = super().__new__(cls, mag, units)
         return inst

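The `Measurement.__new__` change makes `error` optional through a sentinel default and only validates the error when one was actually supplied. The control flow can be sketched in isolation as below. `build_magnitude` is a hypothetical helper, the local `MISSING` stands in for Pint's sentinel, and a plain tuple stands in for the `ufloat(value, error)` call so the example needs no third-party package.

```python
# Local stand-in for Pint's MISSING sentinel (assumption of this sketch).
MISSING = object()


def build_magnitude(value, error=MISSING):
    """Return the bare value, or a (value, error) pair when an error is given."""
    if error is MISSING:
        # No error supplied: the value is used as the magnitude unchanged.
        return value
    if error < 0:
        raise ValueError("The magnitude of the error cannot be negative")
    # The real code builds ufloat(value, error) at this point.
    return (value, error)


plain = build_magnitude(3.0)
with_error = build_magnitude(3.0, 0.5)
```

Using a sentinel instead of `None` keeps "no error given" distinct from any value a caller could legitimately pass.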
diff --git a/pint/facets/numpy/quantity.py b/pint/facets/numpy/quantity.py
@@ -29,6 +29,16 @@
     set_units_ufuncs,
 )
 
+try:
+    import uncertainties.unumpy as unp
+    from uncertainties import ufloat, UFloat
+
+    HAS_UNCERTAINTIES = True
+except ImportError:
+    unp = np
+    ufloat = UFloat = None
+    HAS_UNCERTAINTIES = False
+
 
 def method_wraps(numpy_func):
     if isinstance(numpy_func, str):
@@ -224,6 +234,11 @@ def __getattr__(self, item) -> Any:
                     )
                 else:
                     raise exc
+        elif (
+            HAS_UNCERTAINTIES and item == "ndim" and isinstance(self._magnitude, UFloat)
+        ):
+            # Dimensionality of a single UFloat is 0, like any other scalar
+            return 0
 
         try:
             return getattr(self._magnitude, item)

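The `__getattr__` hunk above special-cases `ndim` so that a scalar magnitude without that attribute reports 0 instead of raising. A simplified sketch of the delegation pattern is given below. `ScalarWithError` and `NdimProxy` are hypothetical stand-ins for `UFloat` and the quantity class, not Pint types.

```python
import numbers
from typing import Any


class ScalarWithError:
    """Hypothetical stand-in for a UFloat: a scalar that has no ndim attribute."""

    def __init__(self, nominal: float, std_dev: float) -> None:
        self.nominal = nominal
        self.std_dev = std_dev


class NdimProxy:
    """Forwards unknown attributes to the wrapped magnitude."""

    def __init__(self, magnitude: Any) -> None:
        self._magnitude = magnitude

    def __getattr__(self, item: str) -> Any:
        # Only reached when normal attribute lookup has failed.
        if item.startswith("_"):
            raise AttributeError(item)  # avoid recursing on private names
        if item == "ndim" and isinstance(
            self._magnitude, (numbers.Number, ScalarWithError)
        ):
            # A single scalar has dimensionality 0.
            return 0
        return getattr(self._magnitude, item)


q = NdimProxy(ScalarWithError(1.0, 0.2))
```

Any other attribute, such as `q.std_dev`, still comes from the wrapped magnitude.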
diff --git a/pint/facets/plain/quantity.py b/pint/facets/plain/quantity.py
@@ -55,6 +55,17 @@
     if HAS_NUMPY:
         import numpy as np  # noqa
 
+try:
+    import uncertainties.unumpy as unp
+    from uncertainties import ufloat, UFloat
+
+    HAS_UNCERTAINTIES = True
+except ImportError:
+    unp = np
+    ufloat = UFloat = None
+    HAS_UNCERTAINTIES = False
+
+
 MagnitudeT = TypeVar("MagnitudeT", bound=Magnitude)
 ScalarT = TypeVar("ScalarT", bound=Scalar)
 
@@ -133,6 +144,8 @@ class PlainQuantity(Generic[MagnitudeT], PrettyIPython, SharedRegistryObject):
     def ndim(self) -> int:
         if isinstance(self.magnitude, numbers.Number):
             return 0
+        if str(self.magnitude) == "<NA>":
+            return 0
         return self.magnitude.ndim
 
     @property
@@ -256,7 +269,12 @@ def __bytes__(self) -> bytes:
         return str(self).encode(locale.getpreferredencoding())
 
     def __repr__(self) -> str:
-        if isinstance(self._magnitude, float):
+        if HAS_UNCERTAINTIES:
+            if isinstance(self._magnitude, UFloat):
+                return f"<Quantity({self._magnitude:.6}, '{self._units}')>"
+            else:
+                return f"<Quantity({self._magnitude}, '{self._units}')>"
+        elif isinstance(self._magnitude, float):
             return f"<Quantity({self._magnitude:.9}, '{self._units}')>"
 
         return f"<Quantity({self._magnitude}, '{self._units}')>"
@@ -1288,6 +1306,9 @@ def bool_result(value):
         # We compare to the plain class of PlainQuantity because
         # each PlainQuantity class is unique.
         if not isinstance(other, PlainQuantity):
+            if other is None:
+                # A loop in pandas-dev/pandas/core/common.py(86)consensus_name_attr() can result in OTHER being None
+                return bool_result(False)
             if zero_or_nan(other, True):
                 # Handle the special case in which we compare to zero or NaN
                 # (or an array of zeros or NaNs)

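The last hunk in `pint/facets/plain/quantity.py` returns early when the other operand is `None`, before the zero-or-NaN special case runs. According to the commit message, that check otherwise raises. The ordering can be sketched as follows. `EqSketch` is a hypothetical, heavily simplified class, and the zero comparison is only a placeholder for Pint's `zero_or_nan`.

```python
from typing import Any


class EqSketch:
    """Simplified quantity with a magnitude and a unit string."""

    def __init__(self, magnitude: float, units: str) -> None:
        self.magnitude = magnitude
        self.units = units

    def __eq__(self, other: Any) -> bool:
        if not isinstance(other, EqSketch):
            if other is None:
                # Short-circuit: None never equals a quantity.
                return False
            # Placeholder for the zero-or-NaN special case in the diff.
            return other == 0 and self.magnitude == 0
        return self.magnitude == other.magnitude and self.units == other.units


zero_metres = EqSketch(0, "m")
```

Placing the `None` check first means the numeric special case only ever sees operands it can compare.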
diff --git a/pint/facets/plain/registry.py b/pint/facets/plain/registry.py
@@ -63,8 +63,9 @@
     Handler,
 )
 
+from ... import pint_eval
 from ..._vendor import appdirs
-from ...compat import babel_parse, tokenizer, TypeAlias, Self
+from ...compat import babel_parse, TypeAlias, Self
 from ...errors import DimensionalityError, RedefinitionError, UndefinedUnitError
 from ...pint_eval import build_eval_tree
 from ...util import ParserHelper
@@ -1324,7 +1325,7 @@ def parse_expression(
         for p in self.preprocessors:
             input_string = p(input_string)
         input_string = string_preprocessor(input_string)
-        gen = tokenizer(input_string)
+        gen = pint_eval.tokenizer(input_string)
 
         def _define_op(s: str):
             return self._eval_token(s, case_sensitive=case_sensitive, **values)

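For reference, the helper that this commit removes from `pint/compat.py` can be run on its own with only the standard library, which shows what the default token stream looks like before any uncertainty handling is layered on top. The body below is taken from the removed lines in the diff. The name `plain_tokenizer` is chosen for this sketch, and how `pint_eval.tokenizer` is defined is not shown in this excerpt.

```python
import tokenize
from collections.abc import Generator
from io import BytesIO


def plain_tokenizer(input_string: str) -> Generator[tokenize.TokenInfo, None, None]:
    """Tokenize a UTF-8 encoded string, skipping the ENCODING token."""
    for tokinfo in tokenize.tokenize(BytesIO(input_string.encode("utf-8")).readline):
        if tokinfo.type != tokenize.ENCODING:
            yield tokinfo


# Collect (token type name, token text) pairs for a small expression.
tokens = [(tokenize.tok_name[t.type], t.string) for t in plain_tokenizer("3 * meter")]
```

The first three entries of `tokens` are the number, the operator and the unit name; Python's tokenizer appends its own end-of-input tokens after them.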
diff --git a/pint/formatting.py b/pint/formatting.py
@@ -375,9 +375,13 @@ def formatter(
                     # Don't remove this positional! This is the format used in Babel
                     key = pat.replace("{0}", "").strip()
                     break
-            division_fmt = compound_unit_patterns.get("per", {}).get(
-                babel_length, division_fmt
-            )
+
+            tmp = compound_unit_patterns.get("per", {}).get(babel_length, division_fmt)
+
+            try:
+                division_fmt = tmp.get("compound", division_fmt)
+            except AttributeError:
+                division_fmt = tmp
             power_fmt = "{}{}"
             exp_call = _pretty_fmt_exponent
         if value == 1: