
Specialization for accurate complex summation in sum()? #121149

Closed
skirpichev opened this issue Jun 29, 2024 · 9 comments
Labels
3.14 new features, bugs and security fixes type-feature A feature request or enhancement

Comments

@skirpichev
Member

skirpichev commented Jun 29, 2024

Feature or enhancement

Proposal:

Currently, the sum() builtin lacks any specialization for complex numbers, yet it is still usually faster than more accurate pure-Python alternatives.

benchmark: sum() vs. pure-Python versions
# a.py
from random import random, seed
seed(1)
data = [complex(random(), random()) for _ in range(10)]
def msum(xs):
    it = iter(xs)
    res = next(it)
    for z in it:
        res += z
    return res
def sum2(xs):
    return complex(sum(_.real for _ in xs),
                   sum(_.imag for _ in xs))
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 963 nsec per loop
$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'msum(data)'
200000 loops, best of 11: 1.31e+03 nsec per loop

And using sum() component-wise is hardly an option:

$ ./python -m timeit -r11 -unsec -s 'from a import data, sum2' 'sum2(data)'
50000 loops, best of 11: 8.56e+03 nsec per loop

Unfortunately, using this builtin directly in numeric code doesn't make sense, as the results are (usually) inaccurate. It's not too hard to do summation component-wise with math.fsum(), but that is slow and there might be a better way.
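For illustration, a component-wise math.fsum() wrapper (a hypothetical helper, not part of the proposal) shows what the accurate pure-Python version looks like:

```python
from math import fsum

def fsum_complex(xs):
    # Hypothetical helper: accurate component-wise summation
    # using math.fsum() on the real and imaginary parts.
    return complex(fsum(z.real for z in xs), fsum(z.imag for z in xs))

# Naive repeated addition accumulates rounding error, fsum() does not:
data = [complex(0.1, 0.1)] * 10
assert fsum_complex(data) == complex(1.0, 1.0)
```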

In #100425, a simple algorithm using compensated summation was implemented in sum() for floats. I propose to (1) add a specialization in sum() for complex numbers, and (2) reuse the #100425 code to implement accurate summation of complex numbers.
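For reference, the float path from #100425 is based on Neumaier-style compensated summation; applied per component, the idea looks roughly like this in pure Python (a sketch of the technique, not the actual C implementation):

```python
def compensated_csum(xs):
    # Sketch: Neumaier compensated summation applied independently
    # to the real and imaginary components.
    sr = si = 0.0  # running sums
    cr = ci = 0.0  # compensation terms (accumulated rounding errors)
    for z in xs:
        x, y = z.real, z.imag
        t = sr + x
        cr += (sr - t) + x if abs(sr) >= abs(x) else (x - t) + sr
        sr = t
        t = si + y
        ci += (si - t) + y if abs(si) >= abs(y) else (y - t) + si
        si = t
    return complex(sr + cr, si + ci)

assert compensated_csum([complex(0.1, 0.1)] * 10) == complex(1.0, 1.0)
```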

(1) is simple and straightforward, yet it already gives a measurable performance boost:

benchmark: sum() on main vs. with the added specialization for complex
diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index 6e50623caf..da0eed584a 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2691,6 +2691,59 @@ builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
             }
         }
     }
+
+    if (PyComplex_CheckExact(result)) {
+        Py_complex c_result = PyComplex_AsCComplex(result);
+        Py_SETREF(result, NULL);
+        while(result == NULL) {
+            item = PyIter_Next(iter);
+            if (item == NULL) {
+                Py_DECREF(iter);
+                if (PyErr_Occurred())
+                    return NULL;
+                return PyComplex_FromCComplex(c_result);
+            }
+            if (PyComplex_CheckExact(item)) {
+                Py_complex x = PyComplex_AsCComplex(item);
+                c_result.real += x.real;
+                c_result.imag += x.imag;
+                Py_DECREF(item);  /* complex has no specialized dealloc */
+                continue;
+            }
+            if (PyLong_Check(item)) {
+                long value;
+                int overflow;
+                value = PyLong_AsLongAndOverflow(item, &overflow);
+                if (!overflow) {
+                    c_result.real += (double)value;
+                    c_result.imag += 0.0;
+                    Py_DECREF(item);
+                    continue;
+                }
+            }
+            if (PyFloat_Check(item)) {
+                double value = PyFloat_AS_DOUBLE(item);
+                c_result.real += value;
+                c_result.imag += 0.0;
+                Py_DECREF(item);
+                continue;
+            }
+            result = PyComplex_FromCComplex(c_result);
+            if (result == NULL) {
+                Py_DECREF(item);
+                Py_DECREF(iter);
+                return NULL;
+            }
+            temp = PyNumber_Add(result, item);
+            Py_DECREF(result);
+            Py_DECREF(item);
+            result = temp;
+            if (result == NULL) {
+                Py_DECREF(iter);
+                return NULL;
+            }
+        }
+    }
 #endif
 
     for(;;) {

main:

$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 963 nsec per loop

with specialization:

$ ./python -m timeit -r11 -unsec -s 'from a import data, msum' 'sum(data)'
500000 loops, best of 11: 606 nsec per loop

(2) also seems to be a no-brainer: a simple refactoring of the PyFloat specialization should allow us to use the same core for the PyComplex specialization.

If there are no objections, I'll work on a patch.

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

@skirpichev skirpichev added the type-feature A feature request or enhancement label Jun 29, 2024
@Eclips4
Member

Eclips4 commented Jun 29, 2024

cc @rhettinger @mdickinson

@franklinvp

My opinion is that the goal of being accurate should be delegated to a specialized function, and that the built-in function sum should return to its simple-to-explain meaning of "adding from left to right using the addition that the elements define".

The change introduced in #100425 increased the complexity of explaining what sum does. See some examples.

Summation of finite-precision floating-point numbers has its surprises, when one hasn't learned that it is not real numbers that are being added. But in my opinion, the built-in sum is not the right place to mitigate that. In the spirit of "explicit is better than implicit", I think the use of a compensated sum should be done in a function that is explicitly made for that purpose. Didactically, those who haven't learned floating-point arithmetic should be exposed to it in the immediate tools of the language: + and the built-in sum.

@rhettinger
Contributor

rhettinger commented Jun 29, 2024

If there are not objections against - I'll work on a patch.

+1 from me. As a test, the result should match complex(sum(z.real for z in zarr), sum(z.imag for z in zarr))
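That acceptance criterion can be written down directly as a reference function (a sketch of the test idea, not the actual test-suite code):

```python
def reference_csum(zarr):
    # Component-wise reference from the comment above; sum() on floats
    # has itself been compensated since Python 3.12 (#100425).
    return complex(sum(z.real for z in zarr), sum(z.imag for z in zarr))

# Exactly representable values sum exactly, so it is easy to sanity-check:
assert reference_csum([1+2j, 3+4j]) == 4+6j
```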

@skirpichev
Member Author

its simple-to-explain meaning of "adding from left to right using the addition that the elements define".

One reason for a builtin function to be a builtin is, well, to be useful. Just being "simple-to-explain" is not enough.

What is the point of preserving a huge accumulated rounding error? Say, if you want to reproduce an inaccurate sum with some specific order of operations, you can just use something like functools.reduce(operator.add, xs) or an explicit for loop.
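The order dependence of plain left-to-right addition is easy to demonstrate with the functools.reduce() form:

```python
from functools import reduce
from operator import add

xs = [1.0, 1e16, -1e16]
# 1.0 is absorbed when it is added to 1e16 first,
# but survives when the cancellation happens first:
assert reduce(add, xs) == 0.0
assert reduce(add, reversed(xs)) == 1.0
```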

Also, is it "simple-to-explain" for you when sum(xs) and sum(reversed(xs)) are different?

use of a compensated sum should be in done in a function that is explicitly made for that purpose.

There is also math.fsum(); it's mentioned in the Sphinx docs for sum(). But it uses a much more complex algorithm.

(BTW, it looks faster on random data.)

+1 from me.

FYI, pr is ready for review: #121176

@gpshead gpshead added the 3.14 new features, bugs and security fixes label Jul 2, 2024
@franklinvp

Clap, clap, clap.

Everyone is implicitly forced to opt in to a compensated sum for float and complex, instead of doing so explicitly.
So long, explicit is better than implicit.
Thanks for ruining one use case, when this algorithm could perfectly well have been put somewhere else, or made switchable, so that everyone benefitted without anyone being affected.

Please don't ever "enhance" it to subtypes. Those who need a fast and proper sum of float and complex in Python at least still have some hacks like

class R(float): pass
class C(complex): pass
sum(L, start=R(0.0)) # or sum(L, start=C(0.0))

Regarding

Also, is it "simple-to-explain" for you when sum(xs) and sum(reversed(xs)) are different?

That is the type of thing that you want to have to explain.
It is a good thing to have people come over and over asking about it.
That phenomenon is a feature of finite-precision floating-point arithmetic; they get to learn it.
Python, a language used so often for education, now tries to hide this as early as its built-in function sum, catering to ignorance.

@rhettinger
Contributor

Clap, clap, clap.
... [sarcasm] ...
catering to ignorance.

Your abusive posts are not welcome here.

@skirpichev
Member Author

Please don't ever "enhance" it to subtypes.

Probably, this doesn't make sense, if only because __add__() can be overridden in subtypes:

>>> class myint(int):
...     def __add__(self, other):
...         print("Nobody expects the Spanish Inquisition!")
...         
>>> myint()+myint()
Nobody expects the Spanish Inquisition!

I can't imagine many practical cases for doing that... Perhaps implementing some parallel algorithm for addition of big ints?

But that's all. More generic checks (say, PyFloat_Check()) are costly in the case of negative results. But in sum() we stop the specialization for a given type and fall through to a more generic one (say, to complex) or to using PyNumber_Add(). So, I think such generic checks are cheap in this type of workflow.
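The fallback behaviour is observable from Python: with a subclass as the start value, sum() never enters a fast path (the specializations check the exact type), so an overridden __add__() is honoured (a small demonstration using the subclass "hack" from an earlier comment):

```python
class F(float):
    # Float subclass with an overridden __add__.
    def __add__(self, other):
        return F(float(self) + float(other))

# A non-exact-float start value keeps sum() on the generic
# PyNumber_Add path, so the override is called for every addition:
total = sum([F(1.0), F(2.0)], F(0.0))
assert type(total) is F and total == 3.0
```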

@rhettinger, does this extension of the specialization in sum() make sense to you?

If not, please note that after 88bdb92 we have PyLong_Check()'s in some places. I think that was a bug (CC @serhiy-storchaka). This was extended in my PR with one PyFloat_Check(), and I forgot to fix that, sorry :(

a patch to fix a few instances of "soft" type checks
diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index a5b45e358d..49675c224c 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2686,7 +2686,7 @@ builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
                 _Py_DECREF_SPECIALIZED(item, _PyFloat_ExactDealloc);
                 continue;
             }
-            if (PyLong_Check(item)) {
+            if (PyLong_CheckExact(item) || PyBool_Check(item)) {
                 long value;
                 int overflow;
                 value = PyLong_AsLongAndOverflow(item, &overflow);
@@ -2735,7 +2735,7 @@ builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
                 Py_DECREF(item);
                 continue;
             }
-            if (PyLong_Check(item)) {
+            if (PyLong_CheckExact(item) || PyBool_Check(item)) {
                 long value;
                 int overflow;
                 value = PyLong_AsLongAndOverflow(item, &overflow);
@@ -2746,7 +2746,7 @@ builtin_sum_impl(PyObject *module, PyObject *iterable, PyObject *start)
                     continue;
                 }
             }
-            if (PyFloat_Check(item)) {
+            if (PyFloat_CheckExact(item)) {
                 double value = PyFloat_AS_DOUBLE(item);
                 re_sum.hi += value;
                 im_sum.hi += 0.0;

Python, a language used so often for education, now tries to hide

Not at all!

>>> res, xs = 0.0, [0.1]*10
>>> for _ in xs:
...     res += _
...     
>>> res
0.9999999999999999
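The point above can be made self-contained: an explicit loop still exposes the naive behaviour, while the builtin has used compensated summation for floats since Python 3.12 (the version guard is there because older interpreters return the naive result from sum() too):

```python
import sys

res = 0.0
for x in [0.1] * 10:
    res += x
# The explicit loop accumulates rounding error...
assert res == 0.9999999999999999
# ...while sum() compensates for it on 3.12 and newer:
if sys.version_info >= (3, 12):
    assert sum([0.1] * 10) == 1.0
```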

@serhiy-storchaka
Member

If not, please note that after 88bdb92 we have PyLong_Check()'s in some places. I think that was a bug (CC @serhiy-storchaka).

Do you have an example where this makes difference?

@skirpichev
Member Author

Hmm, you are right, sorry for the false alarm. All such examples look broken.

skirpichev added a commit to skirpichev/mpmath that referenced this issue Aug 3, 2024
After python/cpython#121149 this was broken on CPython 3.14:

_______________ [doctest] mpmath.functions.orthogonal.spherharm ____________
EXAMPLE LOCATION UNKNOWN, not showing all tests of that example
??? >>> fp.chop(fp.quad(lambda t,p: Y1(t,p)*Y2(t,p)*dS(t,p), *sphere))
Expected:
    1.0000000000000007
Got:
    1.0000000000000004

This particular failure could be fixed by:
diff --git a/mpmath/ctx_base.py b/mpmath/ctx_base.py
index 3309bfa..2ca2fac 100644
--- a/mpmath/ctx_base.py
+++ b/mpmath/ctx_base.py
@@ -110,7 +110,8 @@ def fdot(ctx, xs, ys=None, conjugate=False):
             cf = ctx.conj
             return sum((x*cf(y) for (x,y) in xs), ctx.zero)
         else:
-            return sum((x*y for (x,y) in xs), ctx.zero)
+            return ctx.mpc(sum(((x*y).real for (x,y) in xs), ctx.zero),
+                           sum(((x*y).imag for (x,y) in xs), ctx.zero))

     def fprod(ctx, args):
         prod = ctx.one

But this is just one case for the whole test suite...