implement a better summation algorithm #199

JeffBezanson · 2011-09-19T21:28:49Z

sum should use a better algorithm, or at least we should provide an alternative function that does a better job. Candidates include Kahan summation (http://en.wikipedia.org/wiki/Kahan_summation_algorithm) and pairwise summation.


StefanKarpinski · 2011-09-20T16:35:48Z

By pairwise summation, I assume you mean recursive pairwise, as in this sort of thing:

sum(x::Vector) = length(x) == 0 ? 0 :
                 length(x) == 1 ? x[1] :
                 sum(x[1:div(end,2)]) + sum(x[div(end,2)+1:end])
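The recursive idea can also be sketched without copying slices, by passing index bounds and switching to a plain loop on short ranges. This is only an illustrative sketch: the name `pairwise_sum` and the `blocksize` keyword (with its default of 128) are assumptions made for this example, not anything proposed in the thread.

```julia
# Hypothetical helper, not part of Base: recursive pairwise summation
# over the index range lo:hi, with a plain loop for short ranges.
function pairwise_sum(x::AbstractVector{T},
                      lo::Int=firstindex(x), hi::Int=lastindex(x);
                      blocksize::Int=128) where {T<:Number}
    blocksize >= 1 || throw(ArgumentError("blocksize must be at least 1"))
    hi < lo && return zero(T)          # empty range

    if hi - lo < blocksize
        # Base case: sum a short range left to right.
        s = x[lo]
        for i in lo+1:hi
            s += x[i]
        end
        return s
    end

    # Split the range in two halves and sum each half recursively.
    mid = lo + (hi - lo) ÷ 2
    return pairwise_sum(x, lo, mid; blocksize=blocksize) +
           pairwise_sum(x, mid + 1, hi; blocksize=blocksize)
end
```

Because only index bounds are passed down, no temporary arrays are allocated; the base-case loop keeps the recursion overhead small on large inputs.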

JeffBezanson · 2011-09-20T17:40:48Z

Kahan summation looks promising since it looks like it can be done with just a couple extra arithmetic ops on values already in registers.

StefanKarpinski · 2012-03-15T01:35:44Z

Maybe a keyword option for this: alg="kahan". We could also implement recursive and sorted summation algorithms.
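A keyword-selected algorithm might look roughly like the sketch below. Everything here is an assumption for illustration: the function name `select_sum` is hypothetical, and `alg` takes a Symbol (`:naive`, `:sorted`, `:kahan`) rather than the string suggested above.

```julia
# Hypothetical function; shows how an `alg` keyword could pick the algorithm.
function select_sum(x::AbstractVector{<:AbstractFloat}; alg::Symbol=:naive)
    z = zero(eltype(x))
    if alg === :naive
        # Plain left-to-right accumulation.
        return foldl(+, x; init=z)
    elseif alg === :sorted
        # Add small magnitudes first so they are less likely to be absorbed.
        return foldl(+, sort(x; by=abs); init=z)
    elseif alg === :kahan
        # Kahan compensated summation.
        s = z
        c = z
        for xi in x
            y = xi - c
            t = s + y
            c = (t - s) - y
            s = t
        end
        return s
    else
        throw(ArgumentError("unknown summation algorithm: $alg"))
    end
end
```

A call such as `select_sum(v; alg=:kahan)` then opts into compensated summation without a separate function name.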

JeffreySarnoff · 2012-07-05T06:43:35Z

sorry about that -- this is the part that matters

# bettersum.jl
#
# bettersum(Vector{Float64}) is more accurate and faster than kahansum()
#
# Jeffrey Sarnoff on 2012-Jul-05



# Kahan's compensated summation
# W. Kahan.
# Further remarks on reducing truncation errors.
# Comm. ACM, 8:40, 1965


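function kahansum(x)
    n = length(x)
    n == 0 && return 0

    s = x[1]        # running sum
    c = zero(s)     # compensation carried from the previous step
    for i in 2:n
        y = x[i] - c        # apply the carried correction to the next term
        t = s + y           # low-order bits of y may be lost here
        c = (t - s) - y     # rounding error of that addition
        s = t
    end
    return s
end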


# Kahan and Babuska summation, Neumaier variant
# A. Neumaier.
# Rundungsfehleranalyse einiger Verfahren zur Summation endlicher Summen.
# Z. Angew. Math. Mech., 54:39–51, 1974.

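function bettersum(x)
    n = length(x)
    n == 0 && return 0

    s = x[1]        # running sum
    c = zero(s)     # accumulated compensation, added once at the end
    for i in 2:n
        t = s + x[i]
        if abs(s) >= abs(x[i])
            c += (s - t) + x[i]    # low-order bits of x[i] were lost
        else
            c += (x[i] - t) + s    # low-order bits of s were lost
        end
        s = t
    end

    return s + c
end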


# test vector is Tim Peters'
# truesum( vec ) == 2_000.0

vec =  [1,1e100,1,-1e100]*1000

kbnsum = bettersum   # kbnsum is the truer name for bettersum

sum(vec)      == 0.0
kahansum(vec) == 0.0
kbnsum(vec)   == 2_000.0

[pao: syntax highlights]

JeffBezanson · 2012-07-06T04:01:42Z

Is kbnsum always better? Maybe we should use this by default for float arrays.

JeffreySarnoff · 2012-07-06T13:50:42Z

kbnsum (above implemented as bettersum, written as kbnsum --truer name-- in the test)
is never less accurate than kahansum. On long vectors with elements of similar magnitude,
the two approaches often give the same result. Whether it is due to LLVM's JIT alone or
to its combination with my hardware, kbnsum runs 4 times faster than kahansum on both
small and large vectors. Relative to sum, kahansum takes about 28 times as long and
kbnsum about 7 times as long.
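One way to check the accuracy side of this on a particular input is to compare each result against a high-precision reference. The sketch below restates both algorithms compactly so it is self-contained; the names `kahan`, `kbn`, and `abserr`, and the choice of 1024-bit BigFloat precision, are assumptions made for this example. It does not measure speed.

```julia
# Kahan compensated summation (compact restatement).
function kahan(x::AbstractVector{Float64})
    s = 0.0
    c = 0.0
    for xi in x
        y = xi - c
        t = s + y
        c = (t - s) - y
        s = t
    end
    return s
end

# Kahan-Babuska summation, Neumaier variant (compact restatement).
function kbn(x::AbstractVector{Float64})
    s = 0.0
    c = 0.0
    for xi in x
        t = s + xi
        c += abs(s) >= abs(xi) ? (s - t) + xi : (xi - t) + s
        s = t
    end
    return s + c
end

# Absolute error of f(x) against a BigFloat reference sum.
# 1024 bits is an assumed precision, chosen so that terms as far apart as
# 1e3 and 1e103 can be added exactly.
function abserr(f, x)
    setprecision(BigFloat, 1024) do
        ref = sum(big.(x))
        Float64(abs(big(f(x)) - ref))
    end
end

x = [1.0, 1e100, 1.0, -1e100] .* 1000
for (name, f) in (("kahan", kahan), ("kbn", kbn))
    println(name, ": absolute error = ", abserr(f, x))
end
```

The default BigFloat precision (256 bits) would not be enough for this test vector, which is why the reference is computed at a higher precision.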

I recommend using kbnsum for Julia until there is a compelling reason to use a different
algorithm. Some of the alternate choices are best used for vectors longer than some n.
There are not that many alternatives, and, for me, part of getting comfortable with a new
programming language is porting or implementing better numerics. Given Julia's nature,
it is likely that effort will be covered. Kbnsum has the virtue of being straightforward.
The alternatives involve more lines of code. I have used them elsewhere, but have not
coded them in Julia (yet). If I find something is notably better, you will hear about it.
Meanwhile, and perhaps for a long while, kbnsum will work for you when others test
against languages that use kahansum internally.


ViralBShah · 2013-04-18T09:14:20Z

Now that we have optional arguments, perhaps sum and cumsum can have an option for KBN summation, and we can remove sum_kbn and cumsum_kbn.

JeffreySarnoff · 2013-04-18T10:31:18Z

If so,
with sum(..., kbn=false) the default,
one should be able to override the default and use kbn-summation
everywhere throughout a third party package on one day and use, say, the
package's default summation another day with a single override (without
requiring each call to sum, cumsum to be changed).


JeffreySarnoff · 2013-04-18T10:56:07Z

e.g.

pkg.Require("CumsumAnalysis", kbn=true)
passes the package-level option kbn=true to CumsumAnalysis,
and CumsumAnalysis, if it requires other packages, passes that package-level option on to them by default
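The "single override" idea could be approximated today with a module-level default that the keyword falls back to. This is only a sketch of one possible design: the module `SummationDefault`, the functions `total` and `with_kbn`, and the `USE_KBN` flag are all hypothetical names, and no such mechanism exists in Julia's package system.

```julia
# Hypothetical module; all names are illustrative only.
module SummationDefault

# Process-wide default: KBN summation is off unless overridden.
const USE_KBN = Ref(false)

# Kahan-Babuska-Neumaier compensated summation.
function kbn_sum(x)
    s = zero(eltype(x))
    c = zero(eltype(x))
    for xi in x
        t = s + xi
        c += abs(s) >= abs(xi) ? (s - t) + xi : (xi - t) + s
        s = t
    end
    return s + c
end

# The `kbn` keyword defaults to the current global setting,
# which is read at call time.
function total(x; kbn::Bool=USE_KBN[])
    return kbn ? kbn_sum(x) : foldl(+, x; init=zero(eltype(x)))
end

# Run `f` with the default temporarily set to `flag`, then restore it.
function with_kbn(f, flag::Bool)
    old = USE_KBN[]
    USE_KBN[] = flag
    try
        return f()
    finally
        USE_KBN[] = old
    end
end

end # module
```

With this layout, code that calls `SummationDefault.total(x)` without a keyword picks up whatever default is active, so wrapping a whole computation in `SummationDefault.with_kbn(() -> ..., true)` switches every such call at once. A mutable global like this is not thread-safe, which is one reason to treat it as a sketch rather than a recommendation.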

stevengj · 2013-08-14T03:59:16Z

PS. For interested parties on this thread, note that we now use pairwise summation (#4039), which is often surprisingly close to Kahan summation for large arrays, but without the performance penalty.


JeffBezanson closed this as completed in 19ff52a Jul 6, 2012

kmsquire pushed a commit to kmsquire/julia that referenced this issue Jul 11, 2012

use K-B-N summation for float arrays, with thanks to @JeffreySarnoff

7767e51

written carefully it is no more than 20% slower closes JuliaLang#199

kmsquire mentioned this issue Sep 6, 2012

Updated cumsum to use K-B-N summation for float arrays. #1257

Merged

StefanKarpinski pushed a commit that referenced this issue Feb 8, 2018

Merge pull request #199 from JuliaLang/tk/dontexportstring

b00c9a8

Don't export String since it is already exported by Base

oschulz mentioned this issue Nov 8, 2019

Add unconjugated dot product dotu #27677

Open

helloinrm mentioned this issue May 20, 2020

EXCEPTION_ACCESS_VIOLATION when using LIBSVM.jl on large datasets #35954

Closed

Keno pushed a commit that referenced this issue Oct 9, 2023

ensure si doesn't step over anything (#199)

26a9a66
