Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement a better summation algorithm #199

Closed
JeffBezanson opened this issue Sep 19, 2011 · 10 comments
Closed

implement a better summation algorithm #199

JeffBezanson opened this issue Sep 19, 2011 · 10 comments
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request

Comments

@JeffBezanson
Copy link
Member

sum should use a better algorithm, or at least we should provide an alternative function that does a better job. Candidates include Kahan summation (http://en.wikipedia.org/wiki/Kahan_summation_algorithm) and pairwise summation.

@StefanKarpinski
Copy link
Member

By pairwise summation, I assume you mean recursive pairwise, as in this sort of thing:

sum(x::Vector) = length(x) == 0 ? 0 :
                 length(x) == 1 ? x[1] :
                 sum(x[1:div(end-1,2)]) + sum(x[div(end+1,2):end])

@JeffBezanson
Copy link
Member Author

Kahan summation looks promising since it looks like it can be done with just a couple extra arithmetic ops on values already in registers.

@StefanKarpinski
Copy link
Member

Maybe a keyword option for this: alg="kahan". We could also implement recursive and sorted summation algorithms.

@JeffreySarnoff
Copy link
Contributor

sorry about that -- this is the part that matters

# bettersum.jl
#
# bettersum(Vector{Float64}) is more accurate and faster than kahansum()
#
# Jeffrey Sarnoff on 2012-Jul-05



# Kahan's compensated summation
# W. Kahan.
# Further remarks on reducing truncation erros.
# Comm. ACM, 8:40, 1965


function kahansum(x)
    n = length(x)
    if (n==0)  return(0)  end

    s = x[1]
    c = 0
    for i in 2:n
      y = x[i] - c
      t = s + y
      c = (t - s) -y
      s = t
    end
    s
end    


# Kahan and Babuska summation, Neumaier variant
# A. Neumaier.
# Rundungsfehleranalyse einiger Verfahren zur Summation endlicher Summen.
# Math. Mechanik, 54:39–51, 1974.

function bettersum(x)
    n = length(x)
    if (n == 0)   return(0)  end

    s = x[1]
    c = 0
    for i in 2:n
        t = s + x[i]
        if ( abs(s) >= abs(x[i]) )
           c += ( (s-t) + x[i] )
        else
           c += ( (x[i]-t) + s )
        end
        s = t
    end

    s + c
end


# test vector is Tim Peters'
# truesum( vec ) == 2_000.0

vec =  [1,1e100,1,-1e100]*1000

sum(vec)      == 0.0
kahansum(vec) == 0.0
kbnsum(vec)   == 2_000.0

[pao: syntax highlights]

@JeffBezanson
Copy link
Member Author

Is kbnsum always better? Maybe we should use this by default for float arrays.

@JeffreySarnoff
Copy link
Contributor

kbnsum (above implemented as bettersum, written as kbnsum --truer name-- in the test)
is never less accurate than kahansum. On long vectors with elements of similar magnitude,
the two approaches often give the same result. Whether its LLVM jitness alone or with my
hardware, kbnsum runs 4 times faster than kahansum on both small and large vectors.
Relative to sum, kahansum runs about 28:1 and kbnsum runs about 7:1.

I recommend using kbnsum for Julia until there is compelling reason to use a different
algorithm. Some of the alternate choices are best used for vectors longer than some n.
There are not that many alternatives, and, for me, part of getting comfortable with a new
programming language is porting or implementing better numerics. Given Julia's nature,
it is likely that effort will be covered. Kbnsum has the virtue of being straightforward.
The alternatives involve more lines of code. I have used them elsewhere, but have not
coded them Julia (yet). If I find something is notably better, you will hear about it.
Meanwhile, and perhaps for a long while, kbnsum will work for you when others test
against languages that use kahansum internally.

kmsquire pushed a commit to kmsquire/julia that referenced this issue Jul 11, 2012
written carefully it is no more than 20% slower
closes JuliaLang#199
@ViralBShah
Copy link
Member

Now that we have optional arguments, perhaps sum and cumsum can have an option for KBN summation, and we can remove sum_kbn and cumsum_kbn.

@JeffreySarnoff
Copy link
Contributor

If so,​
with sum(..., kbn=false) the default,
​one​ should be able to override the default and use kbn-summation
everywhere throughout a third party package on one day and use, say, the
package's default summation another day with a single override (without
requiring each call to sum, cumsum to be changed).

On Thu, Apr 18, 2013 at 5:14 AM, Viral B. Shah notifications@github.comwrote:

Now that we have optional arguments, perhaps sum and cumsum can have an
option for KBN summation, and we can remove sum_kbn and cumsum_kbn.


Reply to this email directly or view it on GitHubhttps://github.com//issues/199#issuecomment-16565856
.

@JeffreySarnoff
Copy link
Contributor

e.g.

pkg.Require("CumsumAnalysis", kbn=true)
passes the package level option kbn=true to CumsumAnalysis
and CumsumAnalysis, if it require other packages, repasses that package level option by default

@stevengj
Copy link
Member

PS. For interested parties on this thread, note that we now use pairwise summation (#4039), which is often surprisingly close to Kahan summation for large arrays, but without the performance penalty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Indicates that a maintainer wants help on an issue or pull request
Projects
None yet
Development

No branches or pull requests

5 participants