PERF/COMPAT: use kahan summation in .rolling(..).sum() #13254

Closed
@jreback

Description

from mailing list

Use Kahan summation to avoid precision errors. The rolling computations are evaluated marginally (sliding the window by adding new values and subtracting old ones), so floating-point errors tend to accumulate.
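To make the accumulation concrete, here is a minimal pure-Python sketch of a marginally updated rolling sum. It is not the pandas implementation: `naive_rolling_sum` and `reference_rolling_sum` are hypothetical helper names, and the input is deliberately contrived so that the rounding loss is easy to see. `math.fsum` recomputes each window from scratch as a reference.

```python
import math


def naive_rolling_sum(values, window):
    """Rolling sum updated marginally: add the entering value, subtract the leaving one."""
    out = []
    total = 0.0
    for i, x in enumerate(values):
        total += x                       # new value enters the window
        if i >= window:
            total -= values[i - window]  # old value leaves the window
        if i >= window - 1:
            out.append(total)
    return out


def reference_rolling_sum(values, window):
    """Recompute every window independently with math.fsum (accurate but slower)."""
    return [math.fsum(values[i:i + window])
            for i in range(len(values) - window + 1)]


# Contrived input: one huge value followed by small ones.
data = [1e16, 1.0, 1.0, 1.0]
print(naive_rolling_sum(data, 2))
print(reference_rolling_sum(data, 2))
```

The last two windows contain only `[1.0, 1.0]`, yet the marginal update reports `0.0` for both: the `1.0` added while `1e16` was still in the window was rounded away, and subtracting `1e16` later cannot bring it back. The error then carries into every subsequent window.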

It looks like just an extra add and subtract per step, so I think the perf impact would be minimal.
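As a rough illustration of how small the change could be, the sketch below applies the Kahan compensation step to both the add and the subtract of a marginal rolling sum. This is an assumption about how the idea might be wired in, not the actual pandas code; `kahan_rolling_sum` and its inner `add` helper are hypothetical names.

```python
def kahan_rolling_sum(values, window):
    """Marginal rolling sum with a Kahan compensation term carried across updates."""
    out = []
    total = 0.0
    comp = 0.0  # running compensation for low-order bits lost so far

    def add(x):
        nonlocal total, comp
        y = x - comp              # apply the correction from the previous step
        t = total + y             # low-order bits of y may be lost here
        comp = (t - total) - y    # capture what was lost (negated)
        total = t

    for i, x in enumerate(values):
        add(x)                        # new value enters the window
        if i >= window:
            add(-values[i - window])  # old value leaves: a compensated add of its negation
        if i >= window - 1:
            out.append(total)
    return out


# Contrived input: one huge value followed by small ones.
data = [1e16, 1.0, 1.0, 1.0]
print(kahan_rolling_sum(data, 2))
```

For this input the windows that contain only `[1.0, 1.0]` come out as `2.0`, because the `1.0` that could not be represented next to `1e16` is held in `comp` and folded back in on the next update. Compensation reduces the error but does not eliminate it for every input, so this should be read as a sketch of the cost (a few extra floating-point operations per update) and not as a guarantee of exact results.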

This might be able to be applied to .rolling(..).mean() as well. Note that .rolling(..).std/var already use Welford's algo for better precision.

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

implementation from numpy

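cdef double kahan_sum(double *darr, npy_intp n):
    # Compensated (Kahan) sum of the n doubles in darr; assumes n >= 1.
    cdef double c, y, t, sum
    cdef npy_intp i
    sum = darr[0]
    c = 0.0  # running compensation for lost low-order bits
    for i in range(1, n):
        y = darr[i] - c    # apply the correction carried from the previous step
        t = sum + y        # low-order bits of y may be lost here
        c = (t - sum) - y  # capture what was lost (negated)
        sum = t
    return sum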

Labels

Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff)
Numeric Operations (arithmetic, comparison, and logical operations)
Performance (memory or execution speed performance)
Window (rolling, ewma, expanding)
