# Richardson package for Julia
The `Richardson` package provides a function `extrapolate` that extrapolates any function `f(x)` to `f(x0)`, evaluating `f` only at a geometric sequence of points `> x0` (or optionally `< x0`) or at a given sequence of points. `f(x)` can return scalars, vectors, or any type implementing a normed vector space.

The key algorithm is Richardson extrapolation using a Neville–Aitken tableau, which adaptively increases the degree of an extrapolation polynomial until convergence is achieved to a desired tolerance (or convergence stalls due to e.g. floating-point errors). This allows one to obtain `f(x0)` to high-order accuracy, assuming that `f(x0+h)` has a Taylor series or some other power series in `h`. (See e.g. the course notes by Prof. Flaherty at RPI.)
## Usage
```jl
extrapolate(f, h; contract=0.125, x0=zero(h), power=1,
            atol=0, rtol=atol>0 ? 0 : sqrt(ε), maxeval=typemax(Int), breaktol=2)
```
Extrapolate `f(x)` to `f₀ ≈ f(x0)`, evaluating `f` only at points `x > x0` (or `x < x0` if `h < 0`) using Richardson extrapolation starting at `x=x₀+h`. It returns a tuple `(f₀, err)` of the estimated `f(x0)` and an error estimate.

The return value of `f` can be any type supporting `±` and `norm` operations (i.e. a normed vector space). Similarly, `h` and `x0` can be in any normed vector space, in which case `extrapolate` performs Richardson extrapolation of `f(x0+s*h)` to `s=0⁺` (i.e. it takes the limit as `x` goes to `x0` along the `h` direction).
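For instance, here is a minimal sketch (an illustrative example, not taken from the package documentation) of using a vector-valued `h` to take the limit of a two-variable function along a chosen direction; the particular function and direction are arbitrary:

```jl
using Richardson

# limit of sin(x + 2y)/(x + 2y) as (x,y) → (0,0) along the direction (1,1);
# extrapolate only evaluates f at x0 + s*h for s > 0, never at x0 itself:
val, err = extrapolate([1.0, 1.0]) do v
    x, y = v
    sin(x + 2y) / (x + 2y)
end
# val ≈ 1, the limiting value
```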
On each step of Richardson extrapolation, it shrinks `x-x0` by a factor of `contract`, stopping when the estimated error is `< max(rtol*norm(f₀), atol)`, when the estimated error increases by more than `breaktol` (e.g. due to numerical errors in the computation of `f`), when `f` returns a non-finite value (`NaN` or `Inf`), or when `f` has been evaluated `maxeval` times. Note that if the function may converge to zero, you may want to specify a nonzero `atol` (which cannot be set by default because it depends on the scale/units of `f`); alternatively, in such cases `extrapolate` will halt when it becomes limited by the floating-point precision. (Passing `breaktol=Inf` can be useful to force `extrapolate` to continue shrinking `h` even if polynomial extrapolation is initially failing to converge, possibly at the cost of extraneous function evaluations.)
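To illustrate the `atol` remark (a hypothetical example, not from the original documentation): if the limit is exactly zero, a purely relative tolerance can never be satisfied, so a small absolute tolerance is supplied instead:

```jl
# sin(x) → 0 as x → 0, so rtol*norm(f₀) ≈ 0 and only atol can terminate the iteration:
val, err = extrapolate(1.0, atol=1e-12) do x
    sin(x)
end
# val ≈ 0 (to within roughly atol)
```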
If `x0 = ±∞` (`±Inf`), then `extrapolate` computes the limit of `f(x)` as `x ⟶ ±∞` using geometrically increasing values of `h` (by factors of `1/contract`).
In general, the starting `h` should be large enough that `f(x0+h)` can be computed accurately and efficiently (e.g. without severe cancellation errors), but small enough that `f` does not oscillate much between `x0` and `x0+h`. That is, `h` should be a typical scale over which the function `f` varies significantly.
Technically, Richardson extrapolation assumes that `f(x0+h)` can be expanded in a power series in `h^power`, where the default `power=1` corresponds to an ordinary Taylor series (i.e. assuming `f` is analytic at `x0`). If this is not true, you may obtain slow convergence from `extrapolate`, but you can pass a different value of `power` (e.g. `power=0.5`) if your `f` has some different (Puiseux) power-series expansion. Conversely, if `f` is an even function around `x0`, i.e. `f(x0+h) == f(x0-h)`, so that its Taylor series contains only even powers of `h`, you can accelerate convergence by passing `power=2`.
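For example, here is a hypothetical illustration (not from the original README) of a function whose expansion is in powers of `√x` rather than `x`, so that `power=0.5` matches the Puiseux-series assumption:

```jl
# exp(√x) = 1 + x^(1/2) + x/2 + x^(3/2)/6 + ⋯ is a power series in x^0.5, not in x,
# so we tell extrapolate to build its extrapolation polynomial in h^0.5:
val, err = extrapolate(0.1, power=0.5) do x
    exp(sqrt(x))
end
# val ≈ 1 = exp(√0), the limit as x → 0⁺
```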
```jl
extrapolate(fh_itr; power=1, atol=0, rtol=0, maxeval=typemax(Int), breaktol=Inf)
```
Similar to `extrapolate(f, h)`, this performs Richardson extrapolation of a sequence of values `f(h)` to `h → 0`, but takes an iterable collection `fh_itr` of a sequence of `(f(h), h)` tuples (in order of decreasing `|h|`).

There is no `contract` keyword argument since the contraction factors are determined by the sequence of `h` values (which need not contract by the same amount). The tolerances `atol` and `rtol` both default to `0` so that by default it examines all of the values in the `fh_itr` collection. Otherwise, the keyword arguments have the same meanings as in `extrapolate(f, h)`.
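For instance, a minimal sketch (an illustrative example, not from the package documentation) of passing a precomputed sequence of `(f(h), h)` tuples:

```jl
# extrapolate sin(h)/h → 1 from values tabulated at a decreasing sequence of h:
hs = [1/2^k for k in 1:8]           # h = 0.5, 0.25, …, in order of decreasing |h|
fh = [(sin(h)/h, h) for h in hs]    # iterable of (f(h), h) tuples
val, err = extrapolate(fh)
# val ≈ 1
```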
```jl
extrapolate!(fh::AbstractVector; power=1, atol=0, rtol=0, maxeval=typemax(Int), breaktol=Inf)
```
Similar to `extrapolate(fh_itr)`, this performs Richardson extrapolation on an array `fh` of `(f(h), h)` tuples (in order of decreasing `|h|`), but overwrites the array `fh` in-place with intermediate calculations.

(Thus, the array `fh` must be a vector of `Tuple{T,H}` values, where `H<:Number` is the type of `h` and `T` is the type of the extrapolated `f(0)` result. This `T` should be a floating-point type, i.e. `fh` should contain `float(f(h))` if the function you are extrapolating is not already floating-point-valued.)
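For example, a small sketch (illustrative, not from the original README) of the in-place variant applied to a forward-difference derivative of `sin` at `x=1`:

```jl
# tabulate the forward-difference quotient at a decreasing sequence of h:
hs = [0.1 * 0.5^k for k in 0:7]
fh = [((sin(1+h) - sin(1))/h, h) for h in hs]   # Vector{Tuple{Float64,Float64}}
d, err = extrapolate!(fh)   # fh is overwritten with intermediate tableau values
# d ≈ cos(1)
```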
## Examples
For example, let's extrapolate `sin(x)/x` to `x=0` (where the correct answer is `1`) starting at `x=1`, printing out the `x` value at each step so that we can see what the algorithm is doing. (Since `f` is passed as the first argument to `extrapolate`, we can use Julia's `do` syntax to conveniently define a multi-line anonymous function to pass.)
```jl
extrapolate(1.0, rtol=1e-10) do x
    @show x
    sin(x)/x
end
```
giving the output:
```
x = 1.0
x = 0.125
x = 0.015625
x = 0.001953125
x = 0.000244140625
x = 3.0517578125e-5
(1.0000000000000002, 2.0838886172214188e-13)
```
That is, it evaluates our function `sin(x)/x` for 6 different values of `x` and returns `1.0000000000000002`, which is accurate to machine precision (the error is `≈ 2.2e-16`). The returned error estimate of `2e-13` is conservative, which is typical for extrapolating well-behaved functions.
Since `sin(x)/x` is an even (symmetric) function around `x=0`, its Taylor series contains only even powers of `x`. We can exploit this fact to accelerate convergence for even functions by passing `power=2` to `extrapolate`:
```jl
extrapolate(1.0, rtol=1e-10, power=2) do x
    @show x
    sin(x)/x
end
```
gives
```
x = 1.0
x = 0.125
x = 0.015625
x = 0.001953125
x = 0.000244140625
(1.0, 0.0)
```
which converged to machine precision (in fact, the exact result) in only 5 function evaluations (1 fewer than above).
## Infinite limits
Using the `x0` keyword argument, we can compute the limit of `f(x)` as `x ⟶ x0`. In fact, you can pass `x0 = Inf` to compute a limit as `x ⟶ ∞` (which is accomplished internally by a change of variables `x = 1/u` and performing Richardson extrapolation to `u=0`). For example:
```jl
extrapolate(1.0, x0=Inf) do x
    @show x
    (x^2 + 3x - 2) / (x^2 + 5)
end
```
gives
```
x = 1.0
x = 8.0
x = 64.0
x = 512.0
x = 4096.0
x = 32768.0
x = 262144.0
(1.0000000000000002, 1.2938539128981574e-12)
```
which is the correct result (`1.0`) to machine precision.
## Extrapolating series
One nice application of infinite limits is extrapolating infinite series. If we start with an integer `x`, then the default `contract=0.125` will increase `x` by a factor of `8.0` on each iteration, so `x` will always be an exact integer and we can use it as the number of terms in a series.

For example, suppose we are computing the infinite series `1/1² + 1/2² + 1/3² + ⋯`. This is the famous Basel problem, and it converges to `π²/6 ≈ 1.644934066848226…`. If we compute it by brute force, however, we need quite a few terms to get high accuracy:
```jl
julia> sum(n -> 1/n^2, 1:100) - π^2/6
-0.009950166663333482

julia> sum(n -> 1/n^2, 1:10^4) - π^2/6
-9.99950001654426e-5

julia> sum(n -> 1/n^2, 1:10^9) - π^2/6
-9.999985284281365e-10
```
Even with 10⁹ terms we get only about 9 digits. Instead, we can use `extrapolate` (starting at 1 term):
```jl
julia> val, err = extrapolate(1, x0=Inf) do N
           @show N
           sum(n -> 1/n^2, 1:Int(N))
       end
N = 1.0
N = 8.0
N = 64.0
N = 512.0
N = 4096.0
N = 32768.0
(1.6449340668482288, 4.384936858059518e-12)

julia> (val - π^2/6)/(π^2/6)
1.4848562646983628e-15
```
By `32768` terms, the extrapolated value is accurate to about 15 digits.
## Numerical derivatives
A classic application of Richardson extrapolation is the accurate evaluation of derivatives via finite-difference approximations (although analytical derivatives, e.g. by automatic differentiation, are typically vastly more efficient when they are available). In this example, we use Richardson extrapolation on the forward-difference approximation `f'(x) ≈ (f(x+h)-f(x))/h`, for which the error decreases as `O(h)` but a naive application to a very small `h` will yield a huge cancellation error from floating-point roundoff effects. We differentiate `f(x)=sin(x)` at `x=1`, for which the correct answer is `cos(1) ≈ 0.5403023058681397174009366...`, starting with `h=0.1`:
```jl
extrapolate(0.1, rtol=0) do h
    @show h
    (sin(1+h) - sin(1)) / h
end
```
Although we gave an `rtol` of `0`, the `extrapolate` function will terminate after a finite number of steps when it detects that improvements are limited by floating-point error:
```
h = 0.1
h = 0.0125
h = 0.0015625
h = 0.0001953125
h = 2.44140625e-5
h = 3.0517578125e-6
(0.5403023058683176, 1.7075230118734908e-12)
```
The output `0.5403023058683176` differs from `cos(1)` by `≈ 1.779e-13`, so in this case the returned error estimate is only a little conservative. Unlike the `sin(x)/x` example, `extrapolate` is not able to attain machine precision (the floating-point cancellation error in this function is quite severe for small `h`!), but it is able to get surprisingly close.
Another possibility for a finite-difference/Richardson combination was suggested by Ridders (1982), who computed both `f'(x)` and `f''(x)` (the first and second derivatives) simultaneously using a center-difference approximation, which requires two new `f(x)` evaluations for each `h`. In particular, the center-difference approximations are `f'(x) ≈ (f(x+h)-f(x-h))/2h` and `f''(x) ≈ (f(x+h)-2f(x)+f(x-h))/h²`, both of which have errors that go as `O(h²)`. We can plug both of these functions simultaneously into `extrapolate` (so that they share `f(x±h)` evaluations) by using a vector-valued function returning `[f', f'']`. Moreover, since these center-difference approximations are even functions of `h` (identical for `±h`), we can pass `power=2` to `extrapolate` in order to exploit the even-power Taylor expansion. Here is a function implementing both of these ideas:
```jl
# returns (f'(x), f''(x))
function riddersderiv2(f, x, h; atol=0, rtol=atol>0 ? 0 : sqrt(eps(typeof(float(real(x+h))))), contract=0.5)
    f₀ = f(x)
    val, err = extrapolate(h, atol=atol, rtol=rtol, contract=contract, power=2) do h
        f₊, f₋ = f(x+h), f(x-h)
        [(f₊-f₋)/2h, (f₊-2f₀+f₋)/h^2]
    end
    return val[1], val[2]
end
```
(This code could be made even more efficient by using StaticArrays.jl for the `[f', f'']` vector; a possible variant along these lines is sketched at the end of this section.) The original paper by Ridders accomplishes something similar in `< 20` lines of TI-59 calculator code, by the way; so much for high-level languages!
For example,
```jl
julia> riddersderiv2(1, 0.1, rtol=0) do x
           @show x
           sin(x)
       end
x = 1
x = 1.1
x = 0.9
x = 1.05
x = 0.95
x = 1.025
x = 0.975
x = 1.0125
x = 0.9875
x = 1.00625
x = 0.99375
(0.5403023058681394, -0.841470984807975)
```
evaluates the first and second derivatives of `sin(x)` at `x=1` and obtains the correct answer `(cos(1), -sin(1))` to about 15 and 13 decimal digits, respectively, using 11 function evaluations.
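As mentioned above, the heap-allocated `[f', f'']` vector could be replaced by a stack-allocated static vector from StaticArrays.jl. Here is one possible sketch of such a variant (an illustration assuming StaticArrays.jl is installed; it is not part of the original README):

```jl
using StaticArrays

# same idea as riddersderiv2, but returning an SVector to avoid allocating an array per step
function riddersderiv2_static(f, x, h; atol=0,
                              rtol=atol>0 ? 0 : sqrt(eps(typeof(float(real(x+h))))),
                              contract=0.5)
    f₀ = f(x)
    val, err = extrapolate(h, atol=atol, rtol=rtol, contract=contract, power=2) do h
        f₊, f₋ = f(x+h), f(x-h)
        SVector((f₊-f₋)/2h, (f₊-2f₀+f₋)/h^2)   # static 2-vector of (f′, f″) estimates
    end
    return val[1], val[2]
end
```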
## Handling problematic convergence
It is useful to consider a finite-difference approximation for the derivative of the function `1/x` at some `x ≠ 0`: i.e. computing the limit of `f(h) = (1/(x+h) - 1/x) / h` as `h` goes to zero, similar to above.

This function `f(h)` has a pole at `h=-x`, i.e. `f(-x)` blows up. This means that the Taylor series of `f(h)` only converges for `h` values small enough to avoid this pole, in fact for `|h| < |x|`. Since Richardson extrapolation is essentially approximating the Taylor series, the extrapolation process doesn't converge if the starting `h` is too large, and `extrapolate` will give up and halt with the wrong answer.
This lack of convergence is easily observed: set `x=0.01` (where the correct derivative of `1/x` is `-10000`) and consider what happens for a starting `h` that is too large compared to `|x|`:
```jl
julia> extrapolate(1.0) do h
           @show h
           x = 0.01
           (1/(x+h) - 1/x) / h
       end
h = 1.0
h = 0.125
h = 0.015625
(-832.4165749908325, 733.4066740007335)
```
Before reaching an `|h| < 0.01` where the power series could begin to converge, `extrapolate` gave up and returned a wrong answer (with a large error estimate to let you know that the result is garbage)! In contrast, if we start with a small enough `h`, then it converges just fine and returns the correct answer (`-10000`) to nearly machine precision:
```jl
julia> extrapolate(0.01) do h
           @show h
           x = 0.01
           (1/(x+h) - 1/x) / h
       end
h = 0.01
h = 0.00125
h = 0.00015625
h = 1.953125e-5
h = 2.44140625e-6
h = 3.0517578125e-7
(-10000.000000000211, 4.066770998178981e-6)
```
Of course, if you know that your function blows up like this, it is easy to choose a good initial `h`, but how can we persuade `extrapolate` to do a better job automatically?

The trick is to use the `breaktol` keyword argument. `breaktol` defaults to `2`, which means that `extrapolate` gives up if the best error estimate increases by more than a factor of 2 from one iteration to the next. Ordinarily, this kind of breakdown in convergence arises because you hit the limits of floating-point precision, and halting extrapolation is the right thing to do. Here, however, it would converge if we just continued shrinking `h`. So, we simply set `breaktol=Inf` to force extrapolation to continue, which works even for a large initial `h=1000.0`:
```jl
julia> extrapolate(1000.0, breaktol=Inf) do h
           @show h
           x = 0.01
           (1/(x+h) - 1/x) / h
       end
h = 1000.0
h = 125.0
h = 15.625
h = 1.953125
h = 0.244140625
h = 0.030517578125
h = 0.003814697265625
h = 0.000476837158203125
h = 5.9604644775390625e-5
h = 7.450580596923828e-6
h = 9.313225746154785e-7
h = 1.1641532182693481e-7
(-10000.000000029328, 5.8616933529265225e-8)
```
Note that it continues extrapolating until it reaches small `h` values where the power series converges, and in the end it again returns the correct answer to nearly machine precision (and would reach machine precision if we set a smaller `rtol`). (The `extrapolate` function automatically discards the initial points where the polynomial extrapolation fails.)