Shanks transformation

In numerical analysis, the Shanks transformation is a non-linear series acceleration method to increase the rate of convergence of a sequence. This method is named after Daniel Shanks, who rediscovered this sequence transformation in 1955. It was first derived and published by R. Schmidt in 1941.^[1]

One can calculate only a few terms of a perturbation expansion, usually no more than two or three, and almost never more than seven. The resulting series is often slowly convergent, or even divergent. Yet those few terms contain a remarkable amount of information, which the investigator should do his best to extract.
This viewpoint has been persuasively set forth in a delightful paper by Shanks (1955), who displays a number of amazing examples, including several from fluid mechanics.

Milton D. Van Dyke (1975) Perturbation methods in fluid mechanics, p. 202.

Formulation

For a sequence $\left\{a_{m}\right\}_{m\in \mathbb {N} }$ the series

A=\sum _{m=0}^{\infty }a_{m}\,

is to be determined. First, the partial sum $A_{n}$ is defined as:

A_{n}=\sum _{m=0}^{n}a_{m}\,

and forms a new sequence $\left\{A_{n}\right\}_{n\in \mathbb {N} }$ . Provided the series converges, $A_{n}$ will also approach the limit $A$ as $n\to \infty .$ The Shanks transformation $S(A_{n})$ of the sequence $A_{n}$ is the new sequence defined by^[2]^[3]

S(A_{n})={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}=A_{n+1}-{\frac {(A_{n+1}-A_{n})^{2}}{(A_{n+1}-A_{n})-(A_{n}-A_{n-1})}}

where this sequence $S(A_{n})$ often converges more rapidly than the sequence $A_{n}.$ Further speed-up may be obtained by repeated use of the Shanks transformation, by computing $S^{2}(A_{n})=S(S(A_{n})),$ $S^{3}(A_{n})=S(S(S(A_{n}))),$ etc.

Note that the non-linear transformation as used in the Shanks transformation is essentially the same as used in Aitken's delta-squared process so that as with Aitken's method, the right-most expression in $S(A_{n})$ 's definition (i.e. $S(A_{n})=A_{n+1}-{\frac {(A_{n+1}-A_{n})^{2}}{(A_{n+1}-A_{n})-(A_{n}-A_{n-1})}}$ ) is more numerically stable than the expression to its left (i.e. $S(A_{n})={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}$ ). Both Aitken's method and the Shanks transformation operate on a sequence, but the sequence the Shanks transformation operates on is usually thought of as being a sequence of partial sums, although any sequence may be viewed as a sequence of partial sums.

Example

As an example, consider the slowly convergent series^[3]

4\sum _{k=0}^{\infty }(-1)^{k}{\frac {1}{2k+1}}=4\left(1-{\frac {1}{3}}+{\frac {1}{5}}-{\frac {1}{7}}+\cdots \right)

which has the exact sum π ≈ 3.14159265. The partial sum $A_{6}$ has only one digit accuracy, while six-figure accuracy requires summing about 400,000 terms.

In the table below, the partial sums $A_{n}$ , the Shanks transformation $S(A_{n})$ on them, as well as the repeated Shanks transformations $S^{2}(A_{n})$ and $S^{3}(A_{n})$ are given for $n$ up to 12. The figure to the right shows the absolute error for the partial sums and Shanks transformation results, clearly showing the improved accuracy and convergence rate.

$n$	$A_{n}$	$S(A_{n})$	$S^{2}(A_{n})$	$S^{3}(A_{n})$
0	4.00000000	—	—	—
1	2.66666667	3.16666667	—	—
2	3.46666667	3.13333333	3.14210526	—
3	2.89523810	3.14523810	3.14145022	3.14159936
4	3.33968254	3.13968254	3.14164332	3.14159086
5	2.97604618	3.14271284	3.14157129	3.14159323
6	3.28373848	3.14088134	3.14160284	3.14159244
7	3.01707182	3.14207182	3.14158732	3.14159274
8	3.25236593	3.14125482	3.14159566	3.14159261
9	3.04183962	3.14183962	3.14159086	3.14159267
10	3.23231581	3.14140672	3.14159377	3.14159264
11	3.05840277	3.14173610	3.14159192	3.14159266
12	3.21840277	3.14147969	3.14159314	3.14159265

The Shanks transformation $S(A_{1})$ already has two-digit accuracy, while the original partial sums only establish the same accuracy at $A_{24}.$ Remarkably, $S^{3}(A_{3})$ has six digits accuracy, obtained from repeated Shank transformations applied to the first seven terms $A_{0},\ldots ,A_{6}.$ As mentioned before, $A_{n}$ only obtains 6-digit accuracy after summing about 400,000 terms.

Motivation

The Shanks transformation is motivated by the observation that — for larger $n$ — the partial sum $A_{n}$ quite often behaves approximately as^[2]

A_{n}=A+\alpha q^{n},\,

with $|q|<1$ so that the sequence converges transiently to the series result $A$ for $n\to \infty .$ So for $n-1,$ $n$ and $n+1$ the respective partial sums are:

A_{n-1}=A+\alpha q^{n-1}\quad ,\qquad A_{n}=A+\alpha q^{n}\qquad {\text{and}}\qquad A_{n+1}=A+\alpha q^{n+1}.

These three equations contain three unknowns: $A,$ $\alpha$ and $q.$ Solving for $A$ gives^[2]

A={\frac {A_{n+1}\,A_{n-1}\,-\,A_{n}^{2}}{A_{n+1}-2A_{n}+A_{n-1}}}.

In the (exceptional) case that the denominator is equal to zero: then $A_{n}=A$ for all $n.$

Generalized Shanks transformation

The generalized kth-order Shanks transformation is given as the ratio of the determinants:^[4]

S_{k}(A_{n})={\frac {\begin{vmatrix}A_{n-k}&\cdots &A_{n-1}&A_{n}\\\Delta A_{n-k}&\cdots &\Delta A_{n-1}&\Delta A_{n}\\\Delta A_{n-k+1}&\cdots &\Delta A_{n}&\Delta A_{n+1}\\\vdots &&\vdots &\vdots \\\Delta A_{n-1}&\cdots &\Delta A_{n+k-2}&\Delta A_{n+k-1}\\\end{vmatrix}}{\begin{vmatrix}1&\cdots &1&1\\\Delta A_{n-k}&\cdots &\Delta A_{n-1}&\Delta A_{n}\\\Delta A_{n-k+1}&\cdots &\Delta A_{n}&\Delta A_{n+1}\\\vdots &&\vdots &\vdots \\\Delta A_{n-1}&\cdots &\Delta A_{n+k-2}&\Delta A_{n+k-1}\\\end{vmatrix}}},

with $\Delta A_{p}=A_{p+1}-A_{p}.$ It is the solution of a model for the convergence behaviour of the partial sums $A_{n}$ with $k$ distinct transients:

A_{n}=A+\sum _{p=1}^{k}\alpha _{p}q_{p}^{n}.

This model for the convergence behaviour contains $2k+1$ unknowns. By evaluating the above equation at the elements $A_{n-k},A_{n-k+1},\ldots ,A_{n+k}$ and solving for $A,$ the above expression for the kth-order Shanks transformation is obtained. The first-order generalized Shanks transformation is equal to the ordinary Shanks transformation: $S_{1}(A_{n})=S(A_{n}).$

The generalized Shanks transformation is closely related to Padé approximants and Padé tables.^[4]

Note: The calculation of determinants requires many arithmetic operations to make, however Peter Wynn discovered a recursive evaluation procedure called epsilon-algorithm which avoids calculating the determinants.^[5]^[6]

Notes

^ Weniger (2003).
^ ^a ^b ^c Bender & Orszag (1999), pp. 368–375.
^ ^a ^b Van Dyke (1975), pp. 202–205.
^ ^a ^b Bender & Orszag (1999), pp. 389–392.
^ Wynn (1956)
^ Wynn (1962)

References

Shanks, D. (1955), "Non-linear transformation of divergent and slowly convergent sequences", Journal of Mathematics and Physics, 34 (1–4): 1–42, doi:10.1002/sapm19553411
Schmidt, R.J. (1941), "On the numerical solution of linear simultaneous equations by an iterative method", Philosophical Magazine, 32 (214): 369–383, doi:10.1080/14786444108520797
Van Dyke, M.D. (1975), Perturbation methods in fluid mechanics (annotated ed.), Parabolic Press, ISBN 0-915760-01-0
Bender, C.M.; Orszag, S.A. (1999), Advanced mathematical methods for scientists and engineers, Springer, ISBN 0-387-98931-5
Weniger, E.J. (1989). "Nonlinear sequence transformations for the acceleration of convergence and the summation of divergent series". Computer Physics Reports. 10 (5–6): 189–371. arXiv:math.NA/0306302. Bibcode:1989CoPhR..10..189W. doi:10.1016/0167-7977(89)90011-7.
Brezinski, C.; Redivo-Zaglia, M.; Saad, Y. (2018), "Shanks sequence transformations and Anderson acceleration", SIAM Review, 60 (3): 646–669, doi:10.1137/17M1120725, hdl:11577/3270110
Senhadji, M.N. (2001), "On condition numbers of the Shanks transformation", J. Comput. Appl. Math., 135 (1): 41–61, Bibcode:2001JCoAM.135...41S, doi:10.1016/S0377-0427(00)00561-6
Wynn, P. (1956), "On a device for computing the e_m(S_n) transformation", Mathematical Tables and Other Aids to Computation, 10 (54): 91–96, doi:10.2307/2002183, JSTOR 2002183
Wynn, P. (1962), "Acceleration techniques for iterated vector and matrix problems", Math. Comp., 16 (79): 301–322, doi:10.1090/S0025-5718-1962-0145647-X

[1] Weniger (2003).

[BenderOrszag368-2] Bender & Orszag (1999), pp. 368–375.

[VanDyke-3] Van Dyke (1975), pp. 202–205.

[BenderOrszag389-4] Bender & Orszag (1999), pp. 389–392.

[5] Wynn (1956)

[6] Wynn (1962)

[1]

[2]

[3]

[4]

[5]

[6]