Fixed-point computation refers to the process of computing an exact or approximate fixed point of a given function.[1] In its most common form, the given function f satisfies the conditions of the Brouwer fixed-point theorem: that is, f is continuous and maps the unit d-cube to itself. The Brouwer fixed-point theorem guarantees that f has a fixed point, but the proof is not constructive. Various algorithms have been devised for computing an approximate fixed point. Such algorithms are used in economics for computing a market equilibrium, in game theory for computing a Nash equilibrium, and in dynamic system analysis.
The unit interval is denoted by E := [0, 1], and the unit d-dimensional cube is denoted by E^d. A continuous function f is defined on E^d (from E^d to itself). Often, it is assumed that f is not only continuous but also Lipschitz continuous, that is, for some constant L, |f(x) − f(y)| ≤ L⋅|x − y| for all x, y in E^d.
A fixed point of f is a point x in E^d such that f(x) = x. By the Brouwer fixed-point theorem, any continuous function from E^d to itself has a fixed point. But for general functions, it is impossible to compute a fixed point precisely, since it can be an arbitrary real number. Fixed-point computation algorithms therefore look for an approximate fixed point. Common criteria are:[2]

- The residual criterion: given ε > 0, a point x in E^d is an ε-residual fixed point if |f(x) − x| ≤ ε.
- The absolute criterion: given δ > 0, a point x is a δ-absolute fixed point if |x − x0| ≤ δ, where x0 is some fixed point of f.
- The relative criterion: given δ > 0, a point x is a δ-relative fixed point if |x − x0|/|x0| ≤ δ, where x0 is some fixed point of f.
For Lipschitz-continuous functions, the absolute criterion is stronger than the residual criterion: if f is Lipschitz-continuous with constant L, then |x − x0| ≤ δ implies |f(x) − f(x0)| ≤ L⋅δ. Since x0 is a fixed point of f, this implies |f(x) − x0| ≤ L⋅δ, so |f(x) − x| ≤ (1 + L)⋅δ. Therefore, a δ-absolute fixed point is also an ε-residual fixed point with ε = (1 + L)⋅δ.
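This implication can be checked numerically. The sketch below uses a hypothetical contraction f(x) = 0.5x + 0.25 (whose unique fixed point is x0 = 0.5) and verifies that a point within δ of x0 has residual at most (1 + L)⋅δ:

```python
# Check that a delta-absolute fixed point is an ((1 + L) * delta)-residual
# fixed point, for a hypothetical contraction f(x) = 0.5 * x + 0.25.
L = 0.5                                 # Lipschitz constant of f
f = lambda x: L * x + 0.25              # unique fixed point at x0 = 0.5
x0 = 0.5
delta = 0.01

x = x0 + delta                          # a delta-absolute fixed point
residual = abs(f(x) - x)                # |f(x) - x|
assert residual <= (1 + L) * delta      # the bound from the text holds
print(residual)                         # about 0.005, well below 0.015
```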
The most basic step of a fixed-point computation algorithm is a value query: given any x in E^d, the algorithm is provided with an oracle that returns the value f(x). The accuracy of the approximate fixed point depends on the error of the oracle.
The function f is accessible via evaluation queries: for any x, the algorithm can evaluate f(x). The run-time complexity of an algorithm is usually given by the number of required evaluations.
A Lipschitz-continuous function with constant L is called contractive if L < 1, and weakly contractive if L ≤ 1. Every contractive function satisfying Brouwer's conditions has a unique fixed point. Moreover, fixed-point computation for contractive functions is easier than for general functions.
The first algorithm for fixed-point computation was the fixed-point iteration algorithm of Banach. Banach's fixed-point theorem implies that, when fixed-point iteration is applied to a contraction mapping, the error after t iterations is in O(L^t). Therefore, the number of evaluations required for a δ-relative fixed point is approximately log_L(δ) = log(δ)/log(L) = log(1/δ)/log(1/L). Sikorski and Wozniakowski[4] showed that Banach's algorithm is optimal when the dimension is large: specifically, when d ≥ log(1/δ)/log(1/L), the number of evaluations required by any algorithm for a δ-relative fixed point is more than 50% of the number required by the iteration algorithm. Note that as L approaches 1, the number of evaluations approaches infinity; no finite algorithm can compute a δ-absolute fixed point for all functions with L = 1.[5]
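A minimal sketch of Banach's iteration; the example map (a contraction with L = 0.5) and the stopping tolerance are illustrative assumptions:

```python
def fixed_point_iteration(f, x0=0.0, tol=1e-9, max_iter=10_000):
    """Iterate x <- f(x); for a contraction, the error shrinks like O(L^t)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx - x) <= tol:    # stop when the residual is small
            return fx
        x = fx
    return x

f = lambda x: 0.5 * x + 0.25      # contraction on [0, 1] with L = 0.5
print(fixed_point_iteration(f))   # converges to the unique fixed point 0.5
```

Each step multiplies the distance to the fixed point by at most L, which is exactly why the query count scales like log(1/δ)/log(1/L).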
When L < 1 and d = 1, the optimal algorithm is the Fixed Point Envelope (FPE) algorithm of Sikorski and Wozniakowski.[4] It finds a δ-relative fixed point using O(log(1/δ) + log log(1/(1 − L))) queries, and a δ-absolute fixed point using O(log(1/δ)) queries. This is faster than the fixed-point iteration algorithm.[6]
When d > 1 but not too large, and L ≤ 1, the optimal algorithm is the interior-ellipsoid algorithm (based on the ellipsoid method).[7] It finds an ε-residual fixed point using O(d⋅log(1/ε)) evaluations. When L < 1, it finds a δ-absolute fixed point using O(d⋅[log(1/δ) + log(1/(1 − L))]) evaluations.
Shellman and Sikorski[8] presented an algorithm called BEFix (Bisection Envelope Fixed-point) for computing an ε-residual fixed point of a two-dimensional function with L ≤ 1, using only 2⌈log2(1/ε)⌉ + 1 queries. They later[9] presented an improvement called BEDFix (Bisection Envelope Deep-cut Fixed-point), with the same worst-case guarantee but better empirical performance. When L < 1, BEDFix can also compute a δ-absolute fixed point, using O(log(1/ε) + log(1/(1 − L))) queries.
Shellman and Sikorski[2] presented an algorithm called PFix for computing an ε-residual fixed point of a d-dimensional function with L ≤ 1, using O(log^d(1/ε)) queries. When L < 1, PFix can be executed with ε = (1 − L)⋅δ, in which case it computes a δ-absolute fixed point, using O(log^d(1/[(1 − L)δ])) queries. It is more efficient than the iteration algorithm when L is close to 1. The algorithm is recursive: it handles a d-dimensional function via recursive calls on (d−1)-dimensional functions.
When the function f is differentiable, and the algorithm can evaluate its derivative (not only f itself), the Newton method can be used, and it is much faster.[10][11]
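As a sketch, Newton's method can be applied to g(x) = f(x) − x, whose roots are the fixed points of f; the example map (cos, which maps [0, 1] into itself) and the starting point are illustrative assumptions:

```python
import math

def newton_fixed_point(f, fprime, x0, tol=1e-12, max_iter=50):
    """Find a root of g(x) = f(x) - x by Newton's method, using f'."""
    x = x0
    for _ in range(max_iter):
        gx = f(x) - x
        if abs(gx) <= tol:              # x is an approximate fixed point
            return x
        x -= gx / (fprime(x) - 1.0)     # Newton step: x - g(x)/g'(x)
    return x

# Example: f = cos on [0, 1]; its fixed point is approximately 0.7390851
x = newton_fixed_point(math.cos, lambda t: -math.sin(t), x0=0.5)
print(x)
```

Near a fixed point with f′(x) ≠ 1, convergence is quadratic, so the number of correct digits roughly doubles per derivative evaluation.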
For functions with Lipschitz constant L > 1, computing a fixed point is much harder.
For a one-dimensional function (d = 1), a δ-absolute fixed point can be found using O(log(1/δ)) queries with the bisection method: start with the interval E := [0, 1]; at each iteration, let x be the center of the current interval, and compute f(x); if f(x) > x, recurse on the sub-interval to the right of x; otherwise, recurse on the sub-interval to the left of x. The current interval always contains a fixed point, so after O(log(1/δ)) queries, any point in the remaining interval is a δ-absolute fixed point of f. Setting δ := ε/(L + 1), where L is the Lipschitz constant, gives an ε-residual fixed point, using O(log(L/ε)) = O(log(L) + log(1/ε)) queries.[3]
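A sketch of this bisection procedure; the test function cos (which maps [0, 1] into itself) is an illustrative choice:

```python
import math

def bisect_fixed_point(f, delta):
    """delta-absolute fixed point of a continuous f: [0, 1] -> [0, 1].

    Invariant: f(lo) >= lo and f(hi) <= hi, so [lo, hi] always contains
    a fixed point; each query halves the interval (O(log(1/delta)) total).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > delta:
        mid = (lo + hi) / 2
        if f(mid) > mid:
            lo = mid       # a fixed point lies to the right of mid
        else:
            hi = mid       # a fixed point lies to the left of mid
    return (lo + hi) / 2

print(bisect_fixed_point(math.cos, 1e-9))  # ~0.739085, the fixed point of cos
```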
For functions in two or more dimensions, the problem is much more challenging. Shellman and Sikorski[2] proved that for any integers d ≥ 2 and L > 1, finding a δ-absolute fixed point of d-dimensional L-Lipschitz functions might require infinitely many evaluations. The proof idea is as follows: for any integer T > 1 and any sequence of T evaluation queries (possibly adaptive), one can construct two functions that are Lipschitz-continuous with constant L and yield the same answers to all these queries, but one of them has a unique fixed point at (x, 0) and the other has a unique fixed point at (x, 1). Any algorithm using T evaluations cannot distinguish between these functions, so it cannot find a δ-absolute fixed point. This holds for any finite integer T.
Several algorithms based on function evaluations have been developed for finding an ε-residual fixed point.
In the worst case, the number of function evaluations required by all these algorithms is exponential in the binary representation of the accuracy, that is, in Ω(1/ε).
Hirsch, Papadimitriou and Vavasis proved[3] that any algorithm based on function evaluations that finds an ε-residual fixed point of f requires Ω(L′/ε) function evaluations, where L′ is the Lipschitz constant of the function f(x) − x (note that L − 1 ≤ L′ ≤ L + 1). More precisely:
The latter result leaves a gap in the exponent. Chen and Deng[18] closed the gap: they proved that, for any d ≥ 2 with 1/ε > 4d and L′/ε > 192d^3, the number of queries required for computing an ε-residual fixed point is in Θ((L′/ε)^(d−1)).
A discrete function is a function defined on a subset of ℤ^d (the d-dimensional integer grid). There are several discrete fixed-point theorems, stating conditions under which a discrete function has a fixed point. For example, the Iimura–Murota–Tamura theorem states that (in particular) if f is a function from a rectangular subset of ℤ^d to itself, and f is hypercubic direction-preserving, then f has a fixed point.
Let f be a direction-preserving function from the integer cube {1, …, n}^d to itself. Chen and Deng[18] proved that, for any d ≥ 2 and n > 48d, computing such a fixed point requires Θ(n^(d−1)) function evaluations.
Chen and Deng[19] define a different discrete fixed-point problem, which they call 2D-BROUWER. It considers a discrete function f on {0, …, n}^2 such that, for every x on the grid, f(x) − x is either (0, 1), (1, 0), or (−1, −1). The goal is to find a square in the grid in which all three labels occur. The function f must map the square {0, …, n}^2 to itself, so it must map the lines x = 0 and y = 0 to either (0, 1) or (1, 0); the line x = n to either (−1, −1) or (0, 1); and the line y = n to either (−1, −1) or (1, 0). The problem can be reduced to 2D-SPERNER (computing a fully-labeled triangle in a triangulation satisfying the conditions of Sperner's lemma), and is therefore PPAD-complete. This implies that computing an approximate fixed point is PPAD-complete even for very simple functions.
Given a function g from E^d to ℝ, a root of g is a point x in E^d such that g(x) = 0. An ε-root of g is a point x in E^d such that g(x) ≤ ε.
Fixed-point computation is a special case of root-finding: given a function f on E^d, define g(x) := |f(x) − x|. A point x is a fixed point of f if and only if x is a root of g, and x is an ε-residual fixed point of f if and only if x is an ε-root of g. Therefore, any root-finding algorithm (an algorithm that computes an approximate root of a function) can be used to find an approximate fixed point.
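The reduction is a one-line wrapper; the example contraction below is an assumed illustration:

```python
def as_root_problem(f):
    """Wrap f so that fixed points of f become roots of g(x) = |f(x) - x|."""
    return lambda x: abs(f(x) - x)

g = as_root_problem(lambda x: 0.5 * x + 0.25)  # f has its fixed point at 0.5
print(g(0.5))   # 0.0: the fixed point of f is an exact root of g
print(g(0.6))   # about 0.05: so 0.6 is an eps-root of g for any eps >= 0.05
```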
The opposite is not true: finding an approximate root of a general function may be harder than finding an approximate fixed point. In particular, Sikorski[20] proved that finding an ε-root requires Ω(1/ε^d) function evaluations. This gives an exponential lower bound even for a one-dimensional function (in contrast, an ε-residual fixed point of a one-dimensional function can be found using O(log(1/ε)) queries with the bisection method). Here is a proof sketch:[3]: 35  construct a function g that is slightly larger than ε everywhere in E^d except in a small cube around some point x0, where x0 is the unique root of g. If g is Lipschitz continuous with constant L, then the cube around x0 can have side length ε/L. Any algorithm that finds an ε-root of g must check a set of cubes that covers the entire E^d; the number of such cubes is at least (L/ε)^d.
However, there are classes of functions for which finding an approximate root is equivalent to finding an approximate fixed point. One example[18] is the class of functions g such that g(x) + x maps E^d to itself (that is, g(x) + x is in E^d for all x in E^d). This is because, for every such function, the function f(x) := g(x) + x satisfies the conditions of Brouwer's fixed-point theorem. A point x is a fixed point of f if and only if x is a root of g, and x is an ε-residual fixed point of f if and only if x is an ε-root of g. Chen and Deng[18] show that the discrete variants of these problems are computationally equivalent: both require Θ(n^(d−1)) function evaluations.
Roughgarden and Weinstein[21] studied the communication complexity of computing an approximate fixed point. In their model, there are two agents: one knows a function f, and the other knows a function g. Both functions are Lipschitz continuous and satisfy Brouwer's conditions. The goal is to compute an approximate fixed point of the composite function g ∘ f. They show that the deterministic communication complexity is in Ω(2^d).