A separation oracle (also called a cutting-plane oracle) is a concept in the mathematical theory of convex optimization. It is a method to describe a convex set that is given as an input to an optimization algorithm. Separation oracles are used as input to ellipsoid methods.[1]: 87, 96, 98
Let K be a convex and compact set in Rn. A strong separation oracle for K is an oracle (black box) that, given a vector y in Rn, returns one of the following:[1]: 48
A strong separation oracle is completely accurate, and thus may be hard to construct. For practical reasons, a weaker version is considered, which allows for small errors in the boundary of K and the inequalities. Given a small error tolerance d>0, we say that:
The weak version also considers rational numbers, which have a representation of finite length, rather than arbitrary real numbers. A weak separation oracle for K is an oracle that, given a vector y in Qn and a rational number d>0, returns one of the following::[1]: 51
A special case of a convex set is a set represented by linear inequalities: K = { x | A x ≤ b } {\displaystyle K=\{x|Ax\leq b\}} . Such a set is called a convex polytope. A strong separation oracle for a convex polytope can be implemented, but its run-time depends on the input format.
If the matrix A and the vector b are given as input, so that K = { x | A x ≤ b } {\displaystyle K=\{x|Ax\leq b\}} , then a strong separation oracle can be implemented as follows.[2] Given a point y, compute A y {\displaystyle Ay} :
This oracle runs in polynomial time as long as the number of constraints is polynomial.
Suppose the set of vertices of K is given as an input, so that K = conv ( v 1 , … , v k ) = {\displaystyle K={\text{conv}}(v_{1},\ldots ,v_{k})=} the convex hull of its vertices. Then, deciding whether y is in K requires to check whether y is a convex combination of the input vectors, that is, whether there exist coefficients z1,...,zk such that: [1]: 49
This is a linear program with k variables and n equality constraints (one for each element of y). If y is not in K, then the above program has no solution, and the separation oracle needs to find a vector c such that
Note that the two above representations can be very different in size: it is possible that a polytope can be represented by a small number of inequalities, but has exponentially many vertices (for example, an n-dimensional cube). Conversely, it is possible that a polytope has a small number of vertices, but requires exponentially many inequalities (for example, the convex hull of the 2n vectors of the form (0,...,±1,...,0).
In some linear optimization problems, even though the number of constraints is exponential, one can still write a custom separation oracle that works in polynomial time. Some examples are:
Let f be a convex function on Rn. The set K = { ( x , t ) | f ( x ) ≤ t } {\displaystyle K=\{(x,t)|f(x)\leq t\}} is a convex set in Rn+1. Given an evaluation oracle for f (a black box that returns the value of f for every given point), one can easily check whether a vector (y, t) is in K. In order to get a separation oracle, we need also an oracle to evaluate the subgradient of f.[1]: 49 Suppose some vector (y, s) is not in K, so f(y) > s. Let g be the subgradient of f at y (g is a vector in Rn). Denote c := ( g , − 1 ) {\displaystyle c:=(g,-1)} .Then, c ⋅ ( y , s ) = g ⋅ y − s > g ⋅ y − f ( y ) {\displaystyle c\cdot (y,s)=g\cdot y-s>g\cdot y-f(y)} , and for all (x, t) in K: c ⋅ ( x , t ) = g ⋅ x − t ≤ g ⋅ x − f ( x ) {\displaystyle c\cdot (x,t)=g\cdot x-t\leq g\cdot x-f(x)} . By definition of a subgradient: f ( x ) ≥ f ( y ) + g ⋅ ( x − y ) {\displaystyle f(x)\geq f(y)+g\cdot (x-y)} for all x in Rn. Therefore, g ⋅ y − f ( y ) ≥ g ⋅ x − f ( x ) {\displaystyle g\cdot y-f(y)\geq g\cdot x-f(x)} , so c ⋅ ( y , s ) > c ⋅ ( x , t ) {\displaystyle c\cdot (y,s)>c\cdot (x,t)} , and c represents a separating hyperplane.
A strong separation oracle can be given as an input to the ellipsoid method for solving a linear program. Consider the linear program maximize c ⋅ x subject to A x ≤ b , x ≥ 0 {\displaystyle {\text{maximize}}~~c\cdot x~~{\text{subject to}}~~Ax\leq b,x\geq 0} . The ellipsoid method maintains an ellipsoid that initially contains the entire feasible domain A x ≤ b {\displaystyle Ax\leq b} . At each iteration t, it takes the center x t {\displaystyle x_{t}} of the current ellipsoid, and sends it to the separation oracle:
After making a cut, we construct a new, smaller ellipsoid, that contains the remaining region. It can be shown that this process converges to an approximate solution, in time polynomial in the required accuracy.
Given a weak separation oracle for a polyhedron, it is possible to construct a strong separation oracle by a careful method of rounding, or by diophantine approximations.[1]: 159