The Karmarkar–Karp (KK) bin packing algorithms are several related approximation algorithms for the bin packing problem.[1] The bin packing problem is the problem of packing items of different sizes into bins of identical capacity, such that the total number of bins is as small as possible. Finding the optimal solution is computationally hard. Karmarkar and Karp devised an algorithm that runs in polynomial time and finds a solution with at most $\mathrm{OPT} + O(\log^2(\mathrm{OPT}))$ bins, where OPT is the number of bins in the optimal solution. They also devised several other algorithms with slightly different approximation guarantees and run-time bounds.
The KK algorithms were considered a breakthrough in the study of bin packing: the previously known algorithms gave multiplicative approximations, where the number of bins was at most $r\cdot \mathrm{OPT} + s$ for some constants $r>1, s>0$, or at most $(1+\varepsilon)\mathrm{OPT} + 1$.[2] The KK algorithms were the first to attain an additive approximation.
The input to a bin-packing problem is a set of items of different sizes, $a_1,\ldots,a_n$, and a bin capacity B. Given an instance I, we denote:
- OPT(I) = the minimal number of bins needed to pack all items of I;
- FOPT(I) = $\lceil (a_1+\cdots+a_n)/B \rceil$, the "fractional optimum": the number of bins needed if items could be cut and split between bins.

Obviously, FOPT(I) ≤ OPT(I).
The KK algorithms essentially solve the configuration linear program:
$$\text{minimize}~~\mathbf{1}\cdot\mathbf{x}\quad\text{s.t.}\quad A\mathbf{x}\ge\mathbf{n},\quad \mathbf{x}\ge 0,\quad \mathbf{x}\text{ integral}.$$
Here, A is a matrix with m rows, one for each distinct item size, and n is the vector holding the number of items of each size. Each column of A represents a feasible configuration - a multiset of item sizes, such that the sum of all these sizes is at most B. Let C denote the number of feasible configurations; then x is a vector of size C, and each element $x_c$ of x represents the number of times configuration c is used.
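To make the configuration LP concrete, here is a minimal Python sketch (the function name and example data are ours, not Karmarkar and Karp's) that enumerates the columns of A for a tiny instance by brute force. The KK algorithms never build A explicitly - avoiding that is the point of the machinery described below.

```python
from itertools import combinations_with_replacement

def feasible_configurations(sizes, B, max_items):
    """Enumerate all feasible configurations for distinct item sizes.

    Each configuration is returned as a tuple (c_1, ..., c_m), where c_i
    is the number of items of sizes[i] it contains; these tuples are the
    columns of the matrix A in the configuration LP.
    """
    configs = set()
    for r in range(1, max_items + 1):
        for combo in combinations_with_replacement(range(len(sizes)), r):
            if sum(sizes[i] for i in combo) <= B:
                configs.add(tuple(combo.count(i) for i in range(len(sizes))))
    return sorted(configs)

# Example: sizes 5, 4, 3 with bin capacity B = 10 (at most 10 // 3 items fit).
print(feasible_configurations([5, 4, 3], B=10, max_items=3))
```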
There are two main difficulties in solving this problem. First, it is an integer linear program, which is computationally hard to solve. Second, the number of variables is C - the number of configurations - which may be enormous. The KK algorithms cope with these difficulties using several techniques, some of which were already introduced by de la Vega and Lueker.[2] Here is a high-level description of the algorithm (where $I$ is the original instance):

1. Remove the small items from $I$.
2. Group the remaining items, obtaining a rounded instance with few distinct sizes.
3. Solve (approximately) the fractional configuration LP of the rounded instance.
4. Round the fractional solution into an integral packing.
5. Add the small items back into the packing.

Below, we describe each of these steps in turn.
The motivation for removing small items is that, when all items are large, the number of items in each bin must be small, so the number of possible configurations is (relatively) small. We pick some constant $g\in(0,1)$, and remove from the original instance $I$ all items smaller than $g\cdot B$. Let $J$ be the resulting instance. Note that in $J$, each bin can contain at most $1/g$ items. We pack $J$ and get a packing with some $b_J$ bins.
Now, we add the small items into the existing bins in an arbitrary order, as long as there is room. When there is no more room in the existing bins, we open a new bin (as in next-fit bin packing; see the sketch below). Let $b_I$ be the number of bins in the final packing. Then:

$$b_I \le \max\big(b_J,\ (1+2g)\cdot \mathrm{OPT}(I)+1\big).$$

Proof. If no new bins are opened, then the number of bins remains $b_J$. If a new bin is opened, then all bins except perhaps the last one contain a total size of at least $B-g\cdot B$ - otherwise the small item would have fit into one of them. So the total instance size is at least $(1-g)\cdot B\cdot(b_I-1)$. Therefore, $\mathrm{FOPT}\ge(1-g)\cdot(b_I-1)$, so the optimal solution needs at least $(1-g)\cdot(b_I-1)$ bins. So $b_I\le \mathrm{OPT}/(1-g)+1=(1+g+g^{2}+\ldots)\mathrm{OPT}+1\le(1+2g)\mathrm{OPT}+1$, where the last inequality holds whenever $g\le 1/2$. In particular, taking g = 1/n, we get:

$$b_I \le \max\big(b_J,\ \mathrm{OPT}+2\cdot\mathrm{OPT}(I)/n+1\big)\le\max(b_J,\ \mathrm{OPT}+3),$$

since $\mathrm{OPT}(I)\le n$. Therefore, it is common to assume that all items are larger than $g\cdot B = B/n$.[4]
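The following sketch (our illustration, not KK's exact procedure) shows this second phase: small items are inserted greedily, and a new bin is opened only when no existing bin has room.

```python
def add_small_items(bins, small_items, B):
    """Add small items to an existing packing of the large items.

    `bins` is a list of lists of item sizes.  Each small item is placed
    in the first bin with enough room; if none has room, a new bin is
    opened.  If a new bin is ever opened, every earlier bin is filled to
    more than B - g*B, which is what the proof above uses.
    """
    for item in small_items:
        for b in bins:
            if sum(b) + item <= B:
                b.append(item)
                break
        else:                      # no existing bin had room
            bins.append([item])
    return bins
```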
The motivation for grouping items is to reduce the number of different item sizes, and thereby the number of constraints in the configuration LP. The general grouping process is: order the items by descending size, partition them into groups, and round the size of every item up to the largest size in its group. There are several different grouping methods.
Let $k>1$ be an integer parameter. Put the largest $k$ items in group 1; the next-largest $k$ items in group 2; and so on (the last group might have fewer than $k$ items). Let $J$ be the original instance. Let $K'$ be the first group (the group of the $k$ largest items), and $K$ the grouped instance without the first group. Then:
- Every item of $J$ is at most the item put in its place in $K\cup K'$: group 1 of $J$ is exactly $K'$, and each item in a later group is only rounded up.
- $K'$ contains only $k$ items, so $\mathrm{OPT}(K')\le k$.
- In $K$, each rounded size in group $i$ is at most every original size in group $i-1$, so $\mathrm{OPT}(K)\le\mathrm{OPT}(J)$; moreover, $K$ has at most $n/k$ different sizes.

Therefore, $\mathrm{OPT}(J)\le\mathrm{OPT}(K\cup K')\le\mathrm{OPT}(K)+\mathrm{OPT}(K')\le\mathrm{OPT}(K)+k$. Indeed, given a solution to $K$ with $b_K$ bins, we can get a solution to $J$ with at most $b_K+k$ bins.
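A minimal sketch of linear grouping (names are ours); it returns the k largest items, to be packed separately in at most k bins, together with the rounded instance K:

```python
def linear_grouping(sizes, k):
    """Linear grouping with parameter k.

    Sort sizes in descending order, cut them into consecutive groups of
    k items, and round every item up to the largest size in its group.
    Returns (K_prime, K): the first group (un-rounded) and the rounded
    remainder, which has at most n/k distinct sizes.
    """
    s = sorted(sizes, reverse=True)
    K_prime = s[:k]                      # the k largest items
    K = []
    for i in range(k, len(s), k):
        group = s[i:i + k]
        K += [group[0]] * len(group)     # round up to the group maximum
    return K_prime, K
```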
Let $k>1$ be an integer parameter. Geometric grouping proceeds in two steps:
- Partition the instance $J$ into several instances $J_0, J_1, \ldots$, such that in instance $J_r$ all sizes lie in $[B\cdot 2^{-r-1},\ B\cdot 2^{-r})$.
- Apply linear grouping separately to each instance $J_r$, with a grouping parameter that grows like $k\cdot 2^{r}$ (a bin can hold about $2^{r}$ items of class $r$, so larger groups cost the same number of extra bins).

Both the number of different sizes in the grouped instance and the number of additional bins can then be bounded in terms of k and g.
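The sketch below (reusing linear_grouping from the previous sketch) follows this two-step scheme; the exact per-class grouping parameter is our reading of the method and should be treated as an assumption.

```python
def geometric_grouping(sizes, B, k):
    """Geometric grouping: split into size classes, then group per class.

    Class r holds the sizes in [B / 2**(r+1), B / 2**r).  The linear-
    grouping parameter doubles with each class (k * 2**r, an assumption
    here); the first group of each class is returned separately, as in
    linear grouping.  Assumes all sizes are positive.
    """
    classes = {}
    for a in sizes:
        r = 0
        while a < B / 2 ** (r + 1):      # find the size class of item a
            r += 1
        classes.setdefault(r, []).append(a)
    rounded, separate = [], []
    for r, cls in sorted(classes.items()):
        K_prime, K = linear_grouping(cls, k * 2 ** r)
        separate += K_prime
        rounded += K
    return rounded, separate
```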
Let $k>1$ be an integer parameter. Order the items by descending size. Partition them into groups such that the total size in each group is at least $k\cdot B$ (see the sketch below). Since the size of each item is less than B, the number of items in each group is at least $k+1$. The number of items in each group is weakly increasing. If all items are larger than $g\cdot B$, then the number of items in each group is at most $k/g$. In each group, only the larger items are rounded up. This can be done so that the number of different sizes and the number of additional bins remain small.
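A sketch of the partition step only (our illustration; the rounding step is omitted, since its details depend on the analysis): a group is closed as soon as its total size reaches $k\cdot B$.

```python
def partition_by_total_size(sizes, k, B):
    """Partition items, sorted descending, into groups of total size >= k*B.

    Because every item is smaller than B, each closed group contains at
    least k + 1 items; if every item is larger than g*B, each group
    contains at most about k/g items.
    """
    s = sorted(sizes, reverse=True)
    groups, current, total = [], [], 0.0
    for item in s:
        current.append(item)
        total += item
        if total >= k * B:               # group reached the threshold
            groups.append(current)
            current, total = [], 0.0
    if current:                          # leftover items form a last group
        groups.append(current)
    return groups
```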
We consider the configuration linear program without the integrality constraints:
$$\text{minimize}~~\mathbf{1}\cdot\mathbf{x}\quad\text{s.t.}\quad A\mathbf{x}\ge\mathbf{n},\quad \mathbf{x}\ge 0.$$
Here, we are allowed to use a fractional number of each configuration.
Denote the optimal value of this linear program by LOPT. The following relations are obvious:
- $\mathrm{LOPT}(I)\le\mathrm{OPT}(I)$, since the fractional LP is a relaxation of the integral program;
- $\mathrm{LOPT}(I)\ge(a_1+\cdots+a_n)/B$, since each configuration contains a total size of at most B.
A solution to the fractional LP can be rounded to an integral solution as follows. Take a basic feasible solution x of the fractional LP; such a solution has at most m nonzero entries. Round each entry $x_c$ down to $\lfloor x_c\rfloor$ and use that many bins with configuration c. The items left uncovered are exactly those covered by the at most m fractional parts, and a careful packing of them requires at most m/2 additional bins.
This also implies that $\mathrm{OPT}(I)\le\mathrm{LOPT}(I)+m/2$.
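The sketch below illustrates the floor-then-repack idea (a simplified illustration of ours; KK's careful treatment of the fractional parts is what yields the m/2 bound, which this sketch does not reproduce).

```python
import math

def round_lp_solution(x, configs, sizes, counts, B):
    """Round a fractional configuration-LP solution to an integral packing.

    Use floor(x_c) copies of each configuration c, then pack the items
    that remain uncovered with first-fit decreasing.  Returns the total
    number of bins used.
    """
    covered = [0] * len(sizes)
    bins = 0
    for xc, c in zip(x, configs):
        copies = math.floor(xc)
        bins += copies
        for i, ci in enumerate(c):
            covered[i] += copies * ci
    residual = []                        # items not covered by the floors
    for i, need in enumerate(counts):
        residual += [sizes[i]] * max(0, need - covered[i])
    loads = []                           # first-fit decreasing on the rest
    for item in sorted(residual, reverse=True):
        for j, load in enumerate(loads):
            if load + item <= B:
                loads[j] += item
                break
        else:
            loads.append(item)
    return bins + len(loads)
```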
The main challenge in solving the fractional LP is that it may have a huge number of variables - a variable for each possible configuration.
The dual linear program of the fractional LP is:
$$\text{maximize}~~\mathbf{n}\cdot\mathbf{y}\quad\text{s.t.}\quad A^{T}\mathbf{y}\le\mathbf{1},\quad \mathbf{y}\ge 0.$$
It has m variables $y_1,\ldots,y_m$, and C constraints - one for each configuration. It has the following economic interpretation. For each size $s_i$, we should determine a nonnegative price $y_i$. Our profit is the total price of all items. We want to maximize the profit $\mathbf{n}\cdot\mathbf{y}$, subject to the constraint that the total price of the items in each configuration is at most 1. This LP now has only m variables, but a huge number of constraints; even listing all the constraints is infeasible.
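For intuition, the dual of a tiny instance can be solved directly once A is small enough to enumerate. The demo below assumes SciPy is available and reuses the feasible_configurations sketch from above; linprog minimizes, so we negate the objective.

```python
import numpy as np
from scipy.optimize import linprog

sizes = [5, 4, 3]                        # distinct item sizes
counts = np.array([2, 2, 2])             # the vector n
configs = feasible_configurations(sizes, B=10, max_items=3)
A = np.array(configs).T                  # one column per configuration

# maximize n.y  subject to  A^T y <= 1, y >= 0
res = linprog(-counts, A_ub=A.T, b_ub=np.ones(A.shape[1]),
              bounds=[(0, None)] * len(sizes))
print("prices y* =", res.x, " dual value =", -res.fun)
```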
Fortunately, it is possible to solve the problem up to any given precision without listing all the constraints, by using a variant of the ellipsoid method. This variant gets as input a separation oracle: a function that, given a vector y ≥ 0, returns one of the following two options:
- an assertion that y is feasible, i.e., $A^{T}\mathbf{y}\le\mathbf{1}$; or
- a violated constraint, i.e., a configuration (a column $\mathbf{a}$ of A) with $\mathbf{a}\cdot\mathbf{y}>1$.

The ellipsoid method starts with a large ellipsoid that contains the entire feasible domain $A^{T}\mathbf{y}\le\mathbf{1}$. At each step t, it takes the center $\mathbf{y}_t$ of the current ellipsoid, and sends it to the separation oracle:
- if the oracle returns a violated constraint $\mathbf{a}\cdot\mathbf{y}\le 1$, the method makes a "feasibility cut", keeping only the part of the ellipsoid that satisfies this constraint;
- if $\mathbf{y}_t$ is feasible, the method makes an "optimality cut" $\mathbf{n}\cdot\mathbf{y}\ge\mathbf{n}\cdot\mathbf{y}_t$, keeping only the points with a higher objective value.
After making a cut, we construct a new, smaller ellipsoid. It can be shown that this process converges to an approximate solution, in time polynomial in the required accuracy.
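For reference, a single central-cut update is given by standard textbook formulas; the sketch below (our illustration) performs one cut $\mathbf{a}\cdot(\mathbf{y}-\mathbf{y}_t)\le 0$, which covers both feasibility and optimality cuts.

```python
import numpy as np

def ellipsoid_step(c, P, a):
    """One central cut of the ellipsoid {y : (y-c)^T P^-1 (y-c) <= 1}.

    Returns the center and shape matrix of the smallest ellipsoid that
    contains the half-ellipsoid {y : a.(y - c) <= 0}.  Assumes the
    dimension is at least 2.
    """
    m = len(c)
    Pa = P @ a
    b = Pa / np.sqrt(a @ Pa)                     # normalized cut direction
    c_new = c - b / (m + 1)                      # shifted center
    P_new = (m * m / (m * m - 1.0)) * (P - (2.0 / (m + 1)) * np.outer(b, b))
    return c_new, P_new
```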
The separation problem is as follows. We are given m non-negative numbers $y_1,\ldots,y_m$. We have to decide between the following two options:
- every feasible configuration has total value at most 1 (so y is feasible); or
- there exists a feasible configuration whose total value is larger than 1 (a violated constraint).

This problem can be solved by solving a knapsack problem, where the item values are $y_1,\ldots,y_m$, the item weights are $s_1,\ldots,s_m$, and the weight capacity is B (the bin size): y is feasible if and only if the optimal knapsack value is at most 1.
The knapsack problem can be solved by dynamic programming in pseudo-polynomial time $O(m\cdot V)$, where m is the number of inputs and V is the number of different possible values. To get a polynomial-time algorithm, we can solve the knapsack problem approximately, using input rounding. Suppose we want a solution with tolerance $\delta$. We can round each of $y_1,\ldots,y_m$ down to the nearest multiple of $\delta/n$. Then, the number of possible values between 0 and 1 is $n/\delta$, and the run-time is $O(mn/\delta)$. Since a configuration contains at most n items, the rounded solution is at least the optimal solution minus $n\cdot(\delta/n)=\delta$.
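Here is a sketch of the resulting approximate oracle (our illustration). It rounds the prices down to multiples of $\delta/n$ and runs an unbounded-knapsack dynamic program over the value grid: minw[v] is the minimum total size achieving rounded value v·(δ/n). If a value above 1 is reachable within capacity B, a violating configuration is reconstructed.

```python
def approximate_separation_oracle(y, sizes, B, delta, n):
    """Return a configuration with price > 1, or None if y is feasible
    up to an additive tolerance of delta."""
    step = delta / n
    vals = [int(yi / step) for yi in y]          # rounded-down prices
    target = int(1 / step)                       # integer value meaning "1"
    V = target + max(vals, default=0) + 1        # value-grid upper bound
    INF = float("inf")
    minw = [INF] * (V + 1)                       # min weight to reach value v
    minw[0] = 0.0
    choice = [-1] * (V + 1)                      # item used to reach value v
    for v in range(1, V + 1):
        for i, (vi, si) in enumerate(zip(vals, sizes)):
            if 0 < vi <= v and minw[v - vi] + si < minw[v]:
                minw[v] = minw[v - vi] + si
                choice[v] = i
    best = max(v for v in range(V + 1) if minw[v] <= B)
    if best <= target:                           # no constraint violated
        return None
    config = [0] * len(sizes)                    # rebuild a violating column
    v = best
    while v > 0:
        config[choice[v]] += 1
        v -= vals[choice[v]]
    return config
```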
The ellipsoid method should be adapted to use such an approximate separation oracle: given the current ellipsoid center $\mathbf{y}_t$, the same feasibility and optimality cuts are made, based on the oracle's approximate answer.
Using the approximate separation oracle gives a feasible solution y* to the dual LP, with $\mathbf{n}\cdot\mathbf{y^{*}}\ge \mathrm{LOPT}-\delta$, after at most $Q$ iterations, where $Q = 4m^{2}\ln(mn/g\delta)$. The total run-time of the ellipsoid method with the approximate separation oracle is $O(Qmn/\delta)$.
During the ellipsoid method, we use at most Q constraints of the form $\mathbf{a}\cdot\mathbf{y}\le 1$. All the other constraints can be eliminated, since they have no effect on the outcome y* of the ellipsoid method. We can eliminate even more constraints. It is known that, in any LP with m variables, there is a set of m constraints that is sufficient for determining the optimal solution (that is, the optimal value is the same even if only these m constraints are used). We can repeatedly run the ellipsoid method as above, each time trying to remove a specific set of constraints. If the resulting error is at most $\delta$, then we remove these constraints permanently. It can be shown that we need at most $\approx (Q/m)+m\ln(Q/m)$ eliminations, so the accumulating error is at most $\approx \delta\cdot[(Q/m)+m\ln(Q/m)]$. If we try sets of constraints deterministically, then in the worst case, one out of m trials succeeds, so we need to run the ellipsoid method at most $\approx m\cdot[(Q/m)+m\ln(Q/m)] = Q+m^{2}\ln(Q/m)$ times. If we choose the constraints to remove at random, then the expected number of iterations is $O(m)\cdot[1+\ln(Q/m)]$.
Finally, we have a reduced dual LP, with only m variables and m constraints. The optimal value of the reduced LP is at least $\mathrm{LOPT}-h$, where $h\approx \delta\cdot[(Q/m)+m\ln(Q/m)]$.
By the LP duality theorem, the minimum value of the primal LP equals the maximum value of the dual LP, which we denoted by LOPT. Once we have a reduced dual LP, we take its dual, obtaining a reduced primal LP. This LP has only m variables, corresponding to only m of the C configurations. The maximum value of the reduced dual LP is at least $\mathrm{LOPT}-h$. It can be shown that the optimal value of the reduced primal LP is at most $\mathrm{LOPT}+h$. The solution gives a near-optimal bin packing, using at most m configurations.
The total run-time of the deterministic algorithm, when all items are larger than $g\cdot B$, is:
$$O\left(\frac{Qmn}{\delta}\cdot\Big(Q+m^{2}\ln\frac{Q}{m}\Big)\right) = O\left(\frac{Q^{2}mn+Qm^{3}n\ln\frac{Q}{m}}{\delta}\right) \approx O\left(m^{8}\ln m\,\ln^{2}\Big(\frac{mn}{gh}\Big)+\frac{m^{4}n\ln m}{h}\,\ln\Big(\frac{mn}{gh}\Big)\right).$$
The expected total run-time of the randomized algorithm is: $O\left(m^{7}\log m\,\log^{2}\big(\frac{mn}{gh}\big)+\frac{m^{4}n\log m}{h}\,\log\big(\frac{mn}{gh}\big)\right)$.
Karmarkar and Karp presented three algorithms that use the above techniques with different parameters. The run-time of all these algorithms depends on a function $T(\cdot,\cdot)$, which is a polynomial function describing the time it takes to solve the fractional LP with tolerance h = 1, which is, for the deterministic version, $T(m,n)\in O(m^{8}\log m\log^{2}n + m^{4}n\log m\log n)$.
Let $\epsilon>0$ be a constant representing the desired approximation accuracy.
All in all, the number of bins is in $(1+\epsilon)\mathrm{OPT}+O(\epsilon^{-2})$ and the run-time is in $O(n\log n + T(\epsilon^{-2},n))$. By choosing $\epsilon=\mathrm{OPT}^{-1/3}$ we get $\mathrm{OPT}+O(\mathrm{OPT}^{2/3})$, as shown below.
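The last step is simple arithmetic: with $\epsilon=\mathrm{OPT}^{-1/3}$ we have $\epsilon^{-2}=\mathrm{OPT}^{2/3}$, so

$$(1+\mathrm{OPT}^{-1/3})\,\mathrm{OPT}+O(\mathrm{OPT}^{2/3}) = \mathrm{OPT}+\mathrm{OPT}^{2/3}+O(\mathrm{OPT}^{2/3}) = \mathrm{OPT}+O(\mathrm{OPT}^{2/3}).$$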
Let $g>0$ be a real parameter and $k>0$ an integer parameter to be determined later.
The run-time is in $O(n\log n + T(\mathrm{FOPT}(J)/k+\ln(1/g),\ n))$.
Now, if we choose k=2 and g=1/FOPT(I), we get:
$$b_J \le \mathrm{OPT} + O(\log^{2}(\mathrm{FOPT})),$$
and hence:
$$b_I \le \max\big(b_J,\ \mathrm{OPT}+2\,\mathrm{OPT}/\mathrm{FOPT}+1\big) \le \max(b_J,\ \mathrm{OPT}+5) \in \mathrm{OPT}+O(\log^{2}(\mathrm{OPT})),$$
so the total number of bins is in $\mathrm{OPT}+O(\log^{2}(\mathrm{FOPT}))$. The run-time is $O(n\log n) + T(\mathrm{FOPT}/2+\ln(\mathrm{FOPT}),\ n) \in O(n\log n + T(\mathrm{FOPT},n))$.
The same algorithm can be used with different parameters to trade off run-time against accuracy. For some parameter $\alpha\in(0,1)$, choose $k=\mathrm{FOPT}^{\alpha}$ and $g=1/\mathrm{FOPT}^{1-\alpha}$. Then, the packing needs at most $\mathrm{OPT}+O(\mathrm{OPT}^{\alpha})$ bins, and the run-time is in $O(n\log n + T(\mathrm{FOPT}^{1-\alpha},\ n))$.
The third algorithm is useful when the number of sizes m is small (see also high-multiplicity bin packing).
It uses at most $\mathrm{OPT}+O(\log^{2}m)$ bins, and the run-time is in $O(n\log n + T(m,n))$.
The KK techniques were later improved to provide even better approximations.
Rothvoss[4] uses the same scheme as Algorithm 2, but with a different rounding procedure in Step 2. He introduced a "gluing" step, in which small items are glued together to yield a single larger item. This gluing can be used to increase the smallest item size to about $B/\log^{12}(n)$. When all sizes are at least $B/\log^{12}(n)$, we can substitute $g=1/\log^{12}(n)$ in the guarantee of Algorithm 2, and get:
$$b_J \le \mathrm{OPT}(I) + O(\log(\mathrm{FOPT})\log\log(n)),$$
which yields a packing with at most $\mathrm{OPT}+O(\log(\mathrm{OPT})\cdot\log\log(\mathrm{OPT}))$ bins.
Hoberg and Rothvoss[5] use a similar scheme in which the items are first packed into "containers", and then the containers are packed into bins. Their algorithm needs at most $\mathrm{OPT}(I)+O(\log(\mathrm{OPT}))$ bins.