In computer science, the largest differencing method is an algorithm for solving the partition problem and the multiway number partitioning. It is also called the Karmarkar–Karp algorithm after its inventors, Narendra Karmarkar and Richard M. Karp.[1] It is often abbreviated as LDM.[2][3]
The input to the algorithm is a set S of numbers, and a parameter k. The required output is a partition of S into k subsets, such that the sums in the subsets are as nearly equal as possible. The main steps of the algorithm are:
For k=2, the main step (2) works as follows.
For example, if S = {8,7,6,5,4}, then the resulting difference-sets are {6,5,4,1} after taking out the largest two numbers {8,7} and inserting the difference 8-7=1 back; Repeat the steps and then we have {4,1,1}, then {3,1} then {2}.
Step 3 constructs the subsets in the partition by backtracking. The last step corresponds to {2},{}. Then 2 is replaced by 3 in one set and 1 in the other set: {3},{1}, then {4},{1,1}, then {4,5}, {1,6}, then {4,7,5}, {8,6}, where the sum-difference is indeed 2.
The runtime complexity of this algorithm is dominated by the step 1 (sorting), which takes O(n log n).
Note that this partition is not optimal: in the partition {8,7}, {6,5,4} the sum-difference is 0. However, there is evidence that it provides a "good" partition:
For any k ≥ 2, the algorithm can be generalized in the following way.[2]
Examples:
There is evidence for the good performance of LDM:[2]
Several variants of LDM were developed for the balanced number partitioning problem, in which all subsets must have the same cardinality (up to 1).
PDM (Paired Differencing Method) works as follows.[6]
PDM has average properties worse than LDM. For two-way partitioning, when inputs are uniformly-distributed random variables, the expected difference between largest and smallest sum is Θ ( 1 / n ) {\displaystyle \Theta (1/n)} .
RLDM (Restricted Largest Differencing Method) works as follows.[7]
For two-way partitioning, when inputs are uniformly-distributed random variables, the expected difference between largest and smallest sum is O ( log n / n 2 ) {\displaystyle O(\log {n}/n^{2})} .
BLDM (Balanced Largest Differencing Method) works as follows.[3]
BLDM has average properties similar to LDM. For two-way partitioning, when inputs are uniformly-distributed random variables, the expected difference between largest and smallest sum is n − Θ ( log n ) {\displaystyle n^{-\Theta (\log n)}} .[3]
For multi-way partitioning, when c=ceiling(n/k) and each of the k subsets must contain either ceiling(n/k) or floor(n/k) items, the approximation ratio of BLDM for the minimum largest sum is exactly 4/3 for c=3, 19/12 for c=4, 103/60 for c=5, 643/360 for c=6, and 4603/2520 for c=7. The ratios were found by solving a mixed integer linear program. In general (for any c), the approximation ratio is at least 2 − ∑ j = 0 c − 1 j ! c ! {\displaystyle 2-\sum _{j=0}^{c-1}{\frac {j!}{c!}}} and at most 2 − 1 c − 1 {\displaystyle 2-{\frac {1}{c-1}}} . The MILP results for 3,4,5,6,7 correspond to the lower bound. When the parameter is the number of subsets (k), the approximation ratio is exactly 2 − 1 k {\displaystyle 2-{\frac {1}{k}}} .[8]
In the min-max subsequence problem, the input is a multiset of n numbers and an integer parameter k, and the goal is to order the numbers such that the largest sum of each block of adjacent k numbers is as small as possible. The problem occurs in the design of video servers.[9] This problem can be solved in polytime for k=2, but it is strongly NP-hard for k≥3. A variance of the differencing method can applied to this problem.[10]
The complete Karmarkar–Karp algorithm (CKK) finds an optimal solution by constructing a tree of degree k ! {\displaystyle k!} .
For k=2, CKK runs substantially faster than the Complete Greedy Algorithm (CGA) on random instances. This is due to two reasons: when an equal partition does not exist, CKK often allows more trimming than CGA; and when an equal partition does exist, CKK often finds it much faster and thus allows earlier termination. Richard E. Korf reports that CKK can optimally partition 40 15-digit double-precision numbers in about 3 hours, while CGA requires about 9 hours. In practice, with k=2, problems of arbitrary size can be solved by CKK if the numbers have at most 12 significant digits; with k=3, at most 6 significant digits.[11]
CKK can also run as an anytime algorithm: it finds the KK solution first, and then finds progressively better solutions as time allows (possibly requiring exponential time to reach optimality, for the worst instances).[12]
Combining CKK with the balanced-LDM algorithm (BLDM) yields a complete anytime algorithm for solving the balanced partition problem.[13]
An algorithm equivalent to the Karmarkar-Karp differencing heuristic is mentioned in ancient Jewish legal texts by Nachmanides and Joseph ibn Habib. The algorithm is used to combine different testimonies about the same loan.[14]