Better Approximations for the Minimum Common Integer Partition Problem
Better Approximations for the Minimum Common Integer Partition Problem. David Woodruff. MIT and Tsinghua University. APPROX 2006.
Better Approximations for the Minimum Common Integer Partition Problem
David Woodruff
APPROX 2006
MIT and Tsinghua University
Minimum Common Integer Partition
• $X = \{x_1, \ldots, x_r\}$ and $Y = \{y_1, \ldots, y_s\}$ are multisets of positive integers, with $r \ge s$
• Consider a partition of $X$ into $s$ subsets $B_1, \ldots, B_s$
• If there exist $B_1, \ldots, B_s$ with $\sum_{b \in B_i} b = y_i$ for all $i$, then $X$ is an integer partition of $Y$. Think of $X$ as a refinement of $Y$
• k-MCIP problem: Given $Y_1, \ldots, Y_k$, find a smallest $X$ that is an integer partition of each of $Y_1, \ldots, Y_k$
• Let $m = \sum_{i=1}^{k} |Y_i|$. Efficiency is measured in terms of $m$.
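The definition can be checked mechanically on small inputs. The following Python sketch is a brute-force backtracking test, exponential in the worst case and meant only for toy instances. The function name `is_integer_partition` is illustrative and does not come from the talk.

```python
def is_integer_partition(X, Y):
    """Return True if multiset X can be split into len(Y) blocks whose sums are the y_i."""
    if sum(X) != sum(Y):
        return False
    remaining = list(Y)               # amount still to be filled in each block
    items = sorted(X, reverse=True)   # placing large items first prunes earlier

    def place(idx):
        if idx == len(items):
            return True               # total sums match, so every block is exactly full
        tried = set()
        for j, cap in enumerate(remaining):
            if cap >= items[idx] and cap not in tried:
                tried.add(cap)        # blocks with equal remaining capacity are interchangeable
                remaining[j] -= items[idx]
                if place(idx + 1):
                    return True
                remaining[j] += items[idx]   # undo and try the next block
        return False

    return place(0)
```

For instance, `is_integer_partition([1, 1, 2, 3], [2, 2, 3])` holds, while `[1, 2, 4]` is not an integer partition of `[2, 2, 3]` because 4 fits into no block.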
MCIP Example
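$Y_1 = \{2, 2, 3\}$, $Y_2 = \{1, 1, 5\}$
Claim: $\{1, 1, 2, 3\}$ = k-MCIP($Y_1$, $Y_2$)
Proof:
• Partition 1: $\{1, 1\}, \{2\}, \{3\}$
• Partition 2: $\{1\}, \{1\}, \{2, 3\}$
• So $\{1, 1, 2, 3\}$ is an integer partition of $Y_1$ and of $Y_2$
• Any integer partition of both $Y_1$ and $Y_2$ has size $\ge 4$: a common partition of size 3 would consist of singleton blocks only, forcing $Y_1 = Y_2$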
Applications
AA-AA-AAAA-AAA: {2, 2, 4, 3}
AAA-AAAAA-AA-A: {3, 5, 2, 1}
MCIP = {2, 3, 1, 2, 3}
Since |MCIP| is small, humans and monkeys are similar (this measure has been proposed in practice [Jiang, et al])
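As a small illustration of how such multisets might be derived, the sketch below reads a gapped string as runs of `A` separated by `-` and records the run lengths. This reading is an assumption: the slide does not state the encoding, and the split of its text into the two strings `AA-AA-AAAA-AAA` and `AAA-AAAAA-AA-A` is inferred from the multisets shown next to them. The helper name `run_lengths` is hypothetical.

```python
def run_lengths(gapped, gap="-"):
    """Lengths of the maximal runs between gap characters, in order of appearance."""
    return [len(run) for run in gapped.split(gap) if run]


print(run_lengths("AA-AA-AAAA-AAA"))   # [2, 2, 4, 3]
print(run_lengths("AAA-AAAAA-AA-A"))   # [3, 5, 2, 1]
```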
Applications
AA-AA-AAAA-AAA: {2, 2, 4, 3}
A-A-A-A-AA-A-AA-A-A: {1, 1, 1, 1, 2, 1, 2, 1, 1}
MCIP = {1, 1, 1, 1, 1, 1, 1, 2, 2}
Since |MCIP| is large, humans and mice are not similar
Applications
• DNA fingerprint assembly
– Oligonucleotide Fingerprinting Ribosomal Genes Project [Valinsky, et al]
– Goal is to identify microbial organisms
– Use MCIP as a subroutine, $k \approx 28$, $m \approx 212$ [Jiang]
• Clustering? Scheduling?
Previous Work
k-MCIP problem: Given $Y_1, \ldots, Y_k$, find a smallest integer partition of each of $Y_1, \ldots, Y_k$
[CLLJ] NP-hard (reduction from Maximum Set Packing)
APX-hard for every $k \ge 2$ (reduction from Maximum-3-Dimensional Matching with Bounded Degree)
Previous Work
[CLLJ] Upper Bounds
• $(5/4)$-approximation for $k = 2$
– Problem: running time on the order of $m^9$ ($m \approx 212$ in practice)
• $(k - 1/3)$-approximation in general
– Problems: (1) large ratio; (2) unknown whether there is a tight instance
Our Contributions
• $0.614k + o(k)$ approximation
– $O(m \log k)$ time
– Extremely easy to implement
– If $Y_1, \ldots, Y_k$ are disjoint, then $(k+1)/2$ approximation
• We show that the [CLLJ] $(k - 1/3)$-approximation algorithm is actually a $(k - 1/2)$-approximation, and this is tight
Algorithm Overview
• Let $A$ be an algorithm for 2-MCIP. We build an algorithm $B$ for k-MCIP
• Choose a random set partition $P$ of $\{1, \ldots, k\}$ into pairs of integers
• For each pair $(i,j) \in P$, let $A_{i,j} = A(Y_i, Y_j)$
• If there is only one pair $(1,2) \in P$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in P$
2-MCIP Algorithm
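• What is the algorithm for 2-MCIP?
• Greedy algorithm
$Y_1$: 3 4 2 2
$Y_2$: 1 2 5 3
– Choose two integers, one from $Y_1$ and one from $Y_2$
– Take the minimum
– Subtract the minimum from both integers and append it to the output
– Remove all 0s
– Repeat
[Animation frame on the slide shows the values 1 and 0 and the output 3 2 1 3]
• $|\text{Greedy}(Y_1, Y_2)| < |Y_1| + |Y_2|$
• Generalization: $|\text{Greedy}(Y_1, \ldots, Y_k)| \le \sum_{i=1}^{k} |Y_i| = m$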
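The greedy steps can be sketched in a few lines of Python. The slide does not say which two integers to choose, so this version always takes the last entry of each list, which is an arbitrary choice made here for simplicity. The name `greedy_2mcip` is illustrative.

```python
def greedy_2mcip(y1, y2):
    """Greedy common integer partition of two multisets with equal sums."""
    if sum(y1) != sum(y2):
        raise ValueError("inputs must have the same sum")
    a, b = list(y1), list(y2)
    out = []
    while a and b:
        m = min(a[-1], b[-1])   # take the minimum of the chosen pair
        out.append(m)           # append it to the output
        a[-1] -= m              # subtract it from both integers
        b[-1] -= m
        if a[-1] == 0:          # remove all 0s
            a.pop()
        if b[-1] == 0:
            b.pop()
    return out


print(greedy_2mcip([3, 4, 2, 2], [1, 2, 5, 3]))
```

Every iteration removes at least one integer and the final iteration removes two, which is why the output has fewer than $|Y_1| + |Y_2|$ elements.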
Better 2-MCIP Algorithm
• CommonElements algorithm for 2-MCIP of $Y_1$, $Y_2$:
• $T \leftarrow \emptyset$. While there is a common integer $x$ of $Y_1$ and $Y_2$: $T \leftarrow T \cup \{x\}$, $Y_1 \leftarrow Y_1 \setminus \{x\}$, $Y_2 \leftarrow Y_2 \setminus \{x\}$
• Output $T \cup \text{Greedy}(Y_1, Y_2)$
• Let $c_{1,2}$ be the # of common integers of $Y_1$ and $Y_2$
• $|\text{CommonElements}(Y_1, Y_2)| \le (|Y_1| + |Y_2| - 2c_{1,2}) + c_{1,2} = |Y_1| + |Y_2| - c_{1,2}$
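A minimal Python sketch of CommonElements follows, using `collections.Counter` for the multiset intersection. The greedy step picks the largest remaining integer of each side, which is an assumption made here to keep the output deterministic; the slides leave the choice open. Function names are illustrative.

```python
from collections import Counter


def greedy(a, b):
    """Greedy 2-MCIP on two lists with equal sums (always uses the last entries)."""
    a, b, out = list(a), list(b), []
    while a and b:
        m = min(a[-1], b[-1])
        out.append(m)
        a[-1] -= m
        b[-1] -= m
        if a[-1] == 0:
            a.pop()
        if b[-1] == 0:
            b.pop()
    return out


def common_elements(y1, y2):
    """Move integers common to both multisets to the output, then run greedy."""
    if sum(y1) != sum(y2):
        raise ValueError("inputs must have the same sum")
    c1, c2 = Counter(y1), Counter(y2)
    common = c1 & c2                            # multiset intersection, size c_{1,2}
    t = list(common.elements())
    rest1 = sorted((c1 - common).elements())    # Y1 with the common integers removed
    rest2 = sorted((c2 - common).elements())    # Y2 with the common integers removed
    return t + greedy(rest1, rest2)


result = common_elements([2, 2, 4, 3], [3, 5, 2, 1])
print(sorted(result))   # [1, 1, 2, 3, 4]
```

Here $c_{1,2} = 2$ (the integers 2 and 3), and the output has 5 elements, within the bound $|Y_1| + |Y_2| - c_{1,2} = 6$.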
Algorithm Recap
• Choose a random set partition $P$ of $\{1, \ldots, k\}$ into pairs of integers
• For each pair $(i,j) \in P$, let $A_{i,j} = \text{CommonElements}(Y_i, Y_j)$
• If there is only one pair $(1,2) \in P$, output $A_{1,2}$; otherwise recurse on the multisets $A_{i,j}$ with $(i,j) \in P$
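Putting the pieces together, the recap can be sketched as the loop below. Two details are assumptions not stated on the slides: when the number of multisets is odd, the unpaired one is carried unchanged to the next round, and the greedy step always picks the largest remaining integers. The names `k_mcip`, `common_elements` and `greedy` are illustrative.

```python
import random
from collections import Counter


def greedy(a, b):
    """Greedy 2-MCIP on two lists with equal sums."""
    a, b, out = list(a), list(b), []
    while a and b:
        m = min(a[-1], b[-1])
        out.append(m)
        a[-1] -= m
        b[-1] -= m
        if a[-1] == 0:
            a.pop()
        if b[-1] == 0:
            b.pop()
    return out


def common_elements(y1, y2):
    """Take integers common to both inputs first, then run greedy on the rest."""
    c1, c2 = Counter(y1), Counter(y2)
    common = c1 & c2
    rest1 = sorted((c1 - common).elements())
    rest2 = sorted((c2 - common).elements())
    return list(common.elements()) + greedy(rest1, rest2)


def k_mcip(multisets, rng=None):
    """Randomized pairing scheme: merge random pairs until one multiset remains."""
    rng = rng or random.Random(0)
    total = sum(multisets[0])
    if any(sum(y) != total for y in multisets):
        raise ValueError("all input multisets must have the same sum")
    current = [list(y) for y in multisets]
    while len(current) > 1:
        rng.shuffle(current)   # a random order induces a random pairing
        merged = [common_elements(current[i], current[i + 1])
                  for i in range(0, len(current) - 1, 2)]
        if len(current) % 2 == 1:
            merged.append(current[-1])   # assumption: carry the unpaired multiset over
        current = merged
    return current[0]


print(sorted(k_mcip([[2, 2, 3], [1, 1, 5]])))   # [1, 1, 2, 3]
```

Each merged multiset refines both of its inputs, so the final output refines every original $Y_i$.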
Analysis
• Lower bound the optimal size opt as a function of the frequency of different integers
• Find the expected output size of our algorithm as a function of the frequency of different integers
• Divide these two to get a worst-case (expected) ratio
• Derandomize using conditional expectations
Frequency of Integers
Define the r-redundancy $\mathrm{Red}(r)$ to capture integer frequencies
$Y_1$: 1 3 1 3 2
$Y_2$: 1 1 1 2 5
$Y_3$: 1 1 3 4 1
Consider $r$ disjoint multisets $A_1, \ldots, A_r$ of input elements such that
1. Each $A_i$ contains at most one element from each input multiset
2. Each $A_i$ contains only 1 distinct integer
$\mathrm{Red}(r)$ is $\max_{A_1, \ldots, A_r} \sum_{i=1}^{r} |A_i|$
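A short Python sketch for computing $\mathrm{Red}(r)$ is given below. It rests on one reading of the definition, flagged here as an assumption: each $A_i$ holds copies of a single integer and takes at most one element from each input multiset. Under that reading, the $t$-th set built for a value $v$ can have as many elements as there are inputs containing $v$ at least $t$ times, and $\mathrm{Red}(r)$ is the sum of the $r$ largest such sizes. The function name `redundancy` is hypothetical, and the three rows on the slide are read as $Y_1$, $Y_2$, $Y_3$ in order.

```python
from collections import Counter


def redundancy(multisets, r):
    """Red(r) under the stated reading of the definition."""
    counts = [Counter(y) for y in multisets]
    values = set().union(*(c.keys() for c in counts))
    sizes = []
    for v in values:
        per_input = [c[v] for c in counts]
        # The t-th set for value v uses one copy of v from every input
        # that contains v at least t times.
        for t in range(1, max(per_input) + 1):
            sizes.append(sum(1 for c in per_input if c >= t))
    sizes.sort(reverse=True)
    return sum(sizes[:r])


ys = [[1, 3, 1, 3, 2], [1, 1, 1, 2, 5], [1, 1, 3, 4, 1]]
print([redundancy(ys, r) for r in range(1, 5)])   # [3, 6, 8, 10]
```

Taking the largest sizes is valid because, for a fixed value, the sizes are non-increasing in $t$, so the $r$ largest sizes always form a consistent family of sets.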
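Lower Bound
• Opt is the size of k-MCIP
• Bipartite graph: the left vertices are the elements of $Y_1, Y_2, \ldots, Y_k$; the right vertices are the elements of the k-MCIP
• A left vertex is joined to the elements partitioning it (in the figure, 5 is joined to 2 and 3)
• There are opt right vertices, each of degree $k$
• # degree-1 vertices on the left is $\le \mathrm{Red}(\mathrm{opt})$. So, # edges is $\ge 1 \cdot \mathrm{Red}(\mathrm{opt}) + 2 \cdot (m - \mathrm{Red}(\mathrm{opt}))$
• But # edges is exactly $k \cdot \mathrm{opt}$. So, $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$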
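The counting step can be spelled out as follows, writing $d_1$ for the number of degree-1 left vertices (the symbol $d_1$ is introduced here for illustration and does not appear on the slides). There are $m$ left vertices, each of degree at least 1, and every left vertex that is not of degree 1 contributes at least 2 edges:

$$
k \cdot \mathrm{opt} \;=\; \#\text{edges} \;\ge\; d_1 + 2\,(m - d_1) \;=\; 2m - d_1 \;\ge\; 2m - \mathrm{Red}(\mathrm{opt}).
$$

As a sanity check on the earlier example $Y_1 = \{2, 2, 3\}$, $Y_2 = \{1, 1, 5\}$: here $k = 2$, $m = 6$, $\mathrm{opt} = 4$, and the inputs share no integer, so $\mathrm{Red}(4) = 4$. The bound reads $2 \cdot 4 \ge 2 \cdot 6 - 4$, that is $8 \ge 8$, so it is tight on this instance.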
Example
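• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$
• If input multisets are disjoint, $\mathrm{Red}(\mathrm{opt}) = \mathrm{opt}$, so $(k+1) \cdot \mathrm{opt} \ge 2m$
• Trivial greedy algorithm has output size $\le m$
• So greedy algorithm is a $(k+1)/2$ approximation, since its ratio is at most $m/\mathrm{opt} \le (k+1)/2$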
Upper Bound
• In some recursive call on multisets Ya and Yb, we are interested in the number of common elements of Ya, Yb
• Since we choose a random partition of input multisets, we can bound the expected number of common elements as a function of Red(opt)
• Linearity of expectation and some calculus allow us to bound the expected number of common elements encountered over all recursive calls, in terms of Red(opt)
• Use lower bound in terms of Red(opt) to get overall ratio
Upper Bound
• Each of O(log k) recursive calls can be implemented in O(m) time, so O(m log k) time
• Actually, the proof shows that only 3 recursive calls are necessary to get the .614k + o(k) approximation
• This allows derandomization using conditional expectations in O(m poly(k)) time
Conclusions and Future Work
• $0.614k + o(k)$ approximation in $O(m \log k)$ time
• Improved analysis of the previous best algorithm, showing it has ratio exactly $k - 1/2$
– Upper bound uses our notion of redundancy
– Lower bound uses an adversarial argument
• Best known lower bound on the approximation ratio is a constant, so there is a huge gap
Another Example
• Consider the algorithm which repeatedly removes an integer common to all $k$ input multisets, and then runs a greedy algorithm on the remaining multisets [CLLJ06]
• Suppose $r$ common integers are removed. Then output size $\le (m - rk) + r$
• But $\mathrm{Red}(\mathrm{opt}) \le rk + (\mathrm{opt} - r)(k-1)$
• Our bound is $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$
• This implies $\mathrm{opt} \ge (2m - r)/(2k - 1)$, and $(m - rk + r)/\mathrm{opt} \le k - 1/2$
• Using an adversarial argument, can show this is tight
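The algebra behind the last implication can be worked out as follows. This is a reconstruction from the two stated bounds, not a derivation taken from the slides. Substituting the bound on $\mathrm{Red}(\mathrm{opt})$ into $k \cdot \mathrm{opt} \ge 2m - \mathrm{Red}(\mathrm{opt})$ gives

$$
k \cdot \mathrm{opt} \;\ge\; 2m - rk - (\mathrm{opt} - r)(k-1) \;=\; 2m - r - (k-1)\,\mathrm{opt}
\quad\Longrightarrow\quad
(2k-1)\,\mathrm{opt} \;\ge\; 2m - r.
$$

Dividing the output size by this lower bound on opt:

$$
\frac{m - rk + r}{\mathrm{opt}} \;\le\; \frac{(2k-1)\,\bigl(m - r(k-1)\bigr)}{2m - r} \;\le\; \frac{2k-1}{2} \;=\; k - \tfrac{1}{2}.
$$

The last inequality is equivalent to $2m - 2r(k-1) \le 2m - r$, that is $r \le 2r(k-1)$, which holds for every $k \ge 2$.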