1
Distributed Monotonicity Restoration
Michael Saks, Rutgers University
C. Seshadhri, Princeton University
2
Overview
Introduce a new class of algorithmic problems: Distributed Property Restoration
A solution for the case of the Monotonicity Property
3
Data Sets
Data set = function f : Γ → V
  Γ = finite index set
  V = value set
In this talk, Γ = [n]^d = {1,…,n}^d
f is a d-dimensional array
4
Data Sets: Examples
Directed graphs: Boolean-valued matrix
This talk: Nonnegative integer-valued arrays
5
Distance between two data sets
For f, g with common domain Γ:
dist(f,g) = fraction of the domain where f(x) ≠ g(x) (the relative Hamming distance)
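The distance defined on this slide can be sketched for 1-dimensional arrays as follows. This is only an illustration: the function name `dist` mirrors the slide's notation, and representing a data set as a Python sequence is an assumption made here.

```python
from typing import Sequence


def dist(f: Sequence[int], g: Sequence[int]) -> float:
    """Relative Hamming distance: fraction of positions where f and g differ."""
    if len(f) != len(g):
        raise ValueError("f and g must share a common domain")
    if len(f) == 0:
        return 0.0
    return sum(a != b for a, b in zip(f, g)) / len(f)


print(dist([1, 2, 3, 4], [1, 5, 3, 0]))  # 0.5
```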
6
Properties of data sets: Examples
Graphs: planarity, Hamiltonicity, etc.
For multidimensional arrays:
  Linear mod n
  Distinct entries
  Monotone (order preserving): nondecreasing along every line
When d=1, monotone = sorted
7
Some algorithmic problems for P
Given data set f (as input):
Recognition: Does f satisfy P?
Testing: (Define ε(f) = min{ dist(f,g) : g satisfies P })
  Decide either
    ε(f) > 0: f does not satisfy P
    ε(f) ≤ δ: f is close to P
  (If 0 < ε(f) ≤ δ, then either answer is acceptable)
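For the monotonicity property with d=1, ε(f) can be computed offline: the fewest entries that must change equals n minus the length of a longest nondecreasing subsequence. The sketch below relies on that standard fact; the helper names are hypothetical and do not come from the talk.

```python
from bisect import bisect_right
from typing import List, Sequence


def longest_nondecreasing(f: Sequence[int]) -> int:
    """Length of a longest nondecreasing subsequence (patience-sorting style)."""
    # tails[k] = smallest possible last value of a nondecreasing
    # subsequence of length k + 1 seen so far.
    tails: List[int] = []
    for v in f:
        i = bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(tails)


def eps_monotone(f: Sequence[int]) -> float:
    """epsilon(f) for monotonicity when d = 1."""
    n = len(f)
    return 0.0 if n == 0 else (n - longest_nondecreasing(f)) / n


print(eps_monotone([1, 9, 2, 3, 4, 5]))  # one bad entry out of six
```

This takes O(n log n) time and reads all of f, so it serves only as a reference value, not as something a single processor in the distributed model could afford.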
8
Property Restoration
Setting:
  Given f
  We expect f to satisfy P (e.g. we run algorithms on f that assume P), but f may not satisfy P
Restoration problem for P: Given data set f, produce data set g that
  satisfies P
  is close to f: dist(f,g) is not much bigger than ε(f)
9
What does it mean to produce g?
Offline computation
  Input: function table for f
  Output: function table for g
10
Distributed monotonicity restoration
For each domain value x in Γ, processor Px
  computes g(x)
  works independently of other processors
  may access f(y) for (not too many) y
  has access to a short random string s (common to all processors)
  and is otherwise deterministic.
11
Distributed Property Restoration
Goal: WHP (with probability close to 1, over choices of the random string s):
  g has property P
  dist(g,f) = O(ε(f))
  Each Px runs quickly; in particular, it reads f(y) for only a small number of y.
12
Distributed Property Restoration
Precursor to this work: Online Data Reconstruction Model
(Ailon, Chazelle, Liu, Seshadhri)
13
Example: Error Correcting Codes
Data set f = Boolean string of length n
Property = code word of a given error correcting code C
Recognition: Does f belong to C?
Restoration = decoding to a close code word
Distributed restoration = local decoding
14
Monotonicity Restoration: d=1
f is a linear array of length n
First attempt at distributed restoration: Px looks at f(x) and f(x-1)
  If f(x) ≥ f(x-1), then g(x) = f(x)
  Otherwise, we have a non-monotonicity: g(x) = max{ f(x), f(x-1) }
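A small experiment suggests why this first attempt is not enough. The sketch below applies the rule with 0-based indexing and keeps the first entry unchanged (an assumption, since the slide does not say what P1 does); the input array is a made-up example.

```python
from typing import List, Sequence


def first_attempt(f: Sequence[int]) -> List[int]:
    """g(x) = max(f(x), f(x-1)); position 0 has no predecessor and is kept."""
    if len(f) == 0:
        return []
    return [f[0]] + [max(f[x], f[x - 1]) for x in range(1, len(f))]


def is_monotone(g: Sequence[int]) -> bool:
    """True if g is nondecreasing."""
    return all(a <= b for a, b in zip(g, g[1:]))


f = [5, 1, 1]
g = first_attempt(f)
print(g, is_monotone(g))  # [5, 5, 1] False
```

Looking one step back repairs the drop from 5 to 1 at the second position but simply moves it to the third, so the output is still not monotone.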
15
Monotonicity Restoration: d=1
Second attempt: set g(x) = max{ f(1), f(2), …, f(x) }
g is monotone, but
  Px requires time Ω(x)
  dist(g,f) may be much larger than ε(f)
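The second drawback can be seen on a small hypothetical input: a single large value at the front forces the prefix maximum to overwrite every later entry, even though changing that one value would already make the array sorted.

```python
from itertools import accumulate
from typing import List, Sequence


def prefix_max(f: Sequence[int]) -> List[int]:
    """g(x) = max(f(1), ..., f(x)), computed for every x at once."""
    return list(accumulate(f, max))


n = 8
f = [n] + list(range(1, n))      # [8, 1, 2, ..., 7]; here eps(f) = 1/8
g = prefix_max(f)                # every entry becomes 8
changed = sum(a != b for a, b in zip(f, g)) / n
print(g, changed)                # dist(g, f) = 7/8
```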
16
Our results (for general d)
A distributed monotonicity restoration algorithm for general dimension d such that:
  Time to compute g(x) is (log n)^{O(d)}
  dist(f,g) ≤ C_1(d) ε(f)
  Shared random string s has size (d log n)^{O(1)}
(Builds on similar results of Ailon et al. for online monotonicity reconstruction.)
17
Which array values should be changed?
A subset S of Γ is f-monotone if f restricted to S is monotone.
For each x in Γ, Px must:
  Decide whether g(x) = f(x)
  If not, then determine g(x)
Preserved = { x : g(x) = f(x) }
Corrected = { x : g(x) ≠ f(x) }
In particular, Preserved must be f-monotone
18
Identifying Preserved
Preliminary algorithmic problem:
The partition (Preserved, Corrected) must satisfy:
  Preserved is f-monotone
  |Corrected|/|Γ| = O(ε(f))
19
Classification problem
Classify each y in Γ as Green or Red
  Green is f-monotone
  Red has size O(ε(f)|Γ|)
Need subroutine Classify(y).
20
A sufficient condition for f-monotonicity
A pair (x,y) in Γ × Γ is a violation if x < y and f(x) > f(y)
To guarantee that Green is f-monotone, Red should hit all violations:
  For every violation (x,y), at least one of x, y is Red
21
Classify: 1-dimensional case
d=1: Γ = {1,…,n}, f is a linear array.
For x in Γ and a subinterval J of Γ:
violations(x,J) = |{ y in J : (x,y) or (y,x) is a violation }|
22
Constructing a large f-monotone set
The set Bad: x is in Bad if, for some interval J containing x, violations(x,J) ≥ |J|/2
Lemma.
  Good = Γ \ Bad is f-monotone
  |Bad| ≤ 4 ε(f)|Γ|
So we'd like to take: Green = Good, Red = Bad
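The definition of Bad can be checked by brute force on a tiny array. The sketch below is only a sanity check of the lemma on one made-up input; it uses 0-based positions, inclusive intervals [a, b], and counts y as violating with x whichever side of x it lies on. All function names are hypothetical.

```python
from typing import Sequence, Set


def is_violation(f: Sequence[int], x: int, y: int) -> bool:
    """True if positions x and y (in either order) form a violation."""
    lo, hi = min(x, y), max(x, y)
    return lo < hi and f[lo] > f[hi]


def violations(f: Sequence[int], x: int, a: int, b: int) -> int:
    """Number of y in the interval [a, b] that form a violation with x."""
    return sum(is_violation(f, x, y) for y in range(a, b + 1))


def bad_set(f: Sequence[int]) -> Set[int]:
    """x is Bad if some interval J containing x has violations(x, J) >= |J| / 2."""
    n = len(f)
    return {
        x
        for x in range(n)
        if any(
            2 * violations(f, x, a, b) >= (b - a + 1)
            for a in range(x + 1)
            for b in range(x, n)
        )
    }


f = [1, 9, 2, 3, 4, 5]
bad = bad_set(f)
good_values = [f[x] for x in range(len(f)) if x not in bad]
print(sorted(bad), good_values)  # [1, 2] [1, 3, 4, 5]
```

Here ε(f) = 1/6, so the lemma's bound 4 ε(f)|Γ| = 4 holds with |Bad| = 2, and f restricted to Good is sorted. The brute force is far too slow for real use, which is what the following slides address.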
23
How do we compute Good?
To test whether y is in Good: for each interval J containing y, check violations(y,J) < |J|/2
Difficulties:
  There are Ω(n) intervals J containing y
  For each J, computing violations(y,J) takes time Ω(|J|)
24
Speeding up the computation
Estimate violations(y,J) by random sampling
  sample size polylog(n) is sufficient
  violations*(y,J) denotes the estimate
Compute violations*(y,J) only for a carefully chosen set of test intervals
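One plausible way to form the estimate violations*(y,J) is to sample positions of J uniformly with replacement and scale the hit rate by |J|. The sketch below assumes exactly that; the talk does not spell out the estimator, and in the distributed model the random choices would be derived from the shared string s rather than from a local generator. The name `violations_estimate` and its parameters are hypothetical.

```python
import random
from typing import Sequence


def violations_estimate(
    f: Sequence[int], y: int, a: int, b: int, samples: int, rng: random.Random
) -> float:
    """Estimate violations(y, [a, b]) from `samples` uniform draws in [a, b]."""
    size = b - a + 1
    hits = 0
    for _ in range(samples):
        z = rng.randint(a, b)            # uniform position in J, inclusive
        lo, hi = min(y, z), max(y, z)
        if lo < hi and f[lo] > f[hi]:    # y and z form a violation
            hits += 1
    return hits / samples * size


rng = random.Random(0)                   # stand-in for the shared random string
print(violations_estimate([10, 1, 1, 1, 1], 0, 1, 4, 20, rng))  # 4.0
```

In the example every position of J = [1, 4] violates with position 0, so every draw is a hit and the estimate equals |J| regardless of the seed.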
25
Set of test intervals
Want a set T of intervals of [n] with:
  Each x is in few (O(log n)) intervals of T
  If S is any subset of T, then S has a subfamily C such that each x in ∪S belongs to 1 or 2 sets of C
  For any interval I, there is a J in T containing I, of length at most 4|I|
26
The Test Set T
Assume n = |Γ| = 2^k
k layers of intervals
Layer j consists of 2^{k-j+1} - 1 intervals of size 2^j
28
Subroutine Classify
To classify y:
  If violations*(y,J) < 0.1 |J| for each J in T containing y, then y is Green
  else y is Red
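The subroutine can be sketched as below, under simplifying assumptions that are not from the talk: exact violation counts stand in for the sampled estimate violations*, the test intervals are half-overlapping dyadic intervals, positions are 0-based, and n is a power of two. With exact counts the result is deterministic, which makes the behaviour easy to inspect on a small example.

```python
from typing import Sequence


def classify(f: Sequence[int], y: int, threshold: float = 0.1) -> str:
    """Green if every test interval J containing y has fewer than threshold*|J| violations."""
    n = len(f)                       # assumed to be a power of two
    k = n.bit_length() - 1
    for j in range(1, k + 1):
        size, step = 2 ** j, 2 ** (j - 1)
        for start in range(0, n - size + 1, step):
            end = start + size - 1
            if not (start <= y <= end):
                continue
            # exact count of positions in J that form a violation with y
            count = sum(
                1
                for z in range(start, end + 1)
                if (z < y and f[z] > f[y]) or (y < z and f[y] > f[z])
            )
            if count >= threshold * size:
                return "Red"
    return "Green"


f = list(range(32))
f[5] = 100                           # a single corrupted entry
red = {y for y in range(32) if classify(f, y) == "Red"}
green_values = [f[y] for y in range(32) if y not in red]
print(sorted(red))
print(green_values == sorted(green_values))  # True
```

The low threshold makes the test conservative: some correct entries near the corrupted one are also marked Red, but the Red set stays within a constant factor of ε(f)|Γ| and the Green entries are sorted.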
29
Where are we?
We have a subroutine Classify
  On input x, Classify outputs Green or Red
  Runs in time polylog(n)
WHP:
  Green is f-monotone
  |Red| ≤ 20 ε(f)|Γ|
30
Defining g(x) for Red x
The natural way to define g(x) is:
  Green(x) = { y : y ≤ x and y is Green }
  g(x) = max{ f(y) : y in Green(x) } = f(max Green(x))
In particular, this gives g(x) = f(x) for Green x
32
Computing m(x)
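(Here m(x) = max Green(x), the last Green point at or before x.)
Can search back from x to find the first Green point
Inefficient if x is preceded by a long Red stretch
(Figure: m(x) and x marked on the array.)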
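The backward search can be sketched as follows, assuming the Red/Green classification is available as a list of booleans and that an x with no Green point at or before it receives a default value of 0 (the smallest nonnegative integer). Both choices, and the name `g_exact`, are assumptions made here.

```python
from typing import Sequence


def g_exact(f: Sequence[int], green: Sequence[bool], x: int, default: int = 0) -> int:
    """g(x) = f(m(x)), where m(x) is the last Green position at or before x."""
    y = x
    while y >= 0 and not green[y]:   # walk back over the Red stretch
        y -= 1
    return f[y] if y >= 0 else default


f = [1, 9, 2, 3, 4, 5]
green = [True, False, False, True, True, True]
print([g_exact(f, green, x) for x in range(len(f))])  # [1, 1, 1, 3, 4, 5]
```

The loop runs for as many steps as there are Red points immediately before x, which is the inefficiency the slide points out.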
33
Approximating m(x)?
Pick a random Sample(x) of points less than x
  Density inversely proportional to distance from x
  Size is polylog(n)
Green*(x) = { y : y in Sample(x), y is Green }
m*(x) = max{ y : y in Green*(x) }
(Figure: x and m*(x) marked on the array.)
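One way to obtain a sample whose density falls off inversely with distance is to draw a fixed number of points from each dyadic distance range [2^i, 2^(i+1) - 1] behind x. The sketch below assumes that scheme, which may differ from the construction used in the talk; `sample_points`, `m_star` and `per_scale` are hypothetical names, and the local generator stands in for the shared random string s.

```python
import random
from typing import Optional, Sequence, Set


def sample_points(x: int, per_scale: int, rng: random.Random) -> Set[int]:
    """Random positions y < x, with `per_scale` draws per dyadic distance range."""
    sample: Set[int] = set()
    dist_lo = 1
    while dist_lo <= x:
        dist_hi = min(2 * dist_lo - 1, x)   # distances in [dist_lo, dist_hi]
        for _ in range(per_scale):
            sample.add(x - rng.randint(dist_lo, dist_hi))
        dist_lo *= 2
    return sample


def m_star(green: Sequence[bool], x: int, per_scale: int, rng: random.Random) -> Optional[int]:
    """Largest sampled Green position below x, or None if no sampled point is Green."""
    candidates = [y for y in sample_points(x, per_scale, rng) if green[y]]
    return max(candidates, default=None)


rng = random.Random(1)
s = sample_points(1000, 4, rng)
print(len(s), min(s) >= 0, max(s) < 1000)
```

The sample has at most per_scale · (log2(x) + 1) points, and the range at distance 1 contains only x - 1, so the immediate predecessor is always included.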
34
Is m*(x) good enough?
Suppose y is Green and m*(x) ≤ y ≤ x
Since y is Green: g(y) = f(y)
If f(m*(x)) < f(y), then g(x) = f(m*(x)) < f(y) = g(y) although y ≤ x, so g is not monotone
(Figure: m*(x), y and x marked on the array.)
35
Is m*(x) good enough?
To ensure monotonicity we need: x < y implies m*(x) ≤ m*(y)
Requires relaxing the requirement: for all Green y, m*(y) = y
36
Thinning out Green* (x)
Plan: eliminate certain unsafe points from Green*(x)
Roughly, y is unsafe for x if some interval beginning with y and containing x has a high density of Reds.
(Then, for some z > x, there is a non-trivial chance that Sample(z) has no Green points ≥ y.)
(Figure: y, x and z marked on the array.)
37
Thinning out Green* (x)
Green*(x) = { y : y in Sample(x), y is Green }
m*(x) = max{ y : y in Green*(x) }
Green^(x) = { y : y in Green*(x), y is safe for x }
m^(x) = max{ y : y in Green^(x) }
(Hiding: efficient implementation of Green^(x))
38
Redefining Green^(x)
WHP:
  if x ≤ y, then m^(x) ≤ m^(y)
  |{ x : m^(x) ≠ x }| = O(ε(f)|Γ|)
39
Summary of 1-dimensional case
Classify points as Green and Red
  Few Red points
  f restricted to Green is monotone
For each x, choose Sample(x)
  Size polylog(n)
  All points less than x
  Density inversely proportional to distance from x
Green^(x) = points from Sample(x) that are Green and safe for x
m^(x) is the maximum of Green^(x)
Output g(x) = f(m^(x))
40
Dimension greater than 1
For x < y, want g(x) ≤ g(y)
(Figure: points x and y in the grid.)
41
Red/Green Classification
Extend the Red/Green classification to higher dimensions:
  f restricted to Green is monotone
  Red is small
Straightforward (mostly) extension of the 1-dimensional case
42
Given Red/Green classification
In the one-dimensional case,
  Green^(x) = sampled Green points safe for x
  g(x) = f(max{ y : y in Green^(x) })
43
The Green points below x
Set of Green maxima could be very large
Sparse random sampling will only roughly capture the frontier
Identifying the unsafe points is much harder than in the one-dimensional case
(Figure: the Green points below x in the grid.)
44
Further work
The g produced by our algorithm has dist(g,f) ≤ C(d) ε(f)
  Our C(d) is exp(d^2).
  What should C(d) be? (Guess: C(d) = exp(d))
Monotonicity restoration for general posets
Distributed restoration for other properties (graph properties?)