Approximate counting of connected components with random walks
Frontier Probability Days, Tucson, Arizona
Florian Sobieczky
Monday, 19th of May, 2014
Florian Sobieczky Approximate counting of connected components with random walks
This talk is dedicated to Evi Nemeth, lost at sea.
Outline
Scale Spaces
Segmentation
Number of Connected Components
Counting with Random Walks
Scale Spaces
Scale Spaces
I G = 〈V = Z², E = N.N.〉, and VN = {−L/2 + 1, . . . , L/2}², with L² = N
I f ∈ S = {η : V → R}, where R represents ‘grey values’
I Φ : (0,∞) × S → S, with t ∈ (0,∞) the ‘scale parameter’
I V discrete (Lindeberg 94) or continuous (Witkin 83, Koenderink 84)
I Idea: Splitting up the information of an image into different scales, which label different ‘derived images’ according to different degrees of detail (Burt 81, Crowley 81, Witkin 83)
I Typical Properties are: Causality, Linearity, Scale Invariance, Semi-group property, Isotropy, Homogeneity, ...
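As a concrete illustration (not from the talk): the prototypical Gaussian scale space Φt[f] = E[f(Bt)] amounts to convolving f with a heat kernel of variance t. A minimal numpy sketch on a periodic 1-D signal; the test signal and the value t = 25 are invented for the demo:

```python
import numpy as np

def gaussian_scale_space(f, t):
    """Phi_t[f] = E[f(B_t)]: circular convolution of f with the
    (discretized) heat kernel of variance t."""
    n = len(f)
    x = np.arange(n)
    d = np.minimum(x, n - x)              # periodic distance to the origin
    kernel = np.exp(-d.astype(float) ** 2 / (2.0 * t))
    kernel /= kernel.sum()                # probability kernel: mean is preserved
    return np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(kernel)))

# a coarse step pattern carrying a fine-scale ripple
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
f = np.sign(np.sin(x)) + 0.2 * np.sin(40.0 * x)

smooth = gaussian_scale_space(f, t=25.0)  # large t: the fine ripple is suppressed
print(float(np.std(f)), float(np.std(smooth)))
```

Increasing t removes detail monotonically (the ‘causality’ property), while the mean grey value is preserved.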
Scale Spaces
Comparison of two scale spaces
Figure: Comparison of Φt[f](x) := Ex[f(Bτ)] (top row; Bt is Brownian motion) with GIMP’s ‘Selective Gaussian Blur’, where τ = inf{t > 0 | ∫₀ᵗ |f(Bs)|² ds > ε}. Here, ε is a scale parameter of Φt corresponding to the tolerance of (pathwise!) grey-value variance. Φt outperforms the Selective Blur, as seen in the third column (optimal case).
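The stopped-path operator in the caption can be mimicked in a few lines. A Monte-Carlo sketch, assuming a simple random walk on Z/nZ as a discrete stand-in for Brownian motion; the two-region test signal and ε are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi_stopped(f, eps, n_walks=100, max_steps=10_000):
    """Monte-Carlo sketch of Phi[f](x) = E_x[f(B_tau)], with tau the first
    time the accumulated grey-value energy sum_s |f(X_s)|^2 exceeds eps.
    X is a simple random walk on Z/nZ standing in for Brownian motion."""
    n = len(f)
    out = np.empty(n)
    for x0 in range(n):
        stopped = []
        for _ in range(n_walks):
            pos, acc = x0, 0.0
            for _ in range(max_steps):
                acc += f[pos] ** 2
                if acc > eps:             # tolerance exceeded: stop the path
                    break
                pos = (pos + (1 if rng.random() < 0.5 else -1)) % n
            stopped.append(f[pos])
        out[x0] = np.mean(stopped)
    return out

f = np.concatenate([np.full(16, 2.0), np.full(16, 0.1)])  # bright / dark region
g = phi_stopped(f, eps=1.0)
# bright region: tau is hit immediately, so almost no smoothing there;
# dark region: the walk travels far before stopping, so strong smoothing
```

The smoothing is adaptive: where |f| is large the walk stops at once and edges survive, which is the behaviour the figure contrasts with a fixed-width Gaussian blur.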
Segmentation
Random Walks and Image Segmentation
I Focus: Scale Spaces with Diffusions:
I Example: ‘Gaussian Scale-Space’: Φt [f ] := E[f (Bt)], t > 0
I Example: ‘Perona-Malik’ Model: Energy functional with non-linear diffusion as its Euler-Lagrange equation
I Example: Linear Model: ‘Normalized Cuts’
J. Shi, J. Malik: ‘Normalized Cuts and Image Segmentation’, IEEE PAMI, Vol. 22, 2000
I Example: Grady Model: Labelling technique (linear, solves the ‘bottleneck’ problem)
L. Grady, E.L. Schwartz: ‘Isoperimetric Graph Partitioning for Data Clustering and Image Segmentation’, PAMI, 2004
Segmentation
Description of the Hoshen-Kopelman Algorithm
I Hoshen-Kopelman: Phys. Rev. B 14, p. 3438, 1976
I Proceed row by row (first, only labelling):
I Does the former pixel or the pixel from the last row have the same grey value? If yes, assign the same label. If no, pick a new label.
I After the first sweep, do a second one to identify labels belonging to the same cluster.
I For an image of order N there are O(N) steps.
I Drawback: doesn’t treat ‘almost separated clusters’ as two.
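The two sweeps can be sketched with a union-find forest recording label equivalences; a minimal Python version, assuming 4-neighbour connectivity on a binary (occupied/empty) grid:

```python
import numpy as np

def hoshen_kopelman(grid):
    """Two-pass Hoshen-Kopelman labelling of occupied (nonzero) pixels.
    Sweep 1 assigns provisional labels row by row, recording equivalences
    in a union-find forest; sweep 2 flattens labels of the same cluster."""
    parent = {}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]     # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    rows, cols = grid.shape
    labels = np.zeros((rows, cols), dtype=int)
    nxt = 0
    for i in range(rows):
        for j in range(cols):
            if not grid[i, j]:
                continue
            left = labels[i, j - 1] if j > 0 and grid[i, j - 1] else 0
            up = labels[i - 1, j] if i > 0 and grid[i - 1, j] else 0
            if left and up:
                union(left, up)               # both neighbours: merge clusters
                labels[i, j] = find(left)
            elif left or up:
                labels[i, j] = left or up
            else:                             # no occupied neighbour: new label
                nxt += 1
                parent[nxt] = nxt
                labels[i, j] = nxt
    for i in range(rows):                     # second sweep: canonical labels
        for j in range(cols):
            if labels[i, j]:
                labels[i, j] = find(labels[i, j])
    return labels

g = np.array([[1, 1, 0, 1],
              [0, 1, 0, 1],
              [1, 0, 0, 1]])
lab = hoshen_kopelman(g)
print(len(np.unique(lab[g > 0])))   # → 3 connected components
```

Both sweeps touch each pixel a constant number of times, matching the O(N) claim; and, being purely combinatorial, the algorithm indeed cannot see ‘almost separated’ clusters.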
Segmentation
L. Grady’s Model:
I In Grady’s Model, a graph Laplacian L is set up for a weighted lattice graph, with weights ∼ exp(−a|f(x) − f(y)|), x, y ∈ V.
I Several exit points are defined (the RW is ‘killed’ there), one for each segment: the ‘boundary of the graph’.
I Each exit carries a label.
I Instead of computing the eigenvectors, for each point x ∈ V and time t > 0 the exit measure (harmonic measure) is computed.
I Point x obtains the label of the exit with the highest exit measure.
I Advantage: L with boundary is invertible (the RW is properly substochastic): one solves a linear system instead of computing eigenvectors.
I Solves the ‘bottleneck’ problem.
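A minimal sketch of this labelling scheme on a 1-D signal, with the weight form w ∼ exp(−a|f(x)−f(y)|) from the slide and two hand-picked seeds standing in for the exits (the signal, a = 10, and the seed positions are invented):

```python
import numpy as np

def random_walker_labels(f, seeds, a=10.0):
    """Grady-style random-walker labelling sketch on a 1-D signal f.
    For each pre-assigned exit (seed), the harmonic measure -- the
    probability that the walk is absorbed there first -- solves a linear
    system in the graph Laplacian; no eigenvectors are needed."""
    n = len(f)
    w = np.exp(-a * np.abs(np.diff(f)))        # nearest-neighbour edge weights
    L = np.zeros((n, n))
    for i, wi in enumerate(w):
        L[i, i] += wi; L[i + 1, i + 1] += wi
        L[i, i + 1] -= wi; L[i + 1, i] -= wi
    free = [i for i in range(n) if i not in seeds]
    Lff = L[np.ix_(free, free)]                # invertible: walk killed at seeds
    prob = np.zeros((n, len(seeds)))
    for k, s in enumerate(seeds):
        b = -L[np.ix_(free, [s])].ravel()      # boundary data: potential 1 at seed s
        prob[free, k] = np.linalg.solve(Lff, b)
        prob[s, k] = 1.0
    return np.asarray(seeds)[np.argmax(prob, axis=1)]

# two flat grey-value regions separated by a jump, one seed in each
f = np.concatenate([np.zeros(10), np.ones(10)])
labels = random_walker_labels(f, seeds=[2, 15])
print(labels)   # every pixel gets the label of its region's seed
```

One sparse linear solve per seed replaces the eigen-decomposition, which is the advantage the slide emphasises.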
Number of Connected Components
Counting the Number of Connected Components
I Bernoulli percolation: number of open clusters per vertex (such as: G.R. Grimmett: ‘On the number of clusters in the percolation model’, J. London Math. Soc. (2), 13 (1976), 346-350)
I Eigenvector bisection techniques (e.g. Normalized Cuts). Problem: high computational complexity (eigenvectors), or ‘interactive’
I Grady’s harmonic measure technique. Problem: pre-assigned exits; an ‘interactive’ approach, with ‘exits’ not initially defined
I Union-find algorithms (HK, etc.). Problem: do not detect ‘vague boundaries between segments’
Number of Connected Components
Feature detection on Satellite Images
Figure: Courtesy of J.E. Maclennan: ‘Liquid Crystallography’ (noisy picture)
Number of Connected Components
Feature detection on Satellite Images
Figure: Courtesy of J.E. Maclennan: ‘Liquid Crystallography’ (noise removal induces bottlenecks)
Number of Connected Components
Feature detection on Satellite Images
Figure: Strong noise removal induces many new bottlenecks (we still want to find approximately the same number of components)
Counting with Random Walks
Counting ‘nearly’ separate components with RW’s
I General Setting: G = 〈V, E〉 and an exhaustion VN ⊂ VN+1 ↗ V. Let G be infinite, transitive, amenable. Ω = 2^E, µ ∈ M1,+(Ω), Aut(G)-invariant; ω ∈ Ω and HN(ω) = 〈V, ω⁻¹(1)〉|VN.
I Let the number of connected components of the finite graph HN be MN.
I MN = Tr[ΠN,0] = lim_{t→∞} Tr[exp(−tLN)], where LN = IN − PN, with PN the delayed random walk Xt on HN and ΠN,0 the orthogonal projector onto the invariant subspace of PN.
I (1/N) Tr[e^{−tLN}] = P[Xt = X0 | X0 ∼ UNIF(VN)] =: PN(t). This is the return probability if the initial distribution is uniform on VN.
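The trace formula can be checked numerically; a small sketch, assuming two disjoint triangles as the finite graph HN and the delayed (lazy) walk PN = (I + P)/2:

```python
import numpy as np

# two disjoint triangles: the finite graph H_N has M_N = 2 components
N = 6
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0

P = 0.5 * np.eye(N) + 0.5 * A / A.sum(axis=1)[:, None]  # delayed (lazy) walk
L = np.eye(N) - P                                        # L_N = I_N - P_N
lam = np.sort(np.linalg.eigvals(L).real)                 # spectrum of L_N

# Tr[exp(-t L_N)] = sum_i e^{-t lambda_i} -> M_N as t -> infinity,
# and (1/N) Tr[exp(-t L_N)] is the uniform-start return probability P_N(t)
for t in (0.0, 5.0, 50.0):
    print(t, np.exp(-t * lam).sum())
```

Each component contributes one zero eigenvalue (its invariant subspace), so the heat trace decays to exactly MN = 2.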
Counting with Random Walks
Counting ‘nearly’ separate components with RW’s
I Observe: For G = 〈Z², N.N.〉, with 0 = λ0 < λ1 ≤ · · · the spectrum of the j-th component (of size N := Nj):
Shape of jth component | Order of magnitude of λ1 in N := Nj
Circle | N⁻¹
Dumbbell | (N log N)⁻¹
I Choose N1(N) to be the minimum cardinality of a dumbbell’s vertex set which should still be counted as two components. Let N̄(N) be the maximal cardinality of any connected component. Then, choose εN, c > 0 such that
c/(N1 log N1) < εN < 1/N̄.
I Ask: How many ‘nearly separated components’ are there?
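The scaling gap between the two shapes can be observed on small discrete analogues; a sketch, assuming a k×k grid block as the ‘circle’ and two such blocks joined by one bridge edge as the dumbbell (sizes are invented for the demo):

```python
import numpy as np
import itertools

def lam1(edges, n):
    """lambda_1 (smallest nonzero eigenvalue) of L = I - P, P the lazy walk."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    P = 0.5 * np.eye(n) + 0.5 * A / A.sum(axis=1)[:, None]
    return np.sort(np.linalg.eigvals(np.eye(n) - P).real)[1]

def grid_edges(k, base=0):
    """Nearest-neighbour edges of a k x k block of Z^2, vertex (i,j) -> base + i*k + j."""
    e = []
    for i, j in itertools.product(range(k), repeat=2):
        v = base + i * k + j
        if j + 1 < k:
            e.append((v, v + 1))
        if i + 1 < k:
            e.append((v, v + k))
    return e

k = 6
blob = lam1(grid_edges(k), k * k)              # disk-like component
bridge = [(k * k - 1, k * k)]                  # single bottleneck edge
dumbbell = lam1(grid_edges(k) + grid_edges(k, k * k) + bridge, 2 * k * k)
print(blob, dumbbell)   # the dumbbell's lambda_1 is much smaller
```

The bottleneck drives λ1 far below the gap of a compact blob of comparable size, which is exactly the scale separation the choice of εN exploits.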
Counting with Random Walks
Counting ‘nearly’ separate components: Results
I Prop. 1 If N is sufficiently large, MN^(εN), defined by
MN^(ε) = dim(span{v ∈ R^VN | (1 − PN)v = λv and λ < ε}),
is the number of connected or nearly separated components if they can only be disks, dumbbells, or touching disks, with the smaller of the two touching disks having cardinality N1 fulfilling N1 log N1 > c · N̄.
I Prop. 2 MN^(εN) ≥ N · PN(t), where t = (ln 2/c) N1 ln N1 ≥ ln 2/εN.
I Prop. 3 MN^(εN) ≤ N · PN(t), where t = N̄ ln(N̄²/N).
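The qualitative content of Props. 2 and 3 can be illustrated numerically; a sketch with an invented weighted toy graph (two triangles joined by one weak edge), not the propositions' exact regime:

```python
import numpy as np

# two triangles joined by one weak edge: exactly connected (M_N = 1),
# but "nearly separated" into two components (illustrative weights)
N = 6
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1.0
A[2, 3] = A[3, 2] = 0.01                  # the weak bottleneck

P = 0.5 * np.eye(N) + 0.5 * A / A.sum(axis=1)[:, None]
lam = np.sort(np.linalg.eigvals(np.eye(N) - P).real)

# N * P_N(t) = Tr[e^{-t L_N}]: on an intermediate time scale it sits
# near the number of nearly separated parts (2), while as t -> infinity
# it drops to the exact component count (1)
for t in (0.1, 20.0, 1e5):
    print(t, np.exp(-t * lam).sum())
```

The plateau near 2 at intermediate t is what the sandwich N · PN(t) captures: run the walk long enough to mix inside each lobe, but not long enough to cross the bottleneck.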
Counting with Random Walks
Ideas from the Proof:
I Assume |VN| = N (for simplicity). Let
EN[f(N, λ)] := (1/N) Σ_{v∈VN} f(N(v), λ(v)),
where N(v) = Nj for v in the j-th component, of size Nj.
I Note: N = Σ_{j=1}^{MN} Nj. Let pj = Nj / Σ_{i=1}^{MN} Ni;
I then, with ÊN[f(N, λ)] := Σ_{j=1}^{MN} pj f(Nj, λj):
ÊN[f(N, λ)] = EN[f(N, λ)],
i.e. the expected value with respect to the size-biased distribution over the space of labels of the connected components is the same as the expected value with respect to the uniform distribution over the vertices in VN.
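The identity can be verified on toy data; a sketch with invented component sizes, gaps, and observable f:

```python
import numpy as np

rng = np.random.default_rng(1)

sizes = np.array([5.0, 2.0, 9.0, 4.0])   # component sizes N_j (hypothetical)
lams = rng.random(len(sizes))            # spectral gaps lambda_j (hypothetical)
N = sizes.sum()

def f(Nj, lj):                           # an arbitrary observable f(N, lambda)
    return np.exp(-3.0 * lj) / (1.0 + Nj)

# uniform average over vertices: each of the N_j vertices of component j
# contributes f(N_j, lambda_j)
vertex_avg = sum(Nj * f(Nj, lj) for Nj, lj in zip(sizes, lams)) / N

# size-biased average over component labels, p_j = N_j / sum_i N_i
p = sizes / N
size_biased_avg = sum(pj * f(Nj, lj) for pj, Nj, lj in zip(p, sizes, lams))

print(vertex_avg, size_biased_avg)       # the two averages coincide
```

Grouping the vertex sum by component turns the weight Nj/N into exactly pj, which is all the identity says.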
Counting with Random Walks
Ideas from the Proof:
I Let t = t1 if N · PN(t) = MN^(ε) ⇔: (∗). In other words, at time t1 the average return probability, scaled by N, equals the # of connected or nearly separated components.
I Note, LN = IN − PN. Let Π< = IN − Π≥ be the projector onto span{v ∈ R^N | LN v = λv, and λ < ε}.
I Then (∗) ⇔ Tr[Π≥ e^{−tLN}] = Tr[Π<(IN − e^{−tLN})].
I The function on the left is decreasing in t, the one on the right increasing. Replacing the LHS with an upper bound and the RHS with a lower bound leads to an equation which is only solved for a greater t, yielding an upper bound for t1.
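The equivalence (∗) ⇔ trace identity follows by splitting the trace over the two spectral subspaces, using that MN^(ε) = Tr[Π<] (since Π< projects onto an MN^(ε)-dimensional space):

```latex
N\,P_N(t) \;=\; \operatorname{Tr}\bigl[e^{-tL_N}\bigr]
          \;=\; \operatorname{Tr}\bigl[\Pi_{<}\, e^{-tL_N}\bigr]
          \;+\; \operatorname{Tr}\bigl[\Pi_{\geq}\, e^{-tL_N}\bigr],
\qquad
M_N^{(\varepsilon)} \;=\; \operatorname{Tr}\bigl[\Pi_{<}\bigr],
```

so that

```latex
N\,P_N(t_1) - M_N^{(\varepsilon)}
  \;=\; \operatorname{Tr}\bigl[\Pi_{\geq}\, e^{-t_1 L_N}\bigr]
  \;-\; \operatorname{Tr}\bigl[\Pi_{<}\bigl(I_N - e^{-t_1 L_N}\bigr)\bigr],
```

and (∗) holds exactly when the right-hand side vanishes.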
Counting with Random Walks
Ideas from the Proof:
I The following bounds for the LHS and RHS are available. Let ε = ε_N := c/(N̄ ln N̄) for some c > 0, where N̄ := max_j N_j is the size of the largest component.

Bound (↓) \ Side (→) |  LHS                 |  RHS
Upper                |  e^(−tε)             |  1 − e^(−tε)
Lower                |  (N̄²/N²) e^(−t/N̄)   |  1/N

I So: ∃ t > t1 with e^(−tε) = 1 − e^(−tε) ⇒ t1 < (ln 2)/ε ≤ (ln 2/c) N̄ ln N̄.
I Similarly, a lower bound for t1 is obtained by using the monotonicity of the LHS and RHS in t in the reversed way: ∃ t < t1 with (N̄²/N²) e^(−t/N̄) = 1/N ⇒ t1 > N̄ ln(N̄²/N).
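The crossing time t1 itself can be located by bisection, using that the heat trace is strictly decreasing in t. A hedged sketch on a toy graph (the graph and threshold are illustrative; at this tiny size ε is O(1), far from ε_N = c/(N̄ ln N̄), so the asymptotic bounds are not expected to be sharp here):

```python
import numpy as np

# Two triangles joined by a bridge edge: illustrative graph with two
# nearly separate components (not an example from the talk).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)
lam = np.linalg.eigvalsh(np.eye(n) - A / np.sqrt(np.outer(d, d)))

eps = 0.5 * (lam[1] + lam[2])     # split the two small eigenvalues from the bulk
M = int((lam < eps).sum())        # M(eps) = 2 nearly separate components

# t1 solves P_N(t) = M(eps)/N, i.e. Tr[e^{-t L_N}] = M(eps); the trace
# is strictly decreasing in t, so plain bisection finds the crossing.
def f(t):
    return np.exp(-t * lam).sum() - M

lo, hi = 1e-3, 50.0
assert f(lo) > 0 > f(hi)          # the crossing time t1 lies in (lo, hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
t1 = 0.5 * (lo + hi)
print(f"M(eps) = {M}, t1 = {t1:.3f}")
```

At time t1 the average return probability P_N(t1) equals M(ε)/N by construction of the crossing.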
Discussion
I When are these bounds tight?
I Note N̄ ≤ N.
I Also: N̄ ln(N̄²/N) = N̄ (ln N̄ − ln(N/N̄)). Therefore, a criterion for tightness is: ln(N/N̄) = o(ln N̄).
I Example: N̄ ∼ √(N ln N), i.e. the largest component must be ‘slightly larger’ than √N.
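For the example N̄ ∼ √(N ln N) one has N̄²/N = ln N, so the lower bound becomes N̄ ln ln N and the two bounds on t1 differ only by a factor of order ln N̄ / ln ln N; both scale like N to a power slowly approaching 1/2. A small arithmetic sketch (the constant ln 2/c in the upper bound is dropped; purely illustrative):

```python
import math

for N in (10**6, 10**9, 10**12):
    Nbar = math.sqrt(N * math.log(N))     # largest-component size in the example
    lower = Nbar * math.log(Nbar**2 / N)  # t1 > Nbar ln(Nbar^2/N) = Nbar ln ln N
    upper = Nbar * math.log(Nbar)         # t1 < (ln 2/c) Nbar ln Nbar, constant dropped
    print(f"N={N:.0e}  lower={lower:.3e}  upper={upper:.3e}  "
          f"exponents: {math.log(lower)/math.log(N):.3f}, "
          f"{math.log(upper)/math.log(N):.3f}")
```

The printed exponents drift toward 1/2 only logarithmically slowly, which is why the largest component must exceed √N by a logarithmic margin for the lower bound to be nontrivial.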
Thank you for a great conference!
Figure: From Saguaro National Park. See more pictures here: http://web.cs.du.edu/~sobieczk/Tucson