some techniques in property testing

Some Techniques in Property Testing Dana Ron Tel Aviv University

Post on 25-Feb-2016

Page 1: Some Techniques  in Property Testing

Some Techniques in Property Testing

Dana Ron, Tel Aviv University

Page 2: Some Techniques  in Property Testing

Property Testing (Informal Definition)

For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any other object having P).

Task should be performed by inspecting the object (in as few places as possible).

Page 3: Some Techniques  in Property Testing

Examples

• The object can be a function and the property can be linearity.

• The object can be a string and the property can be membership in a fixed regular language L.

• The object can be a graph and the property can be 3-colorability.

Page 4: Some Techniques  in Property Testing

Context

Property testing can be viewed as:

• A relaxation of exactly deciding whether the object has the property or does not have the property.

• A relaxation of learning the object (with membership queries and under the uniform distribution).

In either case, want the testing algorithm to be significantly more efficient than the decision/learning algorithm.

Page 5: Some Techniques  in Property Testing

When can Property Testing be Useful?

• Object is too huge and even scanning it is infeasible, so must make an approximate decision.

• Object is just large but exact decision is NP-hard.

• Use Testing as preliminary step to exact decision. Namely, use testing to very quickly rule out objects that are far from having the property.

• Have poly-time exact algorithm, but approximate answer suffices so prefer sub-linear approximate algorithm.

Page 6: Some Techniques  in Property Testing

Property Testing - Background

• Initially defined by Rubinfeld and Sudan in the context of Program Testing (of algebraic functions).

• Goldreich, Goldwasser and R initiated the study of testing properties of combinatorial objects, and in particular graphs.

• Growing body of work deals with properties of functions, graphs, strings, sets of points ...

Many algorithms with complexity that is sub-linear in (or even independent of) size of object.

Page 7: Some Techniques  in Property Testing

Issues and Categories

• Types of objects and properties.

functions (algebraic and non-algebraic properties); graphs; strings; matrices; geometric objects; sets of points;

• Algorithmic techniques.

Mostly: global sampling + local exploration

• Analysis techniques.

self-correcting; enforce & test; regularity lemma; testing by implicit learning; testing based on invariance

Page 8: Some Techniques  in Property Testing

The Self-Correcting Approach

Page 9: Some Techniques  in Property Testing

Linearity Testing [Blum Luby Rubinfeld]

Def1: A function f : F^n → F is called linear (multi-linear) if there exist coefficients a_1,…,a_n ∈ F s.t. f(x_1,…,x_n) = Σ a_i x_i.

Def2: A function f is said to be ε-far from linear if for every linear function g, dist(f,g) > ε, where dist(f,g) = Pr[f(x) ≠ g(x)] (x selected uniformly in F^n).

Def3: Linearity Testing Problem: Algorithm can query function on any x in F^n to obtain f(x).
- If f is linear then alg should accept;
- if f is ε-far from linear then alg should reject w.h.p.

Fact: A function f : F^n → F is linear iff for every x,y ∈ F^n it holds that f(x)+f(y) = f(x+y).

Page 10: Some Techniques  in Property Testing

Linearity Testing Cont’

Linearity Testing algorithm

1) Uniformly and independently select Θ(1/ε) pairs of elements x,y ∈ F^n.
2) For every pair x,y selected, verify that f(x)+f(y) = f(x+y).
3) If for any of the pairs selected linearity is violated (i.e., f(x)+f(y) ≠ f(x+y)), then REJECT, otherwise ACCEPT.

Observe: If f is linear then test accepts w.p. 1.

Lemma: If f is ε-far from linear then with probability at least 2/3 the test rejects it.

Lemma (equivalent form): If f is accepted with probability greater than 1/3, then f is ε-close to linear.
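The three steps of the test can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes F is the prime field Z_p with points represented as tuples, and the name `blr_linearity_test` and the constant `c` hidden in Θ(1/ε) are hypothetical choices.

```python
import math
import random


def blr_linearity_test(f, n, p, eps, c=4.0, rng=None):
    """Return True (ACCEPT) or False (REJECT) for f : (Z_p)^n -> Z_p.

    `c` is an assumed constant for the Theta(1/eps) sample size.
    """
    rng = rng or random.Random()
    num_pairs = math.ceil(c / eps)
    for _ in range(num_pairs):
        # Step 1: pick x and y uniformly and independently.
        x = tuple(rng.randrange(p) for _ in range(n))
        y = tuple(rng.randrange(p) for _ in range(n))
        x_plus_y = tuple((a + b) % p for a, b in zip(x, y))
        # Steps 2-3: reject on the first violated pair.
        if (f(x) + f(y)) % p != f(x_plus_y) % p:
            return False
    return True
```

A linear function is accepted on every run, since no pair can be violated; a function far from linear is rejected only with high probability, so repeated runs may differ.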

Page 11: Some Techniques  in Property Testing

Linearity Testing Cont’

Lemma: If f is accepted with probability greater than 1/3, then f is ε-close to linear.

Suppose f is accepted w.p. > 1/3 ⇒ small (< ε/2) fraction of violating pairs (f(x)+f(y) ≠ f(x+y)).

Define self-corrected version of f, denoted g: For each x,y let V_y(x) = f(x+y) − f(y) (the vote of y on x), and g(x) = Plurality_y(V_y(x)).

Can show that (conditioned on < ε/2 fraction of violating pairs): (1) g is linear. (2) dist(f,g) ≤ ε.

Main Technical Lemma (informal): if few violating pairs then ∀x we have that for almost all y, V_y(x) = g(x).
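The vote V_y(x) = f(x+y) − f(y) also suggests how g(x) can be estimated with a few queries to f. The sketch below is an illustration under stated assumptions: F is taken to be Z_p, the plurality is taken over a random sample of y's rather than over all y (so it only approximates g), and the name `self_correct` and the default `num_votes` are hypothetical.

```python
import random
from collections import Counter


def self_correct(f, x, p, num_votes=25, rng=None):
    """Estimate g(x) as the plurality of sampled votes f(x+y) - f(y) mod p."""
    rng = rng or random.Random()
    n = len(x)
    votes = Counter()
    for _ in range(num_votes):
        y = tuple(rng.randrange(p) for _ in range(n))
        x_plus_y = tuple((a + b) % p for a, b in zip(x, y))
        votes[(f(x_plus_y) - f(y)) % p] += 1
    # most_common(1) returns the single most frequent vote and its count.
    return votes.most_common(1)[0][0]
```

If f is exactly linear, every vote equals f(x). If f differs from a linear function on a small fraction of points, most sampled votes avoid the corrupted points, so the plurality recovers the linear value with high probability.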

Page 12: Some Techniques  in Property Testing

Testing Polynomials (over finite fields)

Def: A function f : F^n → F is a (total) degree d polynomial if there exist coefficients {a_v}, where v = v_1…v_n, v_i ≥ 0, Σ v_i ≤ d, s.t.

f(x_1,…,x_n) = Σ_v a_v · x_1^{v_1} ⋯ x_n^{v_n}

Different algorithms were designed to deal with different cases (e.g. d=1 [BLR], |F|>d [Rubinfeld, Sudan], F=GF(2), d>1 [Alon, Kaufman, Krivelevich, Litsyn, R]), and are analyzed using the self-correction approach.

Unifying algorithm [Kaufman, R] works by restricting function to low-dimensional affine subspaces, and checking that restriction is low-deg poly (for prime fields, dimension is ⌈(d+1)/(|F|-1)⌉).

Self correction (definition of “good” function g) works by correcting value on point based on “vote” of all subspaces it belongs to.

Page 13: Some Techniques  in Property Testing

Notes on Self-Correcting Approach

Note1: definition of self-corrected function g allows to actually correct f: for every x can determine g(x) w.h.p. by few queries to f.

Note2: Found useful when testing properties that correspond to subclasses of above. For example, singleton functions (f(x) = x_i) are a subclass of linear functions. Test for singletons [Parnas, R, Samorodnitsky] first runs linearity test. If passes, then runs additional check on self-corrected version of function.

Note3: Found useful for distribution-free testing [Halevi, Kushilevitz]: General transformation from testers under uniform dist. to dist.-free testers when can self-correct.

Page 14: Some Techniques  in Property Testing

The Enforce&Test Approach

Page 15: Some Techniques  in Property Testing

Testing Bipartiteness

Def1: Graph G=(V,E) is bipartite iff can partition vertices into two subsets V1 and V2 s.t. there are no edges between vertices that are both in V1 or both in V2.

Recall that can decide whether graph is bipartite in time O(|V|+|E|) by Breadth First Search (BFS). However, we want very fast approximate decision.

Def2: Graph G=(V,E) is ε-far from bipartite if every partition (V1,V2) has more than ε|E| violating edges.

Here consider dense case: |E| = Θ(|V|^2). Graph is represented by adjacency matrix, and alg can probe matrix.

Page 16: Some Techniques  in Property Testing

Testing Bipartiteness in Dense Graphs [Goldreich Goldwasser R]

• Uniformly and independently select Θ(log(1/ε)/ε^2) vertices in graph.

• If subgraph induced by selected vertices is bipartite, then accept, otherwise, reject.

Query complexity and running time of algorithm: O(log^2(1/ε)/ε^4). Slight variant yields O(log^2(1/ε)/ε^3) and [Alon, Krivelevich] reduced to O(log^2(1/ε)/ε^2).

Correctness: If graph is bipartite then always accepted. Need to prove that if ε-far from bipartite then rejected w.h.p.
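The two bullets of the algorithm can be sketched in Python. This is an illustrative sketch only: the graph is accessed through an assumed adjacency-matrix probe `adjacent(u, v)`, and the function names and the constant `c` in the Θ(log(1/ε)/ε^2) sample size are hypothetical.

```python
import math
import random
from collections import deque


def induced_subgraph_is_bipartite(vertices, adjacent):
    """BFS 2-coloring of the subgraph induced by `vertices`."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v == u or not adjacent(u, v):
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle inside the sample
    return True


def bipartite_tester(num_vertices, adjacent, eps, c=1.0, rng=None):
    """Accept iff the subgraph induced by a random vertex sample is bipartite."""
    rng = rng or random.Random()
    m = math.ceil(c * math.log(1 / eps) / eps ** 2)
    # Independent uniform draws; repeated vertices collapse into one.
    sample = sorted({rng.randrange(num_vertices) for _ in range(m)})
    return induced_subgraph_is_bipartite(sample, adjacent)
```

The number of matrix probes is at most quadratic in the sample size and does not depend on the number of vertices, which matches the query bound stated above.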

Page 17: Some Techniques  in Property Testing

High-Level Idea of Analysis (when graph is ε-far from bipartite)

View sample as two parts: U and S.

Idea: each partition (U1,U2) of U “enforces” a partition of all vertices.

Suppose every graph vertex has some neighbor in U. (In fact, w.h.p. over U, this holds for almost all vertices of sufficiently high degree.)

Since G is ε-far from bipartite, the enforced partition must have many violations.

Will show that w.h.p. a violation appears in sample S (the “test” sample).

Since this holds for every partition (U1,U2) of U, w.h.p. do not have any bipartite partition of U and S together (induced subgraph not bipartite).

Page 18: Some Techniques  in Property Testing

Notes on Enforce&Test

Note1: Bipartite Testing algorithm and enforce&test analysis can be generalized to testing k-colorability [GGR].

Note2: Other properties whose analysis falls under enforce&test approach: ρ-Clique, ρ-Cut, and other graph partition properties [GGR]; Hypergraph coloring [Czumaj, Sohler]; Tree metric properties [Parnas, R]; Clustering [Alon, Dar, Parnas, R] and more.

Note3: For k-colorability, Clustering and other properties, can use output of tester to actually construct approximately good colorings/clusterings. E.g., for Bipartiteness, if graph is bipartite can determine partition that is approximately good, in constant time per vertex (has certain similarity to self correction).

Page 19: Some Techniques  in Property Testing

Testing By Implicit Learning

Page 20: Some Techniques  in Property Testing

Testing for Concise Representations [Diakonikolas, Lee, Matulef, Onak, Rubinfeld, Servedio, Wan]

Results (partial) for n-variable Boolean functions:

Class of functions                     Num of queries
Decision lists                         Õ(1/ε^2)
s-term DNF                             Õ(s^4/ε^2)
size-s Decision Trees                  Õ(s^4/ε^2)
size-s Branching Programs              Õ(s^4/ε^2)
size-s Boolean formula                 Õ(s^4/ε^2)
size-s Boolean circuits                Õ(s^6/ε^2)
s-sparse polynomials over GF(2)        Õ(s^4/ε^2)
Functions with Fourier deg ≤ d         Õ(2^{6d}/ε^2)

For all classes, poly(1/ε) and no dependence on n

Page 21: Some Techniques  in Property Testing

Testing for Concise Representations (cont)

Observation: many classes of functions that have concise representations (e.g., s-term DNF) can be approximated by small juntas in the class.

Example: every s-term DNF function f is ε-close to an s-term DNF that depends on s log(s/ε) variables.

Rough idea of algorithm(s):

1. Find collection of subsets of variables s.t. each contains a single variable on which function depends (non-negligibly) (variant of junta testing [Fischer, Kindler, R, Safra, Samorodnitsky]). If num of subsets greater than some k, reject.

2. Based on subsets create sample of labeled examples over {0,1}^k (does not identify relevant variables).

3. Check whether exists function of appropriate form over k variables that is consistent with sample.

Page 22: Some Techniques  in Property Testing

Testing for Concise Representations (cont)

Example of steps 2 and 3 (last column: label):

D - - D - D - - - D

1 - - 0 - 0 - - - 1    1
0 - - 1 - 0 - - - 1    0
1 - - 1 - 1 - - - 0    0

x1 x4 is consistent with labeled sample, so accept.
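Step 3, the consistency check, can be illustrated by brute force over a small hypothesis class. The sketch below is only an assumption-laden toy: it restricts the class to monotone conjunctions over the k extracted coordinates (the actual algorithm checks the class being tested, such as s-term DNF), it reads the three labeled rows off the example above with k = 4, and the name `consistent_conjunction` is hypothetical.

```python
from itertools import combinations


def consistent_conjunction(sample, k):
    """Return a tuple of 0-based indices whose AND matches every label, or None.

    `sample` is a list of (bits, label) pairs with bits in {0,1}^k.
    """
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            if all(int(all(bits[i] for i in subset)) == label
                   for bits, label in sample):
                return subset
    return None


# Labeled examples over the k = 4 coordinates marked D in the example.
sample = [((1, 0, 0, 1), 1), ((0, 1, 0, 1), 0), ((1, 1, 1, 0), 0)]
print(consistent_conjunction(sample, 4))  # -> (0, 3)
```

The search is exponential in k but independent of n, which is in line with the remark that running time can be exponential in the query complexity while the number of queries stays small.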

Page 23: Some Techniques  in Property Testing

Notes on Testing by Implicit Learning

Note1: technique gives rise to many positive results (also extends to non-Boolean functions).

Note2: well known that (proper) learning implies testing, but with roughly the same complexity. Implicit learning saves in complexity.

Note3: running time in general is exponential in query complexity. New result for sparse polynomials over GF(2) [Diakonikolas, Lee, Matulef, Servedio, Wan] gives time-efficient algorithm.

Page 24: Some Techniques  in Property Testing

Extensions of PT: Tolerant Testing and Distance Approximation

[Parnas, R, Rubinfeld]

Tolerant Testing: Given parameters 0 ≤ ε_1 < ε_2, distinguish between being ε_1-close to property P and ε_2-far from P (“standard” testing: ε_1 = 0).

Example: Clustering. Standard testing requires to accept only perfect clusterings (k clusters, quality (e.g., diameter) b). Tolerant testing requires to accept good clusterings (with few outliers).

Results: clustering, monotonicity, local testing of codes, graph properties (dense and sparse models), and more.

Distance approximation: estimate distance of object from having property P.

Page 25: Some Techniques  in Property Testing

What Hasn’t been Covered?

Lots of things!

Important Analysis Tool for Graph Properties: Szemerédi’s regularity lemma (variants of). Used for analyzing graph properties (includes partition and forbidden subgraph properties) [Alon, Fischer, Krivelevich, Szegedy]. Many other results used it since. Recently used to characterize all properties testable with no dependence on size of graph [Alon, Fischer, Newman, Shapira].

Important component for graph properties lower bounds (forbidden subgraphs): Arithmetic Progressions [Alon], [Alon, Shapira] (x3)

Tantalizing open problem: What is complexity of testing triangle-freeness (in dense-graphs model)? UB: tower of height poly(1/ε). LB: (roughly) exp(1/ε)

Page 26: Some Techniques  in Property Testing

Thanks

Page 27: Some Techniques  in Property Testing

Testing and the Regularity Lemma [Alon, Fischer, Krivelevich, Szegedy], [Alon, Shapira]*, …, [Alon, Fischer, Newman, Shapira]

The Basis: For every ε, the vertices of every (sufficiently large) graph can be partitioned into t = t(ε) subsets V_1,…,V_t of equal size s.t. edge distribution between subsets V_i, V_j is roughly like in random graph with edge prob. p_{i,j} = |E(V_i,V_j)| / (|V_i||V_j|).

Results: of algorithm

Page 28: Some Techniques  in Property Testing

Last Example: Monotonicity Testing

Def: A function f : [n] → R is monotone if for every i,j in [n], i < j, we have f(i) ≤ f(j). It is ε-far from monotone if more than an ε-fraction of its values must be modified for it to become monotone.

Observation: “Natural algorithm” (take uniform sample and check whether f is monotone on sample) does not work unless sample size = Ω(n^{1/2}).

Example array: 11 10 13 12 15 14 17 16 19 18 21 20 23 22 25 24 27 26 29 28 31 30 33 32
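To make the distance in the definition concrete: the minimum number of values that must be modified equals n minus the length of a longest non-decreasing subsequence, since the kept values must already be in order and the rest can be overwritten to fit. The sketch below computes that fraction for the 24-entry example array; the function name `distance_from_monotone` is a hypothetical label.

```python
from bisect import bisect_right


def distance_from_monotone(values):
    """Fraction of entries outside a longest non-decreasing subsequence."""
    tails = []  # tails[k] = smallest possible last value of a subsequence of length k+1
    for v in values:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return (len(values) - len(tails)) / len(values)


example = [11, 10, 13, 12, 15, 14, 17, 16, 19, 18, 21, 20,
           23, 22, 25, 24, 27, 26, 29, 28, 31, 30, 33, 32]
print(distance_from_monotone(example))  # -> 0.5
```

In this array the only violations are between the two entries of each swapped pair, so a uniform sample detects non-monotonicity only if it happens to contain both entries of some pair, which is why a small sample is unlikely to catch it even though half the values must change.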

Page 29: Some Techniques  in Property Testing

Monotonicity Testing Cont’

An alternative testing algorithm:

Repeat the following O(1/ε) times:
1. Pick an entry uniformly at random. Let x be the value in that entry.
2. Perform a binary search for x.
3. If the search does not find x, output reject.
If all searches succeed, output accept.

[Figure: binary search for x = 28 in the example array]

Main Claim: entries for which search succeeds define a monotonically non-decreasing sequence. Hence, if ε-far then must have more than ε-fraction of entries on which search fails, causing the test to reject w.h.p.
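The binary-search test can be sketched as follows. This is a minimal illustration under stated assumptions: ties between equal values are broken by comparing (value, index) pairs so that every entry is distinct, the constant `c` in O(1/ε) is a guess, and the names `search_succeeds` and `monotonicity_tester` are hypothetical.

```python
import math
import random


def search_succeeds(values, i):
    """Binary search for entry i, comparing (value, index) pairs."""
    target = (values[i], i)
    lo, hi = 0, len(values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probe = (values[mid], mid)
        if probe == target:
            return True
        if probe < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False


def monotonicity_tester(values, eps, c=2.0, rng=None):
    """Reject as soon as a search fails; accept if all searches succeed."""
    rng = rng or random.Random()
    for _ in range(math.ceil(c / eps)):
        i = rng.randrange(len(values))
        if not search_succeeds(values, i):
            return False
    return True
```

Each repetition costs O(log n) queries, so the total is O(log(n)/ε). On a non-decreasing array every search succeeds, so such inputs are always accepted.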

Page 30: Some Techniques  in Property Testing

Tolerant Testing of Clustering [Parnas,R,Rubinfeld]

Tolerant Testing: Reject when ε-far but accept when ε’-close.

Tolerant Testing Algorithm (input: k, ε’, ε)
(1) Take sample of m = m(k, ε’, ε) points from X.
(2) If sample is (ε’ + (ε − ε’)/2)-close to (k,b)-clusterable then accept, o.w. reject.

Sample has quadratic dependence on 1/(ε − ε’), and same dependence on other parameters as (standard) testing algorithm.

Can analyze using a generalization of a framework by Czumaj & Sohler for (standard) testing that captures aspects of “enforce&test” approach.

Page 31: Some Techniques  in Property Testing

Directions for Further Research

“Biggest” open problem: Can we characterize what properties are efficiently testable? (e.g., find a measure analogous to VC-dimension.)

Find families of properties that are efficiently testable. (Similarly to the results for partition properties of graphs, for graph properties, and for regular languages.)

Extend scope of property testing.